mrc-ide / naomi

Naomi model for subnational HIV estimation
https://mrc-ide.github.io/naomi
Other
9 stars 9 forks source link

Speed up ART time series generation #427

Closed r-ash closed 7 months ago

r-ash commented 7 months ago

This PR will speed up generation of ART time series.

We've seen with current Malawi data that this is taking up to 25s to return the data

image

Running a profile we can see that a lot of this is coming from the data prep, with the json serialization not being insignificant

image

This PR does a couple of things

I've checked this with the current Malawi data and it produces the same result, in a slightly different order. But we should check this carefully and make sure it will produce the same output for other countries too. Hopefully the tests will check the edge cases for us here.

This speeds up aggregate_art quite a lot, before this was taking ~12s on my machine, after this change taking ~850ms so a decent speedup!

I think worth having a closer look at this at some point but hopefully this should help for now.

If you want to run a profile yourself profvis is a good way to go

profvis::profvis(aggregate_art(art, shape), simplify = FALSE)
r-ash commented 7 months ago

Great tests here!

So my changes collapse the separate handling of calendar_quarter but we need this because the max area_level might be different for different quarters.

This also explains why it is so slow for new Malawi data, we are working with many quarters there and it is computing it for each slice.

So if we want to make this quicker it is going to require a bit of a bigger change. I'm going to leave this open for now

r-ash commented 7 months ago

Actually given this another go now, instead of running each calendar quarter and then binding those rows together, it splits on finest area level. Meaning that if the country has the same finest area level for all calendar quarters it will process these together. Meaning hopefully Malawi runs in reasonable time