Closed r-ash closed 7 months ago
Great tests here!
So my changes collapse the separate handling of calendar_quarter
but we need this because the max area_level might be different for different quarters.
This also explains why it is so slow for new Malawi data, we are working with many quarters there and it is computing it for each slice.
So if we want to make this quicker it is going to require a bit of a bigger change. I'm going to leave this open for now
Actually given this another go now, instead of running each calendar quarter and then binding those rows together, it splits on finest area level. Meaning that if the country has the same finest area level for all calendar quarters it will process these together. Meaning hopefully Malawi runs in reasonable time
This PR will speed up generation of ART time series.
We've seen with current Malawi data that this is taking up to 25s to return the data
Running a profile we can see that a lot of this is coming from the data prep, with the json serialization not being insignificant
This PR does a couple of things
|>
pipe instead of%>%
, we should in general switch over to this as%>%
does incur a small overhead whereas|>
does notI've checked this with the current Malawi data and it produces the same result, in a slightly different order. But we should check this carefully and make sure it will produce the same output for other countries too. Hopefully the tests will check the edge cases for us here.
This speeds up
aggregate_art
quite a lot, before this was taking ~12s on my machine, after this change taking ~850ms so a decent speedup!I think worth having a closer look at this at some point but hopefully this should help for now.
If you want to run a profile yourself
profvis
is a good way to go