Closed willGraham01 closed 3 months ago
/run profiling
🆔 27496352057 ⏲️ 502 minutes
Profiled run succeeded but we hit the expired token problem. Rerunning now
/run profiling
Just to say in case it's helpful for checking things in the interim, the artifact from the previous profiling run is still available to download even though it wasn't uploaded to profiling website https://github.com/UCL/TLOmodel/actions/runs/9953255205/artifacts/1707268804
Just to say in case it's helpful for checking things in the interim, the artifact from the previous profiling run is still available to download even though it wasn't uploaded to profiling website https://github.com/UCL/TLOmodel/actions/runs/9953255205/artifacts/1707268804
Thanks - it looks like the SymptomManager
footprint inside the generic first appts method has indeed shrunk considerably (at least according to the artifact):
According to the profiling result, we have gone from spending ~40 mins within the SymptomManager.has_what
function to ~10 mins, so are saving 30 mins per simulation.
🆔 27548769167 ⏲️ 458 minutes
We are currently spending a lot of time within the
SymptomManager.has_what
method (called inside thedo_at_generic_first_appt
methods) - approx 8% of simulation time according to the most recent profiling result.SnakeViz places the blame for this on calls to pandas methods within
SymptomManager
, because it does not make use of theIndividualDetails
context that is setup during thegeneric_first_appt
method.This PR introduces the following changes:
SymptomManager.has_what
can now be passed anIndividualProperties
instance, which it will use instead of looking up values directly from the population DataFrame.person_id
orindividual_details
inputs. Theindividual_details
argument takes priority over theperson_id
method if both are supplied (since it is at worst the same speed as looking up from the DF each time).SymptomManager.has_what
in the codebase now respect the keyword arguments the function takes.test_has_what
has been updated to check that the new method of retrieving symptoms via theindividual_details
keyword performs as expected.With these changes, the time spent within
has_what
has almost halved:with us now spending 174s inside
has_what
and all subfunctions, rather than331s
that we were spending before. Experiments were run using thescale_run
simulation over 2 years and using a relatively small initial population of 10,000, so have triggered a profiling run here to get some more accurate figures.More recent profiling runs triggered with the run-profiling workflow indicate that we are saving approx ~30mins with these changes (out of a ~7.5hr simulation, so approx 6% time cut off).
There shouldn't be any other functionality changes introduced in this PR but please flag any concerns if you spot anything!