willGraham01 commented 4 months ago

We are currently spending a lot of time within the SymptomManager.has_what method (called inside the do_at_generic_first_appt methods) - approx 8% of simulation time according to the most recent profiling result.

SnakeViz places the blame for this on calls to pandas methods within SymptomManager, because it does not make use of the IndividualDetails context that is setup during the generic_first_appt method.

master_snakeviz

This PR introduces the following changes:

SymptomManager.has_what can now be passed an IndividualProperties instance, which it will use instead of looking up values directly from the population DataFrame.
This same method now only takes keyword arguments, and must be supplied either the person_id or individual_details inputs. The individual_details argument takes priority over the person_id method if both are supplied (since it is at worst the same speed as looking up from the DF each time).
All calls to SymptomManager.has_what in the codebase now respect the keyword arguments the function takes.
test_has_what has been updated to check that the new method of retrieving symptoms via the individual_details keyword performs as expected.

With these changes, the time spent within has_what has almost halved:

exp_snakeviz

with us now spending 174s inside has_what and all subfunctions, rather than 331s that we were spending before. Experiments were run using the scale_run simulation over 2 years and using a relatively small initial population of 10,000, so have triggered a profiling run here to get some more accurate figures.

More recent profiling runs triggered with the run-profiling workflow indicate that we are saving approx ~30mins with these changes (out of a ~7.5hr simulation, so approx 6% time cut off).

There shouldn't be any other functionality changes introduced in this PR but please flag any concerns if you spot anything!

willGraham01 commented 4 months ago

/run profiling

ucl-comment-bot[bot] commented 4 months ago

Profiled run of the model succeeded ✅

🆔 27496352057 ⏲️ 502 minutes

️⃣ 6dc298ea0497735453f60d0e8fe75f41ab306b5e

willGraham01 commented 4 months ago

Profiled run succeeded but we hit the expired token problem. Rerunning now

willGraham01 commented 4 months ago

/run profiling

matt-graham commented 4 months ago

Just to say in case it's helpful for checking things in the interim, the artifact from the previous profiling run is still available to download even though it wasn't uploaded to profiling website https://github.com/UCL/TLOmodel/actions/runs/9953255205/artifacts/1707268804

willGraham01 commented 4 months ago

Just to say in case it's helpful for checking things in the interim, the artifact from the previous profiling run is still available to download even though it wasn't uploaded to profiling website https://github.com/UCL/TLOmodel/actions/runs/9953255205/artifacts/1707268804

Thanks - it looks like the SymptomManager footprint inside the generic first appts method has indeed shrunk considerably (at least according to the artifact):

According to the profiling result, we have gone from spending ~40 mins within the SymptomManager.has_what function to ~10 mins, so are saving 30 mins per simulation.

ucl-comment-bot[bot] commented 4 months ago

Profiled run of the model succeeded ✅

🆔 27548769167 ⏲️ 458 minutes

️⃣ 625b77c1ca12e34236d95d3a540e64895721d053

UCL / TLOmodel

Use `IndividualDetails` context to speed up `SymptomManager.has_what` #1423

Profiled run of the model succeeded ✅

️⃣ 6dc298ea0497735453f60d0e8fe75f41ab306b5e

Profiled run of the model succeeded ✅

️⃣ 625b77c1ca12e34236d95d3a540e64895721d053