Open tamuri opened 4 months ago
This might be to do with the cross-class function calls in the codebase, and how the function which sets off the population level events is called?
HealthSystemScheduler.apply
calls HealthSystemScheduler.process_events_mode_2
.HealthSystemScheduler.process_events_mode_2
calls HealthSystem.run_population_level_events
. This is another case where we have a circular reference between the scheduler and and HealthSystem
instance.HealthSystem.run_population_level_events
then calls run
on each of the population level events due on the given day.So the population level events are being run by HealthSystem.run_population_level_events
, but via a method which is only ever called via HealthSystemScheduler.module
(which points to the HealthSystem
) within HealthSystemScheduler.process_events_mode_2
.
So I guess PyInstrumment is technically right that the pop-level events are run within the scheduler class, though I'd now expect the population-level events to be nested within HealthSystemScheduler.process_events_mode_2
if I'm following the call stack, rather than one level higher :sweat_smile:
I don't think that's right. For example, the event HealthSeekingBehaviourPoll is not an HSI event, and is not run by the HealthSystem. It is processed in the main Simulation event loop, so I'd expect it to be under Simulation.fire_single_event.
Compare with cprofile dump:
I've looked in to this a bit and think I've identified what is going on. The key issue seems to be that the class name used in the identifier for a frame corresponding to a method call will be written out as a particular subclass (specifically the mostly commonly observed subclass in the captured frames) even if the method being called is defined on a base class.
As a simple minimal reproducer, consider the following script corresponding to a very simplified version of the TLOmodel
simulation call stack:
import time
def simulate():
time.sleep(0.1)
events = [EventA(), EventB(), EventA()] * 10
for event in events:
event.run()
class Event:
def run(self):
self.apply()
class EventA(Event):
def apply(self):
time.sleep(0.1)
class EventB(Event):
def apply(self):
time.sleep(0.1)
if __name__ == "__main__":
simulate()
Running pyinstrument
on this script (saved as pyinstrument_test.py
) gives the following output using the console renderer:
_ ._ __/__ _ _ _ _ _/_ Recorded: 15:46:49 Samples: 31
/_//_/// /_\ / //_// / //_'/ // Duration: 3.110 CPU time: 0.005
/ _/ v4.7.3
Program: pyinstrument_test.py
3.110 <module> pyinstrument_test.py:1
└─ 3.110 simulate pyinstrument_test.py:4
├─ 3.009 EventA.run pyinstrument_test.py:13
│ ├─ 2.006 EventA.apply pyinstrument_test.py:19
│ │ └─ 2.006 sleep <built-in>
│ └─ 1.003 EventB.apply pyinstrument_test.py:25
│ └─ 1.003 sleep <built-in>
└─ 0.101 sleep <built-in>
Here we see that underneath the top-level simulate
call, all event apply
method calls are grouped under a frame with identifier EventA.run
. This corresponds in reality to the run
method defined on the base Event
class, which we can see from the correct line number reference in to the script. However as the concrete classes from which the run
method is called are EventA
and EventB
, when recording these frames pyinstrument
appears to record both these as the class associated with the method call and then resolve the ambiguity when rendering the output, just outputs the mosts commonly observed class name:
This seems like a bug to me or at least is a bit misleading so I'll raise an issue on pyinstrument
repository and see if we can get fixed.
Importantly it does seem however the overall call tree recorded is correct and if we just look at the file / line number references in the existing output we can correctly identify whether methods are actually defined.
Accidentally hit close while writing comment 🤦
Not sure whether this can be fixed or corrected at all - probably not - but I noticed that the indentation of samples in the pyinstrument output is incorrect. Regular events are being placed as if they are being run by the health system scheduler. Any ideas?