UCL / TLOmodel

Epidemiology modelling framework for the Thanzi la Onse project
https://www.tlomodel.org/
MIT License

Batch of runs submitted from PR #1308 before/after review & master merge inconsistent #1348

Closed: marghe-molaro closed this issue 5 months ago

marghe-molaro commented 5 months ago

PR #1308 allows the user to rescale HR capabilities to capture effective capabilities when transitioning to mode 2.

Two batches of runs were submitted to Azure from this PR: effect_of_policy-2024-04-09T132919Z (2 runs per draw) and effect_of_capabilities_scaling-2024-04-29T074640Z (10 runs per draw). While the rescaling seems to be occurring in both sets of runs, the evolution of DALYs recovered is very different even though all other factors (consumables availability, HR used, etc.) are identical, as shown in the plots below.

[Figure: scaling of capabilities]

The two branches from which these jobs were submitted produce identical outputs on the test test_rescaling_capabilities_based_on_squeeze_factors, suggesting the difference does not arise from the stylistic changes (suggested by @matt-graham) in how rescaling is computed in the more recent branch. While the more recent branch includes checks that the rescaling factor is not infinite, which would have occurred for some mental health care workers, this should not affect the HIV/AIDS-related DALYs shown here, as these do not rely on that class of medical worker.
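For intuition, here is a minimal, hypothetical sketch of the kind of guard being described; the function and variable names are invented, and the actual computation in TLOmodel differs:

    import numpy as np
    import pandas as pd

    def rescale_capabilities(capabilities: pd.Series, squeeze_factors: pd.Series) -> pd.Series:
        # Hypothetical sketch: rescale nominal HR capabilities by a factor derived
        # from squeeze factors, leaving a cadre unchanged wherever the factor is
        # non-finite (e.g. a cadre with zero recorded capability).
        rescaling_factor = 1.0 + squeeze_factors
        safe_factor = rescaling_factor.where(np.isfinite(rescaling_factor), 1.0)
        return capabilities * safe_factor

    caps = pd.Series({"doctor": 100.0, "mental_health": 0.0})
    squeeze = pd.Series({"doctor": 0.5, "mental_health": np.inf})
    print(rescale_capabilities(caps, squeeze))
    # doctor is scaled by 1.5; mental_health stays 0.0 instead of becoming NaN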

Other differences between the branches include changes brought over from master, including:

@matt-graham any suggestions on things to double check would be very appreciated!

marghe-molaro commented 5 months ago

Thank you @matt-graham! I was hoping you might be able to help me clarify something while @willGraham01 is away. In one of his previous replies he said:

  • The failure cannot be because one of the modules in the main loop is editing any shared dataframe properties, since lines 164-180 occur after the dataframe update from the main loop is applied, but still use the "cached" values that the main loop utilised.

My understanding, though, is that prior to @willGraham01's revision, modules were not only reading shared dataframe properties during _do_on_generic_first_appt but also modifying them, hence the need for this loop at the end of the function:

        # Apply all of the accumulated property updates to the dataframe in one go
        if proposed_patient_details_updates:
            df.loc[
                self.target, proposed_patient_details_updates.keys()
            ] = proposed_patient_details_updates.values()

Moving to the use of "cached" values would therefore necessarily lead to differences, as modules would now be accessing the individual's original properties rather than those updated by the modules called earlier within _do_on_generic_first_appt. Am I missing something?
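As a toy illustration of the concern (hypothetical names, not TLOmodel code): with a snapshot taken once before the module loop, a later module never sees an earlier module's update:

    # Two "modules" act on one individual's properties via a cached snapshot
    properties = {"is_diagnosed": False}
    cached = dict(properties)  # snapshot taken once, before the loop

    def module_a(view):
        # Module A diagnoses the individual
        return {"is_diagnosed": True}

    def module_b(view):
        # Module B only treats individuals it sees as diagnosed
        return {"is_treated": True} if view["is_diagnosed"] else {}

    for module in (module_a, module_b):
        properties.update(module(cached))  # cached is never refreshed

    print(properties)  # {'is_diagnosed': True} -- module B missed the diagnosis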

matt-graham commented 5 months ago

Hi @marghe-molaro, sorry for the delay in replying. You are completely correct here, and this is something I missed when reviewing #1292. The intention had been to retain the previous model behaviour (beyond possibly calling the module-specific logic in a different order, as we hadn't envisaged this would lead to changes in distributional output even if specific realisations differed). We should be updating the data structure used to mediate access to the individual's properties within the loop whenever a module makes an update, rather than always using the initially read cached values. The simplest way to restore the previous behaviour would be to move the code for updating the dataframe inside the loop and to recreate the patient_details structure at the same time, that is, something like:

        # Take an initial read-only snapshot of the individual's properties
        patient_details = self.sim.population.row_in_readonly_form(self.target)
        for module in modules.values():
            module_patient_updates = getattr(module, self.MODULE_METHOD_ON_APPLY)(
                patient_id=self.target,
                patient_details=patient_details,
                symptoms=symptoms,
                diagnosis_function=self._diagnosis_function,
                consumables_checker=self.get_consumables,
                facility_level=self.ACCEPTED_FACILITY_LEVEL,
                treatment_id=self.TREATMENT_ID,
                random_state=self.module.rng,
            )
            if module_patient_updates:
                # Apply this module's updates to the population dataframe immediately...
                self.sim.population.props.loc[
                    self.target, module_patient_updates.keys()
                ] = module_patient_updates.values()
                # ...and refresh the snapshot so later modules see the updated values
                patient_details = self.sim.population.row_in_readonly_form(self.target)

Combined with the changes Will made in #1366 to fix the module ordering etc., this should give us behaviour identical to before.

Implementing it exactly as above will, however, mean we may lose some of the performance benefit we were aiming for in #1292, as we'd again be updating the dataframe multiple times. I think we can avoid this with a slightly more involved fix: updating the PatientDetails data structure to allow both reading and writing, but only synchronizing the writes back to the dataframe when explicitly requested, so we can still do this once. After discussing with @tamuri, what we're proposing is that I'll aim to get a PR submitted with this more involved fix ASAP, with a deadline of tomorrow midday. If this turns out to be more complicated than I expect, we'll fall back to the simpler (less performant) fix above, so either way we should have something ready to restore the previous behaviour by tomorrow afternoon. Does that sound reasonable @marghe-molaro?
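A minimal sketch of what such a read/write structure might look like (hypothetical names; the actual PatientDetails implementation will differ):

    import pandas as pd

    class PatientDetailsSketch:
        # Hypothetical read/write view of one individual's properties: writes
        # are visible to subsequent reads immediately, but are only pushed back
        # to the population dataframe when synchronize() is called.

        def __init__(self, props: pd.DataFrame, person_id):
            self._props = props
            self._person_id = person_id
            self._values = props.loc[person_id].to_dict()  # single up-front read
            self._pending = {}  # updates not yet written back

        def __getitem__(self, key):
            return self._values[key]

        def update(self, updates):
            self._values.update(updates)   # later reads see the new values
            self._pending.update(updates)  # dataframe write is deferred

        def synchronize(self):
            # Single write-back of all accumulated updates
            if self._pending:
                self._props.loc[self._person_id, list(self._pending)] = list(
                    self._pending.values()
                )
                self._pending.clear()

Each module's proposed updates would then go through update(...), with one synchronize() at the end of the appointment.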

In the meantime I've created a new branch mmg/hsi-event-individual-updates-quickfix off the branch in #1366, with the quick fix above ensuring that any module updates to the individual's properties are reflected in the cached values immediately. You could potentially create a new branch from your analysis branch, merge that branch in, and trigger a new Azure run to check whether this completely restores the previous behaviour (or potentially run with a different module ordering to check whether that is the origin of the strong module-ordering dependence).

tbhallett commented 5 months ago

It may be a moot point, but I also noticed that this section will lead to the updating not working as expected if more than one module tries to update the same property. (The updates from the last module will "win".)

https://github.com/UCL/TLOmodel/blob/6494277376ad3c66d27ddb547a6a61ac3f3012d1/src/tlo/methods/hsi_generic_first_appts.py#L93-L97

matt-graham commented 5 months ago

Thanks for also flagging this @tbhallett. In this case I think that (beyond the issue, discussed above, with updates not being synced back to the cached values in patient_details) this works equivalently to the previous behaviour: if multiple modules update a particular property, only the update from the last module is reflected in the final population dataframe. This does mean there is some module-ordering dependence, but in this case we would at least still be consistent with the previous behaviour if using the same module ordering.
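A toy example of that "last write wins" behaviour (hypothetical, not TLOmodel code):

    # Two modules both propose a value for the same property; applied
    # sequentially, the later module's value overwrites the earlier one's
    updates_from_module_a = {"bp_systolic": 150}
    updates_from_module_b = {"bp_systolic": 120}

    merged = {}
    for updates in (updates_from_module_a, updates_from_module_b):
        merged.update(updates)

    print(merged)  # {'bp_systolic': 120} -- module B's update "won"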

tbhallett commented 5 months ago

Thanks so much for this @matt-graham, @tamuri and @marghe-molaro

It's great that it seems like we'll be able to restore the original behaviour of the model and retain the performance benefits of the refactoring.

I have only one remaining question:

I was surprised that the ordering in which modules are called in hsi_generic_first_appts.py apparently had such a big effect; I would not have expected this. So, once we have the original behaviour of the model back, I would be keen to understand whether that effect is real, and what causes it. When we have the fix, could we possibly discuss/investigate that too?

matt-graham commented 5 months ago

I have only one remaining question:

I was surprised that the ordering in which modules are called in hsi_generic_first_appts.py apparently had such a big effect; I would not have expected this. So, once we have the original behaviour of the model back, I would be keen to understand whether that effect is real, and what causes it. When we have the fix, could we possibly discuss/investigate that too?

Agreed, this is definitely something we should investigate; I'll create an issue to remind us to do this.

marghe-molaro commented 5 months ago

Hi @matt-graham @willGraham01 @tamuri @tbhallett,

Here are the preliminary results from tests of the sensitivity of the output to module order, based on 3 runs per draw with a population of 100,000. The cases I considered are as follows:

At this link I include the time evolution of total DALYs and of all individual causes. I am still sorting through them, but my first impression is that things broadly look reasonable, as we wouldn't expect the runs to be identical in any case. But I wonder whether some causes (especially TB and depression, which are quite common) suggest that a population of 100,000 may be a bit small. Let me know what you think.

tbhallett commented 5 months ago

Thanks very much @marghe-molaro

I think this is very convincing. We don't expect the order to make a difference and we see no evidence of it having an effect. I think we can be reassured by this and move on!

matt-graham commented 5 months ago

Thanks @marghe-molaro. I agree with @tbhallett; this looks like strong evidence that module order does not have an effect, as expected. I'll link to your comment in the issue we raised to check this, and close it.

marghe-molaro commented 5 months ago

Thank you @matt-graham. I will also close this issue, as I believe we have now established the cause of the original discrepancy, which has been fixed by PR #1389.