mrc-ide / individual

R Package for individual based epidemiological models
https://mrc-ide.github.io/individual
Other
30 stars 16 forks source link

Add a restore flag to the Event constructor. #180

Closed plietar closed 8 months ago

plietar commented 9 months ago

By default, when restoring the simulation state all previous schedules on events are cleared and restored from the saved state. This goes against use cases that wish to resume a simulation with different intervention schedules to compare effects. In those use cases, a different initialization sequence is used when creating the simulation, and we do not want that to be cleared and overwritten.

The new restore flag, when set to false, overrides this default behaviour and the state of an Event is (mostly) unaffected by a restore. Thanks to this, a new event schedule, that is unrelated to the schedule of the original run, can be configured.


Unlike other event types, which have a schedule method, the StaticEvent follows a fixed list of times at which it triggers, defined at initialization. This follows a common pattern found in malariasimulation of using a vector of times at which an intervention happens, where the vector is part of the simulation parameters.

When triggering, listeners on a static event receive as an additional argument the index into the timestep list matching the current timestep. This can be used to look up parameters associated with the current invocation.

Below is an example of a StaticEvent being used to model a mass drug administration campaign:

treated <- CategoricalVariable$new(c('Y','N'), rep('N', 100))
mda_times <- c(20, 40, 60)
mda_coverage <- c(0.1, 0.1, 0.2)
mda_event <- StaticEvent$new(mda_times)
mda_event$add_listener(function(timestep, index) {
  coverage <- mda_coverage[[index]]
  treated$queue_update('Y',
    treated$get_index_of('N')$sample(coverage))
})

The main benefit of StaticEvent is to enable a schedule to be reliably modified when resuming a simulation. The Event and TargetedEvent classes save their schedules in the checkpoint and restore them when loading from a previous simulation state. As a consequence of this, resuming a simulation with a different schedule for an event may not work as intended, even if the scheduled time is in the future.

For example in the code below, the simulation is first initialized with an event scheduled at t=10 and is run for only 5 steps, hence the event does not yet trigger. On the second run, the event is seemingly scheduled for t=15. However, by resuming the simulation, the event's schedule is overwritten with the saved state, clearing the newly scheduled time. The event is triggered at t=10 only.

e <- Event$new()
e$schedule(9)
state <- simulation_loop(timesteps = 5, events = list(e))

e <- Event$new()
e$schedule(14)
e$add_listener(function(t) cat("Triggered at timestep", t))
simulation_loop(timesteps = 30, events = list(e), state = state)
#> Triggered at timestep 10

StaticEvent provides alternative semantics, allowing their schedule to be modified reliably when resuming the simulation. Static events don't save or restore their schedule, instead it is only dependent on their initialization, which can be modified when resuming. In the modified example below, the event triggers at t=15 as intended.

e <- StaticEvent$new(10)
state <- simulation_loop(timesteps = 5, events = list(e))

e <- StaticEvent$new(15)
e$add_listener(function(t, index) cat("Triggered at timestep", t))
simulation_loop(timesteps = 30, events = list(e), state = state)
#> Triggered at timestep 15

~~

codecov[bot] commented 9 months ago

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (c9cdee3) 96.28% compared to head (9e79ea3) 96.27%. Report is 13 commits behind head on dev.

Files Patch % Lines
src/event.cpp 92.50% 3 Missing :warning:
R/simulation.R 96.00% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## dev #180 +/- ## ========================================== - Coverage 96.28% 96.27% -0.02% ========================================== Files 36 36 Lines 1722 1824 +102 ========================================== + Hits 1658 1756 +98 - Misses 64 68 +4 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

plietar commented 8 months ago

So I've thought a bit more about this. I don't think I like a global restore strategy as an argument on simulation_loop, since the simulation could need a mix of events with different semantics depending on the use case.

I've implemented instead as a per-Event strategy, eg.

event <- Event$new(restore = FALSE)
event$schedule(14)

The restore flag defaults to TRUE if not specified. When set to FALSE, the restore code still needs to restore the current timestep, but nothing else. It does not clear the schedule and it does not insert timepoints from the saved state.

I guess at some point we can add mixed restore strategies, such as restore = "merge", if we ever have a need for it.


The one caveat is that the way this is typically used in malariasimulation will need to be adjusted a little bit. In malariasimulation, a common pattern is to have a sequence of timesteps coming in as parameters, scheduling the first one of the list during initialization, and then each time the listener is fired it reschedules the next one.

The code looks something like:

e <- Event$new()
e$schedule(times[[1]] - 1)
e$add_listener(function(t) {
    index <- which(times == t)
    e$schedule(times[[index + 1]])
})

If you resume the simulation at a point later than the first timepoint then the scheduled execution is ignored (it is in the past), and the subsequent executions we expect are never scheduled. For this to work, the initialization would have to know at which timestep the simulation is resumed and use the first value that is greater (or equal) than that.

A simpler and probably better way of expressing this by scheduling all the time points upfront, as follows:

e <- Event$new(restore = FALSE)
e$schedule(times - 1)
e$add_listener(function(t) {
    index <- which(times == t)
    # Don't call e$schedule here
})

Anything in times that is before the resume point is ignored, but anything afterwards works as expected. This also simplifies the code a little bit IMO.

plietar commented 8 months ago

See https://github.com/mrc-ide/malariasimulation/commit/1d2c662ffb4bbcefdde083b2136221a0040abbac for how this can be used in malariasimulation