mrc-ide / malariasimulation

The malaria model
https://mrc-ide.github.io/malariasimulation
Other
16 stars 14 forks source link

unclear stochastic behaviour around when interventions are being turned on #284

Open hannahslater opened 8 months ago

hannahslater commented 8 months ago

We were interested if there was different behaviour between having an intervention turned off compared to having it on at zero coverage. If you don’t set.seed, you get two runs that are different the whole way through (as you’d expect, and we were assuming this is the normal stochastic variation between runs)

But then if you set.seed before each run (and assuming the same seed), that’s when it gets a bit weird…(code attached is extended from your example on th

this refers to the second graph that the code below makes Black line is the simplest model run with no interventions Red line is LLINs turned ON but coverage = 0, but bednetstimesteps <- c(1, 4) Blue line is LLINs turned on, coverage = c(0, 0.4) and bednetstimesteps <- c(1, 4)

So you see that the black and red lines (which should be the same if they have the same set.seed?), but they start to diverge at t=365, the first LLIN timestep And then the blue and red lines (both LLIN on) are overlapping until the second LLIN timestep

So is something happening where the seed is changing after the interventions are being turned on?

# options(repos = c(
#   mrcide = 'https://mrc-ide.r-universe.dev',
#   CRAN = 'https://cloud.r-project.org'))
# 
# install.packages('malariasimulation')
library(malariasimulation)

#### example of different stochastic behaviour between similar runs 
### and impact of defining set.seed before each sim

year <- 365
sim_length <- 6 * year
human_population <- 5000
starting_EIR <- 50

simparams <- get_parameters(
  list(human_population = human_population)
)

simparams <- set_equilibrium(parameters = simparams, init_EIR = starting_EIR)

bednetstimesteps <- c(1, 4) * year # The bed nets will be distributed at the end of the first and the 4th year.

# BEDNETS ON BUT COVERAGE SET TO ZERO
bednetparams <- set_bednets(
  simparams,
  timesteps = bednetstimesteps,
  coverages = c(0, 0),  # Each round is distributed to 50% of the population.
  retention = 5 * year, # Nets are kept on average 5 years
  dn0 = matrix(c(.533, .533), nrow = 2, ncol = 1), # Matrix of death probabilities for each mosquito species over time
  rn = matrix(c(.56, .56), nrow = 2, ncol = 1), # Matrix of repelling probabilities for each mosquito species over time
  rnm = matrix(c(.24, .24), nrow = 2, ncol = 1), # Matrix of minimum repelling probabilities for each mosquito species over time
  gamman = rep(2.64 * 365, 2) # Vector of bed net half-lives for each distribution timestep
)

# BEDNETS ON AND COVERAGE SET TO 0.4 AT TIME=4 YEARS
bednetparams2 <- set_bednets(
  simparams,
  timesteps = bednetstimesteps,
  coverages = c(0, 0.4),  # Each round is distributed to 50% of the population.
  retention = 5 * year, # Nets are kept on average 5 years
  dn0 = matrix(c(.533, .533), nrow = 2, ncol = 1), # Matrix of death probabilities for each mosquito species over time
  rn = matrix(c(.56, .56), nrow = 2, ncol = 1), # Matrix of repelling probabilities for each mosquito species over time
  rnm = matrix(c(.24, .24), nrow = 2, ncol = 1), # Matrix of minimum repelling probabilities for each mosquito species over time
  gamman = rep(2.64 * 365, 2) # Vector of bed net half-lives for each distribution timestep
)

# EXAMPLE 1 - NO SET SEED
ex1_output_control <- run_simulation(timesteps = sim_length, parameters = simparams)
ex1_output <- run_simulation(timesteps = sim_length, parameters = bednetparams)
ex1_output2 <- run_simulation(timesteps = sim_length, parameters = bednetparams2)

plot(ex1_output_control$timestep, ex1_output_control$n_detect_730_3650/ex1_output_control$n_730_3650, type="l", ylim = c(0.2, 1),
     xlab = "Day", ylab = "Prevalence in 2-10s", las=1, main = "No set seed")
lines(ex1_output$timestep, ex1_output$n_detect_730_3650/ex1_output_control$n_730_3650, col = 2)
lines(ex1_output2$timestep, ex1_output2$n_detect_730_3650/ex1_output_control$n_730_3650, col = 4)
legend("bottomleft", c("nets OFF", "nets ON coverage = 0%", "nets on coverage = 40% at 4yrs"),
       col = c(1,2,4), lty=1)
text(1000, 0.9, "all runs different at all time points")

# EXAMPLE 2 - SET SEED BEFORE EACH SIM
set.seed(12345)
ex2_output_control <- run_simulation(timesteps = sim_length, parameters = simparams)
set.seed(12345)
ex2_output <- run_simulation(timesteps = sim_length, parameters = bednetparams)
set.seed(12345)
ex2_output2 <- run_simulation(timesteps = sim_length, parameters = bednetparams2)

plot(ex2_output_control$timestep, ex2_output_control$n_detect_730_3650/ex2_output_control$n_730_3650, type="l", ylim = c(0.2, 1),
     xlab = "Day", ylab = "Prevalence in 2-10s", las=1, main = "Set seed before each run")
lines(ex2_output$timestep, ex2_output$n_detect_730_3650/ex2_output_control$n_730_3650, col = 2)
lines(ex2_output2$timestep, ex2_output2$n_detect_730_3650/ex2_output_control$n_730_3650, col = 4)
legend("bottomleft", c("nets OFF", "nets ON coverage = 0%", "nets on coverage = 40% at 4yrs"),
       col = c(1,2,4), lty=1)
arrows(365, 0.85, 365, 0.75, length = 0.1)
text(365, 0.95, "All runs identical\nuntil t=365 (first\nbednet timepoint)")
arrows(365*4, 0.85, 365*4, 0.75, length = 0.1)
text(365*4, 0.95, "Red and blue lines also\nindentical until 4yrs\n(time of 2nd bedent timepoint)")

[edited code formatting]

EllieSherrardSmith commented 8 months ago

To add two quick thoughts:

The rn and dn0 parameters used should sum to 1 or less, so I retried the script here with

dn0 = matrix(c(.43, .43), nrow = 2, ncol = 1), # Matrix of death probabilities for each mosquito species over time rn = matrix(c(.56, .56), nrow = 2, ncol = 1), # Matrix of repelling probabilities for each mosquito species over time

I also got a closer match bewteen the red and black line when i set enable_heterogeneity to FALSE in case that helps identify issues

simparams <- get_parameters( list(human_population = human_population, enable_heterogeneity = FALSE) )

pwinskill commented 8 months ago

Looks to me like sample_intervention() is called here even when a bednet timestep is associated with 0 coverage, which would throw the RNG state off. First try would be to add an if statement here to skip that call if coverage at that timepoint == 0 https://github.com/mrc-ide/malariasimulation/blob/397a7b7efe90958dd01f97110a1d16c71d041f33/R/vector_control.R#L137-L142