setting completely correlated interventions does not result in the expected coverage

htopazian commented 4 months ago

When setting the inter_round_rho to 1 between two interventions, the model does not assign the same individuals to get interventions. For example, with 50% ITNs across the whole population and 50% SMC among children and perfect correlation, I would expect all of those children receiving SMC to also get a bednet. But the level is <100%, the model seems to prioritize more nets to the children with SMC, but not at the level set by the correlation.

This test-case uses the feat/correlation_outputs branch, which has added in model output variables for the number of individuals receiving combined interventions.


library(tidyverse)

run_correlation <- function(int1, int2){

  # test with malariasimulation

  year <- 365
  month <- year / 12
  warmup <- 0 * year 
  runtime <- 10 * year
  human_population <- 1000 

  params <- get_parameters(list(
    human_population =  human_population,
    model_seasonality = FALSE,
    individual_mosquitoes = FALSE))

  # treatment ----------
  params <- set_drugs(
    parameters = params,
    list(AL_params, SP_AQ_params))

  # AL default, SP: https://doi.org/10.1016/S2214-109X(22)00416-8 supplement
  params$drug_prophylaxis_scale <- c(10.6, 39.34) 
  params$drug_prophylaxis_shape <- c(11.3, 3.40) 

  params <- set_clinical_treatment(
    parameters = params,
    drug = 1,
    timesteps = c(1),
    coverages = c(0.45)
  ) 

  # ITNs ----------
  # find values in S.I. of 10.1038/s41467-018-07357-w
  # or in Table S1.3 of https://www.sciencedirect.com/science/article/pii/S2542519621002965#sec1
  if(int1 == "bednets" | int2 == "bednets"){
    bednet_timesteps <- c(seq(1, runtime, 3 * year))

    params <- set_bednets(
      parameters = params,
      timesteps = bednet_timesteps,
      coverages = rep(0.5, length(bednet_timesteps)),    # set intervention coverage
      retention = 5 * year,
      dn0 = matrix(rep(0.387, length(bednet_timesteps)), nrow = length(bednet_timesteps), ncol = 1),
      rn = matrix(rep(0.563, length(bednet_timesteps)), nrow = length(bednet_timesteps), ncol = 1),
      rnm = matrix(rep(.24, length(bednet_timesteps)), nrow = length(bednet_timesteps), ncol = 1),
      gamman = rep(2.64 * 365, length(bednet_timesteps))
    )
  }

  # SMC ----------
  if(int1 == "smc" | int2 == "smc"){
    params <- set_smc(
      parameters = params,
      drug = 2,
      timesteps = seq(1, runtime, year),
      coverages = rep(0.5, length(seq(1, runtime, year))),
      min_ages = rep((0.25 * year), length(seq(1, runtime, year))),
      max_ages = rep((5 * year), length(seq(1, runtime, year))))
  }

  # RTS,S ----------
  if(int1 == "pev" | int2 == "pev"){
    params <- set_mass_pev(
      parameters = params,
      profile = rtss_profile, 
      timesteps = seq(1, runtime, year), 
      coverages = rep(0.5, length(seq(1, runtime, year))),
      min_ages = rep(round(5 * month), length(seq(1, runtime, year))),
      max_ages = rep(round(17 * month), length(seq(1, runtime, year))),
      min_wait = 0,
      booster_spacing = 365,
      booster_coverage = matrix(0.80, nrow =  length(seq(1, runtime, year)), ncol = 1), 
      booster_profile = list(rtss_booster_profile)
    )
  }

  # outcome definitions ----------
  # incidence for every 5 year age group
  params$clinical_incidence_rendering_min_ages = c(c(0, 0) * year, round(0.25 * year), round(5 * month))
  params$clinical_incidence_rendering_max_ages = c(c(5, 200) * year, 5 * year, round(17 * month))
  params$severe_incidence_rendering_min_ages = c(0, 0) * year
  params$severe_incidence_rendering_max_ages = c(5, 200) * year

  # prevalence 2-10 year olds
  params$prevalence_rendering_min_ages = 2 * year
  params$prevalence_rendering_max_ages = 10 * year

  # EIR equilibrium ----------
  params <- set_equilibrium(params, 20)

  # set correlations and run intervention

  # anti correlation
  set.seed(123)
  correlations <- get_correlation_parameters(params)
  correlations$inter_intervention_rho(int1, int2, -1)
  correlations$inter_round_rho(int1, 1)
  correlations$inter_round_rho(int2, 1)

  set.seed(123)
  output1 <- run_simulation(
    timesteps = runtime,
    parameters = params,
    correlations = correlations
  )

  # no correlation
  set.seed(123)
  correlations <- get_correlation_parameters(params)
  correlations$inter_intervention_rho(int1, int2, 0)
  correlations$inter_round_rho(int1, 1)
  correlations$inter_round_rho(int2, 1)

  set.seed(123)
  output2 <- run_simulation(
    timesteps = runtime,
    parameters = params,
    correlations = correlations
  )

  # full correlation
  set.seed(123)
  correlations <- get_correlation_parameters(params)
  correlations$inter_intervention_rho(int1, int2, 1)
  correlations$inter_round_rho(int1, 1)
  correlations$inter_round_rho(int2, 1)

  set.seed(123)
  output3 <- run_simulation(
    timesteps = runtime,
    parameters = params,
    correlations = correlations
  )

  # bind results
  output_all <- full_join(
    output1 |> mutate(correlation = "-1"),
    output2 |> mutate(correlation = "0")) |>
    full_join(output3 |> mutate(correlation = "1")) |>
    mutate(testrun = paste0(int1, "_", int2))

  # make list of vars specific to that intervention combo
  if(int1 == "bednets" & int2 == "smc") {int_vars <- c("n_net", "n_smc_treated")
  int_vars2 <- c("n_combined_smc_bednets")}

  if(int1 == "bednets" & int2 == "pev")  {int_vars <- c("n_net", "n_pev_mass_dose_3")
  int_vars2 <- c("n_combined_pev_bednets")}

  if(int1 == "pev" & int2 == "smc") {int_vars <- c("n_smc_treated", "n_pev_mass_dose_3")
  int_vars2 <- c("n_combined_pev_smc")}

  output_all <- output_all |>

    # statistics by month
    mutate(year = ceiling(timestep/year),
           month = ceiling(timestep/month)) |>

    # take means of populations and sums of cases by month
    group_by(month, year, correlation) |>

    summarize(across(c(n_0_1825, n_0_73000, n_730_3650, 
                       n_91_1825, n_152_517,
                       n_detect_730_3650, all_of(int_vars2)), ~ mean(.x, na.rm = TRUE)),
              across(c(n_inc_clinical_0_1825, n_inc_clinical_0_73000, 
                       n_inc_clinical_91_1825, n_inc_clinical_152_517,
                       n_inc_severe_0_1825, n_inc_severe_0_73000,
                       n_treated, n_infections, all_of(int_vars)), ~ sum(.x, na.rm = TRUE)))

  return(output_all)

}

dat1 <- run_correlation("bednets", "smc")
dat2 <- run_correlation("bednets", "pev")
dat3 <- run_correlation("pev", "smc")

monthyear <- seq(1, runtime / month, 12)

dat_all <- dat1 |> mutate(name = "smc_bednet") |>
  full_join(dat2 |> mutate(name = "pev_bednet")) |>
  full_join(dat3 |> mutate(name = "pev_smc")) |>
  ungroup() |>
  filter((name == "smc_bednet" & month == 1) | (name != "smc_bednet" & month %in% c(1,2,3,4))) |>
  group_by(name, correlation) |>
  summarize(n_net = sum(n_net),
            n_smc = sum(n_smc_treated),
            n_pev = sum(n_pev_mass_dose_3),
            n_smc_net = sum(n_combined_smc_bednets),
            n_pev_net = sum(n_combined_pev_bednets),
            n_pev_smc = sum(n_combined_pev_smc)) |>
  pivot_longer(cols = n_net:n_pev_smc, names_to = "variable", values_to = "value")

ggplot(data = dat_all |> filter(grepl("n_", variable))) +
  geom_col(aes(x = variable, y = value, fill = name, group = variable), position = position_dodge()) +
  facet_grid(cols = vars(correlation), rows = vars(name)) + 
  theme_classic() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) 

# the number of those receiving the combined intervetion should = the n with smc or pev
dat_all |>
  filter(correlation == 1) |>
  filter(!is.na(value))
``

plietar commented 4 months ago

Could this be caused by mortality? An individual's state gets reset when they die (in the reset_target function), including their intervention state.

Some people get intervention A. This number gets reported accurately in the n_net (or equivalent) output.
A few that received the intervention die. Their status for intervention A is cleared.
More people get intervention B. Again this is reported in, eg. n_smc_treated.
Individuals marked as having received both interventions are recorded as n_combined_smc_bednets

That final output excludes any individuals that died at step 2.

Actually net might have been a bad example here since I later realised it gets reset when the net is thrown away rather than at death. The general idea still applies though.

plietar commented 4 months ago

Digging around a bit more, a few more factors that may be relevant:

The n_smc and n_combined_smc_xxx outputs have a slight discrepancy. The n_smc includes everyone who received the drug, whereas n_combined_smc_xxx is based on the smc_time variable, which only includes those for whom the drug worked. Setting the drug efficacy to 1 makes that discrepancy go away. Presumably a separate variable for smc_received_time would be better. See the create_mda_listeners function.
The net_time variable gets reset when the net is thrown away rather than at death. This will affect the correlation outputs that depend on it. See throw_away_nets
The process of averaging the combined variables over the given month feels a bit unreliable. Might be best to changed the combined variables to only report whether someone received the intervention in the last ~30 days (instead of 365) and use the last value for each month. There's probably a lot of corner cases to watch out for, especially not all months are 30 days.

As far as I can tell the actual simulation behaviour and correlation works, it is the reporting that is a inconsistent and/or ambiguous.

htopazian commented 4 months ago

Thanks very much for the input, Paul! You make some very good points 😄 I'll have to reconfigure the combined output variables and the time window in which they are calculated.

mrc-ide / malariasimulation

setting completely correlated interventions does not result in the expected coverage #307