epiverse-trace / epidemics

A library of published compartmental epidemic models, and classes to represent demographic structure, non-pharmaceutical interventions, and vaccination regimes, to compose epidemic scenarios.
https://epiverse-trace.github.io/epidemics/
Other
8 stars 2 forks source link

Default model with odin-backend example #222

Open pratikunterwegs opened 5 months ago

pratikunterwegs commented 5 months ago

Followin from a meeting with @adamkucharski @sbfnk and @MJomaba and @LloydChapman, this issue is to request an implementation of the default model using {odin} as a backend. Having the implementation as a vignette initially could be a good starting point and help explore how the two packages could be integrted.

We'd want to see the current epidemics features preserved:

  1. Age stratification with age-dependent social contacts;
  2. Time-dependence on infection parameters;
  3. Interventions on social contacts, and vaccination specified by a <vaccination>.
pratikunterwegs commented 5 months ago

Here's a brief initial reprex for an SIR model with age-stratified social contacts, and a crude intervention on the transmission rate. Proof of concept only, but SEIR is only a few changes away.

Some thoughts:

  1. Usability could be improved by having some way of passing the initial conditions more flexibly, as they are (have to be?) hardcoded currently - will look into this;
  2. I think it should be possible to pass time-limitation flags as 2D or 3D arrays to implement interventions and vaccination regimes, but passing the {epidemics} classes themselves might be iffy.
library(odin)
library(socialmixr)
#> 
#> Attaching package: 'socialmixr'
#> The following object is masked from 'package:utils':
#> 
#>     cite
library(data.table)
library(ggplot2)

# get social contacts data
polymod <- socialmixr::polymod
contact_data <- socialmixr::contact_matrix(
  polymod,
  countries = "United Kingdom",
  age.limits = c(0, 20, 40),
  symmetric = TRUE
)
#> Using POLYMOD social contact data. To cite this in a publication, use the 'cite' function
#> Removing participants that have contacts without age information. To change this behaviour, set the 'missing.contact.age' option

# assume arbitrary populations for each demo group
population_size <- rep(1e6, 3)

# prepare contact matrix, divide by leading eigenvalue and rowwise by popsize
contact_matrix <- t(contact_data[["matrix"]])
contact_matrix <- (contact_matrix / max(Re(eigen(contact_matrix)$values))) /
  population_size

# View
contact_matrix
#>                  
#> contact.age.group         [,1]         [,2]         [,3]
#>           [0,20)  6.681250e-07 2.367991e-07 1.326870e-07
#>           [20,40) 2.644325e-07 4.114381e-07 2.224524e-07
#>           40+     2.596591e-07 3.898319e-07 4.242123e-07

# define SIR model
sir <- odin::odin({
  # NOTE: crude time-dependence of transmission rate
  # similar to a `<rate_intervention>`
  beta_ <- if (t > 210 && t < 300) beta / 2 else beta

  # number of age groups taken from contact matrix passed by user
  n <- user()

  # FOI is contacts * infectious * transmission rate
  # returns a matrix, must be converted to a vector/1D array
  lambda_prod[, ] <- C[i, j] * I[j] * beta_
  lambda[] <- sum(lambda_prod[i, ])

  ## Derivatives
  deriv(S[]) <- -S[i] * lambda[i]
  deriv(I[]) <- (S[i] * lambda[i]) - (gamma * I[i])
  deriv(R[]) <- gamma * I[i]

  ## Initial conditions: seems these cannot be passed by user?
  initial(S[]) <- 1e6 - 1
  initial(I[]) <- 1
  initial(R[]) <- 0

  ## parameters
  beta <- 1.3 / 7
  gamma <- 1 / 7
  C[, ] <- user() # user defined contact matrix

  # dimensions - all rely on contact matrix
  dim(lambda_prod) <- c(n, n)
  dim(lambda) <- n
  dim(S) <- n
  dim(I) <- n
  dim(R) <- n
  dim(C) <- c(n, n)
})
#> Loading required namespace: pkgbuild
#> Generating model in c
#> ℹ Re-compiling odin955a4062 (debug build)
#> ── R CMD INSTALL ───────────────────────────────────────────────────────────────
#> * installing *source* package ‘odin955a4062’ ...
#> ** using staged installation
#> ** libs
#> using C compiler: ‘Apple clang version 15.0.0 (clang-1500.0.40.1)’
#> using SDK: ‘MacOSX14.0.sdk’
#> clang -arch arm64 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/opt/R/arm64/include    -fPIC  -falign-functions=64 -Wall -g -O2  -UNDEBUG -Wall -pedantic -g -O0 -c odin.c -o odin.o
#> clang -arch arm64 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/opt/R/arm64/include    -fPIC  -falign-functions=64 -Wall -g -O2  -UNDEBUG -Wall -pedantic -g -O0 -c registration.c -o registration.o
#> clang -arch arm64 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/opt/R/arm64/lib -o odin955a4062.so odin.o registration.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
#> ld: warning: -single_module is obsolete
#> ld: warning: -multiply_defined is obsolete
#> installing to /private/var/folders/3t/nk820_1d4fvf8r90j7j1ldbr0000gp/T/RtmpKSY1jr/devtools_install_70bc1415b97d/00LOCK-file70bc4d32de76/00new/odin955a4062/libs
#> ** checking absolute paths in shared objects and dynamic libraries
#> * DONE (odin955a4062)
#> ℹ Loading odin955a4062

# initialise model and run over time 0 - 600
mod <- sir$new(C = contact_matrix, n = nrow(contact_matrix))
t <- seq(0, 600)
y <- mod$run(t)

# convert to data.table and plot infectious in each age class
y <- as.data.table(y)
y <- melt(y, id.vars = c("t"))

ggplot(y[variable %like% "I"]) +
  geom_line(aes(t, value, col = variable)) +
  facet_grid(~variable)

Created on 2024-04-25 with reprex v2.0.2

adamkucharski commented 5 months ago

Thanks for laying this out. It should be possible to pass input parameters, e.g. see early code in this vignette @lloydchapman put together and corresponding odin model.

Perhaps a starting point would be to implement the 'getting started' {epidemics} model as {odin} (or a similar vignette if that's easier/quicker), so can check outputs all match, then identify places we could streamline or improve integration?

pratikunterwegs commented 5 months ago

Thanks, got it. One thing that's seem to be more difficult than I initially thought is event handling of time-limited interventions. As in the example, it's relatively easy to pass a single uniform intervention within a single range, but I'm unsure whether the DSL allows for multiple overlapping interventions affecting age groups differently (which we'd want for policy-relevant scenarios e.g. school/work closures). Happy to get any pointers on this from @LloydChapman!

adamkucharski commented 5 months ago

Thanks. So sounds like it should be feasible to reproduce the basic README example while we dig more into how to implement overlapping interventions?

It must be doable in theory given {sircovid} etc. (I'm not sure how much overlap we had with interventions in FP analysis). But, of course, depends how much of this needs to be hard coded in DSL, versus the flexibility of defining as inputs for current {epidemics} models

pratikunterwegs commented 5 months ago

Yes, that's the example shown as a reprex here already. Since @avallecam is also referencing this issue might it be better to add this to the How To repo instead? Would also save on adding dependencies.

If I'm right sircovid implemented interventions as time-dependence on transmission rate, see linked issue and related project(as in the reprex) rather than group specific contact reduction. That might be doable but equally I think it's where epidemics steals a march.

adamkucharski commented 5 months ago

Yep, howto seems like good home for now. For full consistency, would be useful to see if it's possible for above reprex to be wrapped into a function call for existing script (rather than needing ICs, parameters etc. to be defined in the odin code), e.g. so below function calls a compiled odin backend, rather than current one:

model_default_odin(
  population = uk_population,
  transmission_rate = 1.5 / 7.0,
  infectiousness_rate = 1.0 / 3.0,
  recovery_rate = 1.0 / 7.0,
  intervention = list(contacts = close_schools),
  time_end = 600, increment = 1.0
)
pratikunterwegs commented 5 months ago

Sounds good, I'll figure out the wrapping and make a PR to {howto}.

LloydChapman commented 5 months ago

Thanks, got it. One thing that's seem to be more difficult than I initially thought is event handling of time-limited interventions. As in the example, it's relatively easy to pass a single uniform intervention within a single range, but I'm unsure whether the DSL allows for multiple overlapping interventions affecting age groups differently (which we'd want for policy-relevant scenarios e.g. school/work closures). Happy to get any pointers on this from @LloydChapman!

Good question. From looking through the Imperial papers, it looks like they never actually modelled interventions like school closures as affecting different age groups differently (they just changed the overall transmission rate coefficient). It should be perfectly possible to do, though, by making the "contact" matrix (or strictly speaking "transmission" matrix) in the code a time-dependent function of some user inputs, but then you do need to hard code this in the model. There's some flexibility in terms of how many changepoints you have and how "parameter" values change, as you can specify a vector/list of input values for the parameter/contact matrix at the changepoints and then convert these to a time-dependent function as e.g. here and here, where I'm pretty sure the function that does the conversion can be anything that interpolates the values at the changepoints over all the simulated time points.

pratikunterwegs commented 5 months ago

Thanks @LloydChapman, I'll take a look but as I'm not an {odin} user the linked examples might be difficult to understand right away - could you provide a reprex of what you mean? Just to clarify, does {odin} support matrix operations and function definitions within the model code?

sbfnk commented 4 months ago
  ## Initial conditions: seems these cannot be passed by user?

I think you can just define e.g. a user-defined matrix and then use that, i.e.

  init[, ] <- user()
  dim(init) <- c(3, n)
  initial(S[]) <- init[1, i]
  initial(I[]) <- init[2, i]
  initial(R[]) <- init[3, i]

and then in the call

mod <- sir$new(
  C = contact_matrix,
  n = nrow(contact_matrix),
  init = matrix(c(rep(1e6 - 1, 3), rep(1, 3), rep(0, 3)), byrow = TRUE, ncol = 3)
)
sbfnk commented 4 months ago

Thanks @LloydChapman, I'll take a look but as I'm not an {odin} user the linked examples might be difficult to understand right away - could you provide a reprex of what you mean? Just to clarify, does {odin} support matrix operations and function definitions within the model code?

Don't know much about odin but looking at your existing code could you do something like the following?

  C[, ] <- user() # user defined contact matrix
  C_[, ] <- C[i, j]
  C_[1, 1] <- if (t > 210 && t < 300) C[1, 1] / 2 else C[1, 1]
adamkucharski commented 4 months ago

The {seir} package vignette has also been updated (thanks @LloydChapman for also fixing the fitting issue!) so hopefully clearer here how to pass pre-defined inputs into model: https://github.com/LloydChapman/seir/blob/main/vignettes/fit_seirdage.Rmd

pratikunterwegs commented 4 months ago

Thanks both - will try your suggestion Seb, and also look into the updated {seir} vignettes.

pratikunterwegs commented 4 months ago

Logging some observations on developing with {odin} and {Rcpp} in the same package (from earlier Slack chain):

  1. Running odin::odin_package(".") to generate src/odin.c and src/registration.c removes autogenerated static const R_CallMethodDef CallEntries[] ... from src/RcppExports.cpp as the call entries are re-defined in src/registration.c.
  2. Removing src/registration.c and re-running Rcpp::compileAttributes() regenerates the {odin}-generated call entries in src/RcppExports.cpp, and includes the RcppExport-ed from other functions in src/ as well.
  3. Then, running devtools::document() removes the RcppExport-ed call entries from src/RcppExports.cpp (same state as step 1).
  4. Code generated using {odin} does not seem to be documentable using {roxygen2}, and likely requires floating documentation with manually specified names.
adamkucharski commented 2 months ago

I've looked into {odin} in more detail, building on this implementation adapted from @pratikunterwegs 's howto example, which now reproduces the basic README example with one intervention.

It's possible to define multiple overlapping interventions with some matrix indexing and using log transformed to add up their multiplicative effect. Here is an outline in Rmd format of the model that reproduces the multiple interventions vignette.

There's still some work to do, e.g. adding input validation in model_default_odin, so will add issues for these, but I think this could be a nice accessible way for users to edit {epidemics} models, plus looks like it can open up route to multiple overlapping vaccine campaigns, stochastic implementation via {odin.dust} without redefining model equations etc.

LloydChapman commented 1 month ago

Thanks @LloydChapman, I'll take a look but as I'm not an {odin} user the linked examples might be difficult to understand right away - could you provide a reprex of what you mean? Just to clarify, does {odin} support matrix operations and function definitions within the model code?

Hiya. Sorry for the very slow reply. The way you've defined multiple overlapping interventions, @adamkucharski, and the maths with the log transformation (https://github.com/epiverse-trace/epidemics/issues/222#issuecomment-2245138546) look good to me. For interventions targeted at a specific age group, @sbfnk is essentially right (https://github.com/epiverse-trace/epidemics/issues/222#issuecomment-2112863813). I've put together a reprex:

library(odin)
library(odin.dust)

seiragetimedepcontact <- odin_dust({
    ## Definition of the time-step and output as "time" 
    dt <- user()
    initial(time) <- 0
    update(time) <- (step + 1) * dt
    steps_per_day <- 1/dt

    ## Core equations for transitions between compartments: 
    update(S_tot) <- S_tot - sum(n_SE)
    update(E_tot) <- E_tot + sum(n_SE) - sum(n_EI)
    update(I_tot) <- I_tot + sum(n_EI) - sum(n_IR)
    update(R_tot) <- R_tot + sum(n_IR)

    ## Equations for transitions between compartments by age group
    update(S[]) <- S[i] - n_SE[i]
    update(E[]) <- E[i] + n_SE[i] - n_EI[i]
    update(I[]) <- I[i] + n_EI[i] - n_IR[i]
    update(R[]) <- R[i] + n_IR[i]

    ## Individual probabilities of transition:
    p_SE[] <- 1 - exp(-lambda[i] * dt) # S to E
    p_EI <- 1 - exp(-sigma * dt) # E to I
    p_IR <- 1 - exp(-gamma * dt) # I to R

    ## Force of infection
    m_step[,,] <- user() # array of time-dep age-structured contact matrices
    m[,] <- if (as.integer(step) >= dim(m_step,3))
        m_step[i, j, dim(m_step,3)] else m_step[i, j, step + 1]
    s_ij[, ] <- m[i, j] * I[i]
    lambda[] <- beta * sum(s_ij[, i])

    ## Draws from binomial distributions for numbers changing between 
    ## compartments:
    n_SE[] <- rbinom(S[i], p_SE[i])
    n_EI[] <- rbinom(E[i], p_EI)
    n_IR[] <- rbinom(I[i], p_IR)

    ## Initial conditions 
    initial(S_tot) <- sum(S_ini) 
    initial(E_tot) <- sum(E_ini)
    initial(I_tot) <- sum(I_ini) 
    initial(R_tot) <- 0

    initial(S[]) <- S_ini[i]
    initial(E[]) <- E_ini[i]
    initial(I[]) <- I_ini[i]
    initial(R[]) <- 0

    ## Store infection and death incidence
    initial(I_inc) <- 0
    update(I_inc) <- if (step %% steps_per_day == 0) sum(n_EI) else I_inc + sum(n_EI)

    ## Model parameters (default in parenthesis) 
    S_ini[] <- user()
    E_ini[] <- user()
    I_ini[] <- user()
    beta <- user(0.0165)
    sigma <- user(0.1)
    gamma <- user(0.1)

    ## Array dimensions
    N_age <- user()
    dim(S_ini) <- N_age
    dim(E_ini) <- N_age
    dim(I_ini) <- N_age
    dim(S) <- N_age
    dim(E) <- N_age
    dim(I) <- N_age
    dim(R) <- N_age
    dim(n_SE) <- N_age
    dim(n_EI) <- N_age
    dim(n_IR) <- N_age
    dim(p_SE) <- N_age
    dim(lambda) <- N_age
    dim(m_step) <- user()
    dim(m) <- c(N_age, N_age)
    dim(s_ij) <- c(N_age, N_age)
})

# Get age-structured contact matrix
data(polymod, package = "socialmixr")

age.limits <- seq(0,70,10)
contact <- socialmixr::contact_matrix(survey = polymod,countries = "United Kingdom",age.limits = age.limits,symmetric = T)

# Transform the matrix to the (symmetrical) transmission matrix
# rather than the contact matrix. This transmission matrix is
# weighted by the population in each age band.
transmission <- contact$matrix/rep(contact$demography$population, each = ncol(contact$matrix))
transmission

# Set up model
N_age <- length(age.limits)
dt <- 0.25
S_ini <- contact$demography$population
E_ini <- c(0,0,0,0,0,0,0,0)
I_ini <- c(0,10,0,0,0,0,0,0)

# Run epidemic forward
n_steps <- 1000

# Make array of transmission matrices for time-dependent contact rates
# Introduce an intervention that reduces contacts in school-aged children
# (taken as 0-20-year-olds in this model) by 90%
transmission_reduced_school_contacts <- transmission
transmission_reduced_school_contacts[1:2,1:2] <- transmission[1:2,1:2]/10
# Set intervention to start 80 days into simulation and stop after 125 days
transmission_time <- c(0, 50, 125)
transmission_value <- array(c(transmission,
                              transmission_reduced_school_contacts,
                              transmission), dim = c(N_age, N_age, 3))

# N.B. Could do this with a fancier function that interpolates the values across 
# the slices of the array transmission_value, but keeping it simple here for 
# demo purposes
transmission_step <- 
    array(NA, dim = c(N_age, N_age, 
                      round(transmission_time[length(transmission_time)]/dt)+1L))
for (i in seq_along(transmission_time)[-length(transmission_time)]){
    start_step <- transmission_time[i]/dt + 1
    end_step <- (transmission_time[i+1])/dt
    transmission_step[,,start_step:end_step] <- transmission_value[,,i]
}
transmission_step[,,dim(transmission_step)[3]] <- 
    transmission_value[,,dim(transmission_value)[3]]

model <- seiragetimedepcontact$new(
    list(dt = dt,S_ini = S_ini,E_ini = E_ini,I_ini = I_ini,
         beta = 0.03,sigma = 0.5,gamma = 0.1,
         m_step = transmission_step,N_age = N_age),
    step = 0,n_particles = 1,n_threads = 1,seed = 1)

# Create an array to contain outputs after looping the model.
x <- array(NA, dim = c(model$info()$len, 1, n_steps+1))

# For loop to run the model iteratively
x[ , ,1] <- model$state()
for (t in seq_len(n_steps)) {
    x[ , ,t+1] <- model$run(t)
}
time <- x[1,1,-1]

# Drop time row
x <- x[-1, , ,drop=FALSE]

# Plot trajectories
par(mfrow = c(2,4), oma=c(2,3,0,0))
for (i in 1:N_age) {
    par(mar = c(3, 4, 2, 0.5))
    cols <- c(S = "#8c8cd9", E = "#ffa500", I = "#cc0044", R = "#999966")
    matplot(time, x[i + 5, ,-1], type = "l", # Offset to access numbers in age compartment
            xlab = "", ylab = "", yaxt="none", main = paste0("Age ", contact$demography$age.group[i]),
            col = cols[["S"]], lty = 1, ylim=range(x[-1:-4,,]))
    matlines(time, x[i + 5 + N_age, ,-1], col = cols[["E"]], lty = 1)
    matlines(time, x[i + 5 + 2*N_age, ,-1], col = cols[["I"]], lty = 1)
    matlines(time, x[i + 5 + 3*N_age, ,-1], col = cols[["R"]], lty = 1)
    legend("right", lwd = 1, col = cols, legend = names(cols), bty = "n")
    axis(2, las =2)
}
mtext("Number of individuals", side=2,line=1, outer=T)
mtext("Time", side = 1, line = 0, outer =T)

This takes a 3D array (you could alternatively use a list) of transmission matrices (scaled contact matrices), transmission_value, and a vector of time changepoints representing the starts and ends of the intervention periods, transmission_time, and converts them into a 3D array for the transmission matrix at all steps in the simulation (up to the last changepoint), transmission_step, with step changes between the transmission matrix entries at the changepoints (you could use a fancier function with smooth changes).

{odin} doesn't support function definitions in the model code or really matrix/array operations (apart from summing over rows/columns/slices).