epiverse-trace / epidemics

A library of published compartmental epidemic models, and classes to represent demographic structure, non-pharmaceutical interventions, and vaccination regimes, to compose epidemic scenarios.
https://epiverse-trace.github.io/epidemics/
Other
8 stars 2 forks source link

Develop model_default_odin() for compatibility with vignettes #254

Open adamkucharski opened 1 month ago

adamkucharski commented 1 month ago

We have an initial example of {odin} implementation of the overlapping interventions vignette on the odin-default-model branch.

The next step will to add the relevant input checking and output formatting so model_default_odin() behaves the same as model_default() (i.e. could swap out in the vignette and user wouldn't notice if both compiled models provided).

bahadzie commented 4 weeks ago

Input checking and output formatting for model_default_odin() are done. I've added _test-model_defaultodin.R a duplicate of _test-modeldefault.R swapping out model_default() for model_default_odin() for all the test that don't generate a warning or error.

All the failing test are related to your slack comment about allowing vectorised inputs.

All commits are in the odin-default-model branch.

Below is the summary of devtools::test().

Passing tests

:white_check_mark: Default odin model: basic expectations, scalar arguments :white_check_mark: Default odin model: statistical correctness, parameters :white_check_mark: Default odin model: contacts interventions and stats. correctness :white_check_mark: Default odin model: rate interventions :white_check_mark: Default odin model: errors and warnings, scalar arguments :white_check_mark: Default odin model: errors on vectorised input

Warning test

:warning: Default model: vaccination and stats. correctness

DLSODA-  Warning..Internal T (=R1) and H (=R2) are
      such that in the machine, T + H = T on the next step  
     (H = step size). Solver will continue anyway.
In above message, R1 = 200, R2 = 4.47248e-15

Failing tests

:bug: Default model: time dependence

Failure (test-model_default_odin.R:295:3): Default model: time dependence
all(epidemic_size(data_baseline) > epidemic_size(data)) is not TRUE

`actual`:   FALSE
`expected`: TRUE

:bug: Default model: infection parameters as vectors

Failure (test-model_default_odin.R:423:3): Default model: infection parameters as vectors
Expected `model_default_odin(...)` to run without any conditions.
i Actually got a <simpleError> with text:
  Expected a scalar numeric for 'beta'

Error (test-model_default_odin.R:431:3): Default model: infection parameters as vectors
Error in `self$set_user(user = user, unused_user_action = unused_user_action)`: Expected a scalar numeric for 'beta'
Backtrace:
    ▆
 1. └─epidemics::model_default_odin(...) at test-model_default_odin.R:431:3
 2.   └─seirv_model$new(...) at epidemics/R/model_default.R:614:3
 3.     └─odin10ab6dad (local) initialize(...)
 4.       └─self$set_user(user = user, unused_user_action = unused_user_action) at filefb9c3b0b9ae8/R/odin.R:60:7

:bug: Default model: composable elements as lists

Failure (test-model_default_odin.R:483:3): Default model: composable elements as lists
Expected `model_default_odin(uk_population, intervention = npi_list)` to run without any conditions.
i Actually got a <simpleError> with text:
  Expected a scalar numeric for 'beta'

Error (test-model_default_odin.R:488:3): Default model: composable elements as lists
Error in `self$set_user(user = user, unused_user_action = unused_user_action)`: Expected a scalar numeric for 'beta'
Backtrace:
    ▆
 1. └─epidemics::model_default_odin(uk_population, intervention = npi_list) at test-model_default_odin.R:488:3
 2.   └─seirv_model$new(...) at epidemics/R/model_default.R:614:3
 3.     └─odin10ab6dad (local) initialize(...)
 4.       └─self$set_user(user = user, unused_user_action = unused_user_action) at filefb9c3b0b9ae8/R/odin.R:60:7

Failure (test-model_default_odin.R:536:3): Default model: multi-parameter, multi-composables
Expected `model_default_odin(...)` to run without any conditions.
i Actually got a <simpleError> with text:
  Expected a scalar numeric for 'beta'

:bug: Default model: multi-parameter, multi-composables

Failure (test-model_default_odin.R:536:3): Default model: multi-parameter, multi-composables
Expected `model_default_odin(...)` to run without any conditions.
i Actually got a <simpleError> with text:
  Expected a scalar numeric for 'beta'

Error (test-model_default_odin.R:545:3): Default model: multi-parameter, multi-composables
Error in `self$set_user(user = user, unused_user_action = unused_user_action)`: Expected a scalar numeric for 'beta'
Backtrace:
    ▆
 1. └─epidemics::model_default_odin(...) at test-model_default_odin.R:545:3
 2.   └─seirv_model$new(...) at epidemics/R/model_default.R:614:3
 3.     └─odin10ab6dad (local) initialize(...)
 4.       └─self$set_user(user = user, unused_user_action = unused_user_action) at filefb9c3b0b9ae8/R/odin.R:60:7
bahadzie commented 4 weeks ago

epidemics_odin.Rmd is a clone of epidemics.Rmd that compares model_default() with model_default_odin(). The results are equal to within a tolerance of 0.4.

adamkucharski commented 4 weeks ago

Nice work, thanks for putting together! Should be fine to close this issue now these have been added, once below point about potential mismatch clarified? Then can move follow up steps to future issues.

A few quick questions just to make sure I'm following the changes:

adamkucharski commented 4 weeks ago

Also noticed that we seem to have a mismatch in the outputs now. E.g. below using the original model_default:

Screenshot 2024-10-25 at 13 23 10

And this when we replace model_default with model_default_odin:

Screenshot 2024-10-25 at 13 22 26

The two outputs match when using the odin model in the modelling_scenarios_odin.Rmd example (which has below output):

Screenshot 2024-10-25 at 13 24 30

Any thoughts on what might have changed? From a quick look, it may be to do with the contact matrix normalisation in the correct vignette odin model, which doesn't seem to appear in model_default_odin?

  contact_matrix_norm <- population$contact_matrix
  contact_matrix_norm <- (contact_matrix_norm / max(Re(eigen(contact_matrix_norm)$values))) /
  population$demography_vector
bahadzie commented 3 weeks ago

Any thoughts on what might have changed? From a quick look, it may be to do with the contact matrix normalisation in the correct vignette odin model, which doesn't seem to appear in model_default_odin?

Contact matrix normalisation code is in .prepare_population() which is called from .check_prepare_args_default() which in turn is called from model_default() and model_default_odin() respectively.

I mirrored the first 122 lines of model_default() in model_default_odin() because

Another potential source of the difference can be in the way odin expects its parameters to be supplied.

bahadzie commented 3 weeks ago

Does seirv_model in model_default.R compile using {odin} when {epidemics} is first installed? A useful next step for useability could be to precompile the model so it can be loaded with the package, like in the {seir} package (unless I've missed somewhere this is already happening).

I think from a user's perspective it'll compile on library() because in development it compiles on devtools::load_all() I don't think {seir} uses odin even though it is mention in README.md it doesn't appear in DESCRIPTION or NAMESPACE. They have a /src folder with cpp files which I suspect are generated using {dust} but {dust} is not currently on CRAN. The seirv model is simple enough to compile quickly on library() and after that it's cached.

Am I correct that the vectorisation in the original model_default() is handled with apply() and Map() functions in the R function, rather than in C++ backend? If so, this should be more straightforward to add this feature to model_default_odin() (can make separate issue for this).

Yes. My preference would be to rename this issue and keep all things model_default_odin together here.

adamkucharski commented 3 weeks ago

Contact matrix normalisation code is in .prepare_population() which is called from .check_prepare_args_default() which in turn is called from model_default() and model_default_odin() respectively.

I mirrored the first 122 lines of model_default() in model_default_odin() because

* it deals with parameter checking and preparation

* didn't want to miss edge cases

* keeping it constant means differences are downstream (solvers/output formatting)

Another potential source of the difference can be in the way odin expects its parameters to be supplied.

I revisited the tolerance check in epidemics_odin.Rmd to understand where difference might be. This is the entry with maximum difference in the output values, which corresponds to a very large difference in the epidemic dynamics:

> output[4548,]
    time demography_group compartment    value
   <num>           <char>      <char>    <num>
1:   303              40+ susceptible 27816495
> output_odin[4548,]
    time demography_group compartment    value
   <num>           <char>      <char>    <num>
1:   303              40+ susceptible 16413209

Noticed there was a discrepancy in the default R0 for the two models (1.5 vs 1.3 - also relates to issue #253 on misdescribed default in another vignette). Have pushed a fix and now this returns all.equal(output, output_odin)=T. The multiple interventions vignette also now outputting correctly (see below). So maybe we need stricter checks on model output tolerances?

Screenshot 2024-10-26 at 17 26 20
adamkucharski commented 3 weeks ago

I think from a user's perspective it'll compile on library() because in development it compiles on devtools::load_all() I don't think {seir} uses odin even though it is mention in README.md it doesn't appear in DESCRIPTION or NAMESPACE. They have a /src folder with cpp files which I suspect are generated using {dust} but {dust} is not currently on CRAN. The seirv model is simple enough to compile quickly on library() and after that it's cached.

Thanks. Yep, {seir} uses odin.dust to compile odin models to dust (e.g. SIR model). But agree that as only {odin} on CRAN, makes sense to focus on this. And having run with fresh install, compilation is nice and quick (actually, seems smoother than the original model_default C++ version).

Yes. My preference would be to rename this issue and keep all things model_default_odin together here.

Sounds good – shall we just rename title and create new to-do list in thread?

Yes please

bahadzie commented 3 weeks ago

Any thoughts on how to fix the warning. I spent all day trying to debug it. No luck yet.

:warning: Default model: vaccination and stats. correctness

DLSODA-  Warning..Internal T (=R1) and H (=R2) are
     such that in the machine, T + H = T on the next step  
    (H = step size). Solver will continue anyway.
In above message, R1 = 200, R2 = 4.47248e-15

The code below (adapted from the test file test-model_default_odin.R includes everything required to replicate it.

rm(list = ls())
devtools::load_all()

generate_population <- function() {
  # Prepare contact matrix and demography vector
  polymod <- socialmixr::polymod
  contact_data <- socialmixr::contact_matrix(
    polymod,
    countries = "United Kingdom",
    age.limits = c(0, 60),
    symmetric = TRUE
  )
  contact_matrix <- t(contact_data$matrix)
  demography_vector <- contact_data$demography$population

  # make initial conditions - order is important
  initial_conditions <- c(
    S = 1 - 1e-6, E = 0,
    I = 1e-6, R = 0, V = 0
  )
  initial_conditions <- rbind(
    initial_conditions,
    initial_conditions
  )

  # create a population
  population(
    name = "UK population",
    contact_matrix = contact_matrix,
    demography_vector = demography_vector,
    initial_conditions = initial_conditions
  )
}
uk_population <- generate_population()

data_baseline <- model_default(uk_population)
data_baseline_odin <- model_default_odin(uk_population)
all.equal(data_baseline, data_baseline_odin)
## [1] TRUE

all.equal(epidemic_size(data_baseline), epidemic_size(data_baseline_odin))
## [1] "Mean relative difference: 3.632693e-07"

single_vaccination <- vaccination(
  name = "double_vaccination",
  nu = matrix(1e-3, nrow = 2),
  time_begin = matrix(0, nrow = 2),
  time_end = matrix(100, nrow = 2)
)
data <- model_default(uk_population, vaccination = single_vaccination)
data_odin <- model_default_odin(uk_population, vaccination = single_vaccination)
all.equal(data, data_odin)
## [1] "Column 'value': Mean relative difference: 1.898998"

all.equal(epidemic_size(data), epidemic_size(data_odin))
## [1] "Mean relative difference: 0.9805965"

high_rate_vax <- vaccination(
  time_begin = matrix(200, nrow(uk_population$contact_matrix)),
  time_end = matrix(200 + 150, nrow(uk_population$contact_matrix)),
  nu = matrix(0.01, nrow = nrow(uk_population$contact_matrix))
)
data_high_rate_vax <- model_default(
  uk_population,
  vaccination = high_rate_vax, time_end = 600
)
data_high_rate_vax_odin <- model_default_odin(
  uk_population,
  vaccination = high_rate_vax, time_end = 600
)
## Warning messages:
## 1: In lsoda(y, times, func, parms, ...) :
##   an excessive amount of work (> maxsteps ) was done, but integration was not successful - increase maxsteps
## 2: In lsoda(y, times, func, parms, ...) :
##   Returning early. Results are accurate, as far as they go

c(nrow(data_high_rate_vax), nrow(data_high_rate_vax_odin))
## [1] 6010 2010

rows_to_compare <- seq_len(nrow(data_high_rate_vax_odin))

all.equal(
  data_high_rate_vax[rows_to_compare, ],
  data_high_rate_vax_odin[rows_to_compare, ]
)
## [1] "Column 'value': Mean relative difference: 1.666696e-05"

all.equal(
  epidemic_size(data_high_rate_vax[rows_to_compare, ]),
  epidemic_size(data_high_rate_vax_odin[rows_to_compare, ])
)
## [1] "Numeric: lengths (2, 0) differ"

I've also {microbenchmark}ed _modeldefault(slow) v _model_defaultodin(fast).

microbenchmark::microbenchmark(
  model_default(uk_population),
  model_default_odin(uk_population),
  model_default(uk_population, vaccination = single_vaccination),
  model_default_odin(uk_population, vaccination = single_vaccination)
)
## Unit: milliseconds
##                                                                 expr      min
##                                         model_default(uk_population) 4.819167
##                                    model_default_odin(uk_population) 2.531334
##       model_default(uk_population, vaccination = single_vaccination) 4.622542
##  model_default_odin(uk_population, vaccination = single_vaccination) 2.493667
##        lq     mean   median       uq       max neval
##  4.944188 5.187382 5.014063 5.147542 12.064501   100
##  2.601313 3.541654 2.691938 2.832730 70.017876   100
##  4.787147 4.943179 4.842876 4.941834  8.213376   100
##  2.605500 2.850439 2.681106 2.799334  6.832918   100
adamkucharski commented 3 weeks ago

Looks like that's happening because the underlying ODE solver has ended with a very small (~0) time step, likely as the result of a sudden change in dynamics that it's struggling to integrate – maybe from a large influx of vaccination? Strange that it doesn't seem to be an issue for the simple_vaccination example. Tagging @hillalex who may also have some thoughts.

adamkucharski commented 2 weeks ago

I spotted the problem by checking the values of vax_nu that were getting passed to {odin} via model_default_odin. Turns out in .check_prepare_args_default, it scales the vaccination rate to give the number of people vaccinated (see below code), whereas in odin model, we're currently implementing as a per-capita vaccination rate (i.e. proportion of S group vaccinated per unit time).

# calculate vax_nu as the NUMBER of vaccinations per timestep
# rather than a fraction/rate. See issue #198 for more details.
vax_nu <- vax_nu * mod_args[["population"]][["demography_vector"]]

So {odin} returns an error because we're removing susceptible at an order of magnitude too high rate, turning the ODEs into a system that can't be accurately solved.

I've pushed an edit to the {odin} model that I think fixes this, i.e. swapping removal rate from vax_rate[i]*S[i] to min(vax_rate[i],S[i]).

bahadzie commented 2 weeks ago

I'm getting a test failure when I tried swapping out model_default() with model_default_odin(). I've pushed a commit to see if this is the same behavior for others. devtools::test() output below.

Failure (test-model_default_odin.R:260:3): Default model: vaccination and stats. correctness Check on 'data[grepl("susceptible", data$compartment, fixed = TRUE), ]$value' failed: Element 661 is not >= 0 Backtrace: ▆

  1. └─checkmate::expect_numeric(...) at test-model_default_odin.R:260:3
  2. └─checkmate::makeExpectation(x, res, info, label)

[ FAIL 1 | WARN 0 | SKIP 0 | PASS 627 ]

bahadzie commented 2 weeks ago

I've pushed an edit to the {odin} model that I think fixes this, i.e. swapping removal rate from vax_rate[i]*S[i] to min(vax_rate[i],S[i]).

Don't think I understand why min?

We could try passing what the odin model expects instead of what the C++ implementation expects?

adamkucharski commented 2 weeks ago

Don't think I understand why min?

We could try passing what the odin model expects instead of what the C++ implementation expects?

The min is to ensure that we don't vaccinate more people than there are susceptibles remaining. Basically there are two main ways to implement vaccination: 1) vaccinate a relative % per day (i.e.vax_rate[i]*S[i]) or 2) vaccinate an absolute number per day (min(vax_rate[i],S[i]). Issue #198 points out that (2) is often more realistic, because we'll have a stock of vaccine we can distribute per day, rather than just aiming to reach a fixed % then stopping each day. So I'm happy to keep this new version for now - but can discuss possible extension options more with @alxsrobert

adamkucharski commented 2 weeks ago

I'm getting a test failure when I tried swapping out model_default() with model_default_odin(). I've pushed a commit to see if this is the same behavior for others. devtools::test() output below.

Looks like the issue is a very small amount of overshoot in the solver because min() used (i.e. it removes a large number of vaccinated individuals from S before the solver catches up to fact S is now slightly smaller). So we get a value of -3.5e-7 at the tail end of the epidemic.

Using a scaling like in issue #198 may fix this, but could introduce some solver ineffeciencies.

bahadzie commented 2 weeks ago

In the latest commit model_default_odin() now works with vector parameters. devtools::test() now passes the test below.

beta <- rnorm(10, 1.3 / 7, sd = 0.01)
sigma <- rnorm(10, 0.5, sd = 0.01)
gamma <- rnorm(10, 1 / 7, sd = 0.01)

test_that("Default odin model: infection parameters as vectors", {
adamkucharski commented 2 weeks ago

Thanks! I've run through the vignettes using a version of the package with the odin model (i.e. model_default = model_default_odin in model_default.R). Here's a summary of checks on which ones ran and produced figures that are the same (based on visual check) as the original model vignettes:

Modelling interventions

Time dependence

Advanced modelling

Package vignettes

Other observations

adamkucharski commented 2 weeks ago

I also noticed that the odin multiplicative interventions weren't consistent with the additive spec in the design principles (see issue #255) so have pushed a fix (which actually has the result of simplifying the odin code – although will still need a bit more thought for the modelling_rate_interventions component).