HopkinsIDD / flepiMoP

The Flexible Epidemic Modeling Pipeline
https://flepimop.org
GNU General Public License v3.0
9 stars 4 forks source link

[Bug]: Specifying a transition proportional to a sum of states only uses one of states included in the sum #299

Open twallema opened 1 month ago

twallema commented 1 month ago

Label

bug, gempyor, invalid

Priority Label

high priority

Describe the bug/issue

Specifying a transition proportional to a sum of states

To specify a transition from disease state 'X1' --> 'X2' proportional to the source and a sum of states 'Y1' and 'Y2', the following syntax can be placed in the config files,

  transitions:
    - source: ["X1"]
      destination: ["X2"]
      rate: ["beta"]
      proportional_to: ["source", ["Y1","Y2"]]
      proportion_exponent: ["1", "1"]

This syntax is useful in the FOI, for instance X1 = S, X2 = E, Y1 = I and Y2 = I_v (vaccination as a disease state).

See: https://iddynamics.gitbook.io/flepimop/gempyor/model-implementation/compartmental-model-structure

Testing the syntax

SIR model

Just your regular ol' SIR model with beta = 0.75 and gamma = 1/T_I = 1/5 = 0.2, resulting in R0 = 3.75. I used 333M inhabitants, of which 1000 infected and 1 recovered individual as the initial condition. The expected final epidemic size R_{inf} for an SIR model with an R0 = 3.75 is equal to 97.4%.

name: SIR
setup_name: USA
start_date: 2023-09-01
end_date: 2024-01-01
nslots: 1

subpop_setup:
  geodata: model_input/data/geodata_2019_statelevel.csv
  mobility: model_input/data/mobility_2011-2015_statelevel.csv

initial_conditions:
  method: plugin
  plugin_file_path: model_input/initial_conditions_SIR.py

seir:
  integration:
    method: rk4
    dt: 1.0
  parameters:
    beta:
      value: 0.75
    T_I:
      value: 5

  transitions:
    - source: ["S"]
      destination: ["I"]
      rate: ["beta"]
      proportional_to: ["S", "I"]
      proportion_exponent: ["1", "1"]
    - source: ["I"]
      destination: ["R"]
      rate: ["1/T_I"]
      proportional_to: ["I"]
      proportion_exponent: ["1"]

with initial_conditions_SIR.py,

import gempyor.initial_conditions
import numpy as np

class InitialConditions(gempyor.initial_conditions.InitialConditions):
    def get_from_config(self, sim_id: int, modinf) -> np.ndarray:

        # input: initial condition
        states = ["S", "I", "R"]
        n_0 = np.array([333.3e6-1e3-1, 1e3, 1])
        f = n_0/sum(n_0)

        # pre-allocate initial condition
        y0 = np.zeros((modinf.compartments.compartments.shape[0], modinf.nsubpops))

        # find indices of states we need to fill out
        indices = [modinf.compartments.get_comp_idx({"infection_stage": state}) for state in states]

        # output: initial susceptible and infected
        for j, idx in enumerate(indices):
            y0[idx, :] = modinf.subpop_pop * f[j]

        return y0

SI2R model

We contrast the SIR model with an SI2R model, where we apply the linear chain trick to the I compartment. The SI2R model differs from the SIR by splitting the I in I1 and I2 compartments with half the residence time as compared to the SIR's I compartment in each (2*gamma=0.4). Both I1 and I2 can infect the S with equal probability beta. I used 333M inhabitants, of which 500 in I1 and 500 in I2, along with 1 recovered individual as the initial condition. These models should have an identical outcome in a deterministic simulation, as only the distribution of the residence time in the I = I1 + I2 compartments are altered, and not the expected residence time. The FOI for this model is,

dS / dt = beta * S * (I1 + I2).

Implemented as (transitions only),

  transitions:
    - source: ["S"]
      destination: ["I1"]
      rate: ["beta"]
      proportional_to: ["source", ["I1","I2"]]
      proportion_exponent: ["1", "1"]
    - source: ["I1"]
      destination: ["I2"]
      rate: ["2/T_I"]
      proportional_to: ["source"]
      proportion_exponent: ["1"]
    - source: ["I2"]
      destination: ["R"]
      rate: ["2/T_I"]
      proportional_to: ["source"]
      proportion_exponent: ["1"]

Results

chain_trick

Top: Simulation of the SIR model; behavior is as expected. Dashed black line: expected final epidemic size in an SIR model for R0 = 3.75.

Bottom: Simulation of the SI2R model; behavior not as expected. Dashed black line: expected final epidemic size in an SIR model for R0 = 3.75. Dotted black line: expected final epidemic size in an SIR model for R0 = 0.5*3.75.

The SI2R model behaving as if the reproduction number is half of what is supposed to be as a clear indication that the FOI is computed as,

dS / dt = beta * S * I1

or,

dS / dt = beta * S * I2

instead of,

dS / dt = beta * S * (I1 + I2)

Are there any obvious mistakes in the config files? If not, we'll have to address this.

To Reproduce

reproduce_error.zip

Environment, if relevant

Mac OS, FLEPIMOP-ENV, main GitHub branch

twallema commented 1 month ago

Confirming that this issue is fixed by adding two pairs of brackets around the summed compartments "I1" and "I2", i.e. changing,

  transitions:
    - source: ["S"]
      destination: ["I1"]
      rate: ["beta"]
      proportional_to: ["source", ["I1","I2"]]
      proportion_exponent: ["1", "1"]

into,

  transitions:
    - source: ["S"]
      destination: ["I1"]
      rate: ["beta"]
      proportional_to: ["source", [[["I1","I2"]]]]
      proportion_exponent: ["1", "1"]

yields,

chain_trick

Further crossreffing issues #298 and #301 for this model,

    - source: ["I1"]
      destination: ["I2"]
      rate: ["2/T_I"]
      proportional_to: ["I1"]  # --> will not work; must be equal to "source": disease states with length > 1 in `proportional_to` get cut off by gempyor.
      proportion_exponent: ["1"]
twallema commented 1 month ago

reproduce_error_fixed.zip

twallema commented 1 month ago

@saraloo To reiterate for anyone following along - the brackets should be (this syntax is so confusing to describe haha but hopefully/maybe this makes sense??):

proportional_to: 
[
[source rates],
[destination rates]
]

breaks down into

proportional_to: 
[
[source rates],
[[level of stratification defined in compartments]]
]

breaking down again into

proportional_to: 
[
[source rates],
[[[sum over these compartments within that stratification]]]
]

and the rule that brackets are required if there is no broadcasting, but no brackets if rates are broadcast, applies here (this is probably very confusing but tldr; there should be 3 levels of brackets if summing within a compartment stratification level, even if there is just the one comparment stratification level)