Potential discrepancy in epidemic sizes

pratikunterwegs commented 6 months ago

Pasted from Slack:

I’m trying to think through epidemic size expectations in model scenarios that differ only in their transmission rate (or R, if you prefer). I would expect that for an identical number of initial infections, the ratio of epidemic sizes at time t for two different transmission rates beta and beta2 would be ((1 + beta) / (1 + beta2))^t. I’m clearly missing something here, as this isn’t borne out in the reprex below. What am I missing?

Useful reference: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1766383/

library(epidemics)
contact_matrix <- matrix(1)
demography_vector <- c(10e6)

# make initial conditions - order is important
initial_conditions <- matrix(
  c(S = 1 - 1e-6, E = 0, I = 1e-6, R = 0, V = 0),
  nrow = 1, byrow = FALSE
)

# create a population
pop <- population(
  contact_matrix = contact_matrix,
  demography_vector = demography_vector,
  initial_conditions = initial_conditions
)

beta = 1.3 / 7
beta2 = 0.8 * (1.3 / 7)
time = 100

# with regular beta
data <- model_default(
  pop, transmission_rate = beta,
  infectiousness_rate = 1,
  time_end = 100
)
new_infections(data)
#> Key: <time, demography_group>
#>       time demography_group new_infections
#>      <num>           <char>          <num>
#>   1:     0     demo_group_1       0.000000
#>   2:     1     demo_group_1       1.769585
#>   3:     2     demo_group_1       1.722958
#>   4:     3     demo_group_1       1.750475
#>   5:     4     demo_group_1       1.803541
#>  ---                                      
#>  97:    96     demo_group_1      50.898345
#>  98:    97     demo_group_1      52.780868
#>  99:    98     demo_group_1      54.732965
#> 100:    99     demo_group_1      56.757202
#> 101:   100     demo_group_1      58.856242

# with 20% reduced beta
data2 <- model_default(
  pop, transmission_rate = beta2,
  infectiousness_rate = 1,
  time_end = 100
)
new_infections(data2)
#> Key: <time, demography_group>
#>       time demography_group new_infections
#>      <num>           <char>          <num>
#>   1:     0     demo_group_1       0.000000
#>   2:     1     demo_group_1       1.409427
#>   3:     2     demo_group_1       1.341663
#>   4:     3     demo_group_1       1.323585
#>   5:     4     demo_group_1       1.321985
#>  ---                                      
#>  97:    96     demo_group_1       2.083332
#>  98:    97     demo_group_1       2.093725
#>  99:    98     demo_group_1       2.104169
#> 100:    99     demo_group_1       2.114665
#> 101:   100     demo_group_1       2.125214

# ratio of epidemic sizes at time = 100
epidemic_size(data) / epidemic_size(data2)
#> No 'dead' compartment found in `data`; counting only 'recovered' individuals in the epidemic size.
#> No 'dead' compartment found in `data`; counting only 'recovered' individuals in the epidemic size.
#> demo_group_1 
#>      7.64958

# ratio of new infections
tail(new_infections(data)[["new_infections"]], 1) /
  tail(new_infections(data2)[["new_infections"]], 1)
#> [1] 27.69427

# the expected ratio of two exponential processes with different rates
# not similar to ratio of epidemic sizes
ratio = ((1 + beta) / (1 + beta2))^100
ratio
#> [1] 24.1104

^{Created on 2024-03-22 with reprex v2.0.2}

adamkucharski commented 6 months ago

I think the problem is that it's fiddlier to derive growth rate from R for SEIR models (equation 3.2 in above paper), and hence project forward with a simple (1+r)^n function.

Instead, I had a go at estimating empirical growth rate from the simulation and comparing with the SIR growth rate from above paper, and looks like theory and simulation match OK (noting I approximate the empirical calculation with an exponential). I also put some people in the E compartment to remove the initial transient dynamic and made $R$ larger to make dynamics clearer...

initial_conditions <- matrix(
  c(S = 1 - 2e-6, E = 1e-6, I = 1e-6, R = 0, V = 0),
  nrow = 1, byrow = FALSE
)

# .....

recovery_rate <- 1/7
infectiousness_rate <- 1/7
beta = 2*recovery_rate
beta2 = 0.8 * 2 * recovery_rate
time_max = 100

# .....

# convert reproduction number to SEIR growth rate (from Wallinga & Lipsitch)
convert_R_to_r <- function(R,infectiousness_rate=1/7,recovery_rate=1/7){
  0.5*(-(infectiousness_rate+recovery_rate)+sqrt((infectiousness_rate+recovery_rate)^2 - 4* infectiousness_rate*recovery_rate*(1-R)))
}

# function to calcualte empirical growth rate
empirical_growth_rate <- function(x){
  log(x[100]/x[51])/50
}

growth1_empirical <- empirical_growth_rate(new_infections(data)[["new_infections"]])
growth2_empirical <- empirical_growth_rate(new_infections(data2)[["new_infections"]])

growth1_empirical/growth2_empirical
#> 1.562317

# calculate daily growth rates
growth1 <- convert_R_to_r(2,recovery_rate)
growth2 <- convert_R_to_r(0.8*2,recovery_rate)

growth1/growth2
#> 1.563595

pratikunterwegs commented 6 months ago

Thanks a lot - that's really helpful! Keeping this issue open for future reference.

epiverse-trace / epidemics

Potential discrepancy in epidemic sizes #208