InstituteforDiseaseModeling / covasim

COVID-19 Agent-based Simulator (Covasim): a model for exploring coronavirus dynamics and interventions
https://covasim.org
MIT License
255 stars 224 forks source link

Vaccination affects random seed #373

Closed cmerenstein closed 3 years ago

cmerenstein commented 3 years ago

Describe the bug Hello,

I am trying to run multiple covasim simulations with different vaccination schemes to test their impact. I would like each simulation to be identical up to the point when vaccinations begin, at which point I would expect them to diverge. I set the seed for the simulations to be the same, but they do not progress in identical manner even before the vaccinations start.

How can I get each simulation to be completely identical except for who gets vaccinated?

To reproduce

I made a colab notebook that can reproduce, or the code below should work as well. https://colab.research.google.com/drive/1ugfrWu1sz_2xQ5E6ThyxPMdM6WIPU-aG?usp=sharing

For model issues:


import numpy as np
import covasim as cv
import sciris as sc

def prioritize_by_age(people):
    return np.argsort(-people.age)

def rand_priority(people):
    return np.random.permutation(people.uid)

def inorder_priority(people):
    return people.uid

def reverse_priority(people):
    return np.argsort(-people.uid)

np.random.seed(19143)

pars = {
    'beta': 0.015,
    'n_days': 120,
}

## make dictionary to match name to priority
name_key = {"random": rand_priority, "age":prioritize_by_age,
            "inorder":inorder_priority, "reverse":reverse_priority}
sims = []

## try simulations with a variety of vax schemes
for vax in ["age", "age", "random", "random"]: ## run each one twice, to make sure it's internally consistent

  ## get the priority function
  intervention = name_key[vax]

  ## get the vax function
  pfizer = cv.vaccinate_num(vaccine='pfizer', num_doses={20:5000},
                                      sequence = intervention, 
                                      line_args = sc.mergedicts(dict(linestyle='--', c='#bbbbbb', lw=1.0)))
  ## Make simulation
  sim = cv.Sim(
      use_waning=True,
      pars=pars,
      label = vax, ## label with vax name
      interventions=pfizer,
      rand_seed = 19143
  )
  ## Set sim seed
  sim.set_seed(19143)
  sims.append(sim)

## Run simulations
for sim in sims:
  sim.run(verbose = False)
  print("Vax priority: ", sim.label)
  print("exposed, first 10 days: ", [sum(sim.people.date_exposed < x) for x in range(0, 10)])
  print("vax, first 10 days: ", [sum(sim.people.date_vaccinated < x) for x in range(0, 10)])

  print("\n ------------------------------------\n")

Expected behavior

I expected that each simulation would have identical numbers of people exposed on each of the first 10 days, because the vaccination doesn't start until day 20. This was not the case though, we see that a different number of people are infected between the "random" and "age" simulations, even though there should be no difference between the two simulations in days 0-10.

Screenshots or outputs Here is an image of the output that I get when running the above code.

image

Platform (please complete the following information): Google Colab

cliffckerr commented 3 years ago

Thanks for noting this @cmerenstein ! We'll take a look in the next couple days.

cliffckerr commented 3 years ago

Hi @cmerenstein , apologies for the slow reply! It turns out there's a simple explanation for this -- using sim.people.date_exposed is not reliable when you have waning immunity, since if a person becomes reinfected, it refers to the most recent date of exposure, not the first, which means early dates of exposure can get overwritten. If instead of

  print("exposed, first 10 days: ", [sum(sim.people.date_exposed < x) for x in range(0, 10)])

you use

  print("exposed, first 10 days: ", [sim.results.cum_infections[x] for x in range(0, 10)])

then it should work as expected. Please let us know if you're still having issues!

cmerenstein commented 2 years ago

Thanks for the help!

This does seem to be the issue, but cum_infections doesn't really have all the same information I'm looking for. In the minimal example I posted it doesn't matter much who is getting infected, just the #, but in my actual use case the population is split into different clusters, and I want to know cumulative infections within each cluster, which is why I needed to get the result from sim.people.

Is there some way to see the first time a person was infected, or all of the dates at which a person was infected, that won't be changed by later infections?

Thanks again!