starsimhub / starsim

Starsim disease modeling framework
http://starsim.org
MIT License
14 stars 8 forks source link

Unstable Vital Dynamics #695

Open daniel-klein opened 3 days ago

daniel-klein commented 3 days ago

Consider a simulation with the following demographics

dems = [
    ss.Births(pars=dict(birth_rate=20)),
    ss.Deaths(pars=dict(death_rate=20))
]

One might expect a stable population. However, the result is a dramatically declining population.

image

This result is for two reasons:

  1. Births takes the floor when deterministically adding agents. n_new = int(np.floor(sim.people.alive.count() * scaled_birth_prob))

  2. Module order matters. Because Births comes before Deaths in the list of dems, agents are added to the population (resulting in a larger population), and then Deaths applies the death rate to all agents, including the new agents. A 1% increase followed by a 1% decrease results in a net decline.

Switching the order of demographic module updates (and changing the Births floor to something like sc.randround) resolves the issue. Why? Because Deaths simply schedules agents for death via ti_death without actually removing them from the population. The Births module then adds agents considering the same denominator as used for deaths.

image

We cannot expect users to know they have to put Deaths before Births to get a stable population.

Next Steps:

Thanks to Deven for raising this issue in the context of SEIRD in TBsim.

cliffckerr commented 3 days ago

We had discussed this issue previously here: https://github.com/starsimhub/starsim/issues/303

I agree with the logic that newborns have a nonzero mortality risk, so this makes sense. It's like how the population replacement rate is closer to 2.1 than 2.0 births per woman -- nonintuitive, but correct.

I'm also not sure I'd call the drop "dramatic": take a look at this:

import starsim as ss
import matplotlib.pyplot as plt

dems = [
    ss.Births(pars=dict(birth_rate=20)),
    ss.Deaths(pars=dict(death_rate=20))
]

sim = ss.Sim(demographics=dems, dur=30)
sim.run()

res = sim.results
plt.plot(res.timevec, res.n_alive)
plt.ylim(bottom=0, top=10_000)
plt.show()

image

vgdeven commented 3 days ago

Just curious, why not have a separation of labor between - calculation/drawing of transitions (eg. n_births, n_deaths etc.) in one step and then application of transitions (eg., ΔState = n_births-n_deaths) in other? That way all transitions will based on the same denominator?

cliffckerr commented 2 days ago

Good question! Two main reasons:

  1. Since different modules can have different timesteps, there isn't a single point at which the deltas can be collected, and another point at which they can be applied. (Births and deaths could even use different timesteps, although that would probably not be a wise choice :))
  2. The model logic is intended so that under "standard" conditions, things can take effect immediately. For example, on a single timestep, a person can be tested, test positive, be scheduled for treatment, receive treatment, and then be less likely to infect someone else (or die etc). If we collected all the deltas and then modified the states after, then each step would introduce a 1-step delay (e.g. if they are tested on day t, they couldn't start treatment until day t+1, and then this wouldn't impact their probability of death/infectiousness until day t+2).