tapios / risk-networks

Code for risk networks: a blend of compartmental models, graphs, data assimilation and semi-supervised learning
Other
2 stars 2 forks source link

New ContactSimulator #88

Closed glwagner closed 4 years ago

glwagner commented 4 years ago

This PR introduces a new ContactSimulator.

The new contact simulator performs independent simulations for each contact. This requires a Gillespie algorithm that is valid for time varying rates. This is implemented using theory and a numerical algorithm inspired by

Christian L. Vestergaard , Mathieu Génois, "Temporal Gillespie Algorithm: Fast Simulation of Contagion Processes on Time-Varying Networks", PLOS Computational Biology (2015)

https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004579

The algorithm solves an integral problem over the diurnal variation in the inception rate to determine the "waiting time", or time-to-inception in terms of a 'normalized step' drawn from the exponential distribution with unit rate.

In addition, and as a consequence of the new algorithm, we can now support independent inception rates for each contact. However, generic support of this functionality requires more than simply a new Gillespie algorithm.

The API implemented in this PR allows the user to set node-wise inception rates via the contact_network's node attributes

which are then used to calculate the contact specific inception rates prior to the simulation of contact durations over each static contact interval. In the overleaf these are called lambda_{i, max} and lambda_{i, min}.

This PR implements an algorithm for tracking the state of each contact during changes to the contact network's edges as a result of hospitalization. There are four state variables: active_contacts, event_time, overshoot_duration, and contact_duration. These four containers must be np.array to work with the numba'd contact simulation algorithm; therefore we cannot easily add or remove their entries when corresponding edges are added or removed due to hospitalization.

To solve this problem we use numba's TypedDict with keys corresponding to edge id as a reference ("edge_state") to identify edges with their current state. We can easily add or delete edges from this dictionary reference; and because TypedDict is ordered, we uniquely identify edges with array indices. The state variable arrays event_time, overshoot_duration, and contact_duration are re-initialized prior to runs that are affected by hospitalization, using a loop over all edges in the edge_state dictionary.

The state variable arrays active_contacts, event_time, overshoot_duration, and contact_duration are not re-allocated every run; instead, they are initialized with a buffer that must ensure they are large enough no matter how many contacts/edges are added to the contact network.

Changes may be needed to existing simulation examples. Currently, examples/simple_epidemic.py runs. examples/contact_simulation_example.py demonstrates the new contact network in action.

examples/changing_max_inception_rate.py demonstrates how. to change the node-specific inception rate. In that example, the line

nx.set_node_attributes(contact_network, values=5, name="day_inception_rate")

changes the daytime contact inception rate (lambda_max) on every node from its initial value (40) to 5. This occurs on day 5 of the contact simulation. The example produces this plot:

image

This PR also nukes many stale or redundant examples.

I'm curious how much faster or slower the simulations in this PR are relative to existing examples. While the Gillespie simulation is far faster than before, the pre-processing necessary to support hospitalization may slow things down somewhat. Preliminary tests suggest the slow down is not severe, however. But everything must be tested "in the wild."

Resolves #86 Resolves #61

lubo93 commented 4 years ago

Thanks a lot! I modified the NYC (no intervention) example and observed the following error:

image

This KeyError is similar to the one that also occurred in "Gillespie simple contagion" and which we solved with:

contact_network = nx.convert_node_labels_to_integers(contact_network)

Did something change in the definition of spontaneous/induced transition lists (maybe as a result of the changes in "contact simulator")?

The following seeds were used:

`seed = 2132

np.random.seed(seed) random.seed(seed)

seed_numba_random_state(seed)`

I also attach the relevant simulation file.

simulate_idealized_NYC_epidemic_nointervention.py.zip

glwagner commented 4 years ago

@lubo93 thanks for checking this out. I think I may have fixed the problem -- I was confused about whether we simulate the contacts of those who are deceased. We do in fact have to do this. I believe this is the source of your error (the KeyError is associated with a missing edge, probably because the edge weight was not set). Let me know if I have fixed your problem.

Is if the problem is what you're saying, then can you solve it within the script, without touching the source? The contact network is defined outside of the contact simulator or the epidemic simulator.

glwagner commented 4 years ago

btw there is a function for seeding all three random states with one seed:

https://github.com/dburov190/risk-networks/blob/9b82dabf263c82633c65870b509b76e1e03b4c87/epiforecast/utilities.py#L11

glwagner commented 4 years ago

@lubo93 feel free to commit your changes to this PR!

lubo93 commented 4 years ago

@lubo93 thanks for checking this out. I think I may have fixed the problem -- I was confused about whether we simulate the contacts of those who are deceased. We do in fact have to do this. I believe this is the source of your error (the KeyError is associated with a missing edge, probably because the edge weight was not set). Let me know if I have fixed your problem.

Is if the problem is what you're saying, then can you solve it within the script, without touching the source? The contact network is defined outside of the contact simulator or the epidemic simulator.

Ok, I pushed my example (..._nointervention.py) and a new plotting script. The previous KeyError didn't occur anymore. Now I get the following error:

image

and another one:

image

Maybe some edges aren't updated?

glwagner commented 4 years ago

@lubo93 can we combine the NYC examples? I think it would be easier to understand what these scripts do by adding a flag intervention = True (or False), along with an if statement that determines whether a social distancing intervention is used or not. Currently both scripts are long but have almost identical code.

lubo93 commented 4 years ago

@lubo93 can we combine the NYC examples? I think it would be easier to understand what these scripts do by adding a flag intervention = True (or False), along with an if statement that determines whether a social distancing intervention is used or not. Currently both scripts are long but have almost identical code.

Yes, let's do that after finalizing the example? I still wanted to check if the examples work well for larger networks.

glwagner commented 4 years ago

@lubo93 it will save us some work, however, if we only have to update one of the scripts to match the new syntax.

I think I understand the issue. I suspect it occurs when two patients connected to one another are admitted. In this case, the algorithm attempts to sever the same contact twice.

We can simply pass this error --- if an edge does not exist, there's no need to remove it...

glwagner commented 4 years ago

@lubo93 I have a fix for the problem. It is brute force: I pass the error. I am also normalizing the edges now so that they are unique.

The latest error occurred when two connected patients became hospitalized. The HealthService gives us community contacts to remove, but these do not exist when they connect to a current patient.

I think we want a more robust solution to this problem for the future. I will raise an issue when we merge and we can discuss. For now, I think the simulations should run without error.

glwagner commented 4 years ago

As a side note, the contact simulation is about half the cost of the kinetic model simulation now. This is not negligible but may be better than before.

lubo93 commented 4 years ago

What shall we use for "start_run" in EpidemicSimulator? It's currently not defined:

print("\n(Epidemic simulation took {:.3f} seconds.)\n".format(end_run-start_run))

lubo93 commented 4 years ago

Ok, I added "start_run" in epiforecast. Time to merge. Thanks!