tapios / risk-networks

Code for risk networks: a blend of compartmental models, graphs, data assimilation and semi-supervised learning

Epidemic Data Storage with example #110

Closed odunbar closed 4 years ago

odunbar commented 4 years ago

Regarding 'decoupling the data assimilation loop' in Issue #89 and Issue #83:

I provide two classes and a simple example for storing and accessing contact networks and status data by start or end time. This should greatly simplify the indexing required in DA schemes.

In epidemic_data_storage.py:

1) The StaticIntervalData class stores a static contact network, a start time, and an end time; one can also set data (a statuses dictionary) at the start time or end time.
2) The StaticIntervalDataSeries class stores a dictionary of StaticIntervalData objects; one can access any of them by providing a start time or end time.
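To make the description above concrete, here is a minimal sketch of how two such classes could fit together. It is not the code in epidemic_data_storage.py: the constructor arguments, the `set_start_statuses`/`set_end_statuses` and `add` methods, and the use of a plain edge list as the contact network are all assumptions made for illustration. Only the class names and the two `get_network_by_*` lookups come from this PR.

```python
class StaticIntervalData:
    """Hypothetical sketch: one static contact network valid on [start_time, end_time]."""

    def __init__(self, contact_network, start_time, end_time):
        if end_time <= start_time:
            raise ValueError("end_time must be greater than start_time")
        self.contact_network = contact_network
        self.start_time = start_time
        self.end_time = end_time
        # Statuses are optional and can be attached at either end of the interval.
        self.start_statuses = None
        self.end_statuses = None

    def set_start_statuses(self, statuses):
        # Copy so later changes to the caller's dictionary do not alter stored data.
        self.start_statuses = dict(statuses)

    def set_end_statuses(self, statuses):
        self.end_statuses = dict(statuses)


class StaticIntervalDataSeries:
    """Hypothetical sketch: intervals indexed by both start time and end time."""

    def __init__(self):
        self._by_start = {}
        self._by_end = {}

    def add(self, interval):
        # The same object is registered under both of its endpoints.
        self._by_start[interval.start_time] = interval
        self._by_end[interval.end_time] = interval

    def get_network_by_start_time(self, start_time):
        return self._by_start[start_time].contact_network

    def get_network_by_end_time(self, end_time):
        return self._by_end[end_time].contact_network


# Usage with a placeholder edge list standing in for a contact network.
series = StaticIntervalDataSeries()
interval = StaticIntervalData([(0, 1), (1, 2)], start_time=0.0, end_time=0.125)
interval.set_start_statuses({0: "I", 1: "S", 2: "S"})
series.add(interval)
```

One design point worth noting in a sketch like this: the lookups use exact dictionary keys, so the times passed in must match the stored floats exactly. Times accumulated by repeated addition may need rounding before being used as keys.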

In saving_contact_networks.py I have a simple save-and-load scenario for a series of networks and statuses.

To demonstrate its function: assume we have saved a series, epidemic_data_storage, and we are interested in loading a static network, ab_network, that is fixed over the time interval [a,b]. When running forwards in time, our current simulation time on loading this network will be time a. We can load our network with: ab_network = epidemic_data_storage.get_network_by_start_time(a)

If, however, we were running backwards in time, our current simulation time on loading this network would be time b. We can then load our network with: ab_network = epidemic_data_storage.get_network_by_end_time(b)
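As a rough illustration of why the two lookups help, the sketch below walks a series of intervals forwards and then backwards using only the current simulation time. `IntervalStore`, its constructor, and the `end_time_for_start`/`start_time_for_end` helpers are hypothetical stand-ins written for this example; only the names `get_network_by_start_time` and `get_network_by_end_time` come from this PR. Networks are represented by string labels.

```python
class IntervalStore:
    """Hypothetical stand-in for a saved series of static-network intervals."""

    def __init__(self, intervals):
        # intervals: list of (start_time, end_time, network) tuples.
        self._by_start = {start: (end, net) for start, end, net in intervals}
        self._by_end = {end: (start, net) for start, end, net in intervals}

    def get_network_by_start_time(self, start_time):
        return self._by_start[start_time][1]

    def get_network_by_end_time(self, end_time):
        return self._by_end[end_time][1]

    def end_time_for_start(self, start_time):
        # Helper assumed for this sketch: where a forward step lands.
        return self._by_start[start_time][0]

    def start_time_for_end(self, end_time):
        # Helper assumed for this sketch: where a backward step lands.
        return self._by_end[end_time][0]


def walk_forward(store, time, n_intervals):
    """Collect networks while stepping forward; `time` is always an interval start."""
    visited = []
    for _ in range(n_intervals):
        visited.append(store.get_network_by_start_time(time))
        time = store.end_time_for_start(time)
    return visited, time


def walk_backward(store, time, n_intervals):
    """Collect networks while stepping backward; `time` is always an interval end."""
    visited = []
    for _ in range(n_intervals):
        visited.append(store.get_network_by_end_time(time))
        time = store.start_time_for_end(time)
    return visited, time


# Three consecutive 3-hour windows, with times in days (0.125 day = 3 hours).
store = IntervalStore([
    (0.0, 0.125, "net_0"),
    (0.125, 0.25, "net_1"),
    (0.25, 0.375, "net_2"),
])
forward_networks, final_time = walk_forward(store, 0.0, 3)
backward_networks, initial_time = walk_backward(store, final_time, 3)
```

In both directions the caller never computes an interval index; the current time alone selects the network.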

More recently I have added an example, saving_contact_networks_kinetic_and_master.py, which first runs an epidemic and saves the data, then loads the data and runs the risk simulator.

(For completeness, I also have an example, saving_contact_networks_ensemble_kinetic_and_master.py, which runs an ensemble of epidemics and ensembles of risk simulations (one for each epidemic), then plots the mean epidemic and mean risk; this is merely to reproduce tests requested in the PR comments.)

odunbar commented 4 years ago

Added a new example where we first run an epidemic for 30 days and save the network information, then run the master equations forward for 30 days. Results: kinetic_and_master

tapios commented 4 years ago

Looks pretty good. Could you please run an ensemble of kinetic simulations, to make sure the ensemble mean tracks the ME solution reasonably well?

odunbar commented 4 years ago

@tapios This actually is not going to be so easy, as different realizations of the kinetic model will lead to different hospitalizations and therefore different contact networks. We can try running an epidemic without the health_service and then it makes more sense to average over different samples.

tapios commented 4 years ago

> @tapios This actually is not going to be so easy, as different realizations of the kinetic model will lead to different hospitalizations and therefore different contact networks. We can try running an epidemic without the health_service and then it makes more sense to average over different samples.

In that case, running parallel kinetic and master equation ensembles would make sense. Presumably, the noise in the master equations (from the network) will be much less.

odunbar commented 4 years ago

kinetic_and_master_ensemble

A poor man's version of the experiment you have requested... but anyway after 4 runs (same parameters and ICs), we can see how large the kinetic model variances are. The master equations seem more robust as expected.

tapios commented 4 years ago

Might be good to average the ME and kinetic ensembles. Hard to see much on these plots. It looks like infections in the kinetic simulations rise more slowly—is this perhaps because of the closure issues? (Is there any closure beyond the mean-field approximation in the ME model in these simulations?)

odunbar commented 4 years ago

@tapios (also @agarbuno, if you are interested): this is the plot of the mean values over 10 ensemble runs of 100 days each for the kinetic (X) and master equations (--), for the 1000-node model. There are discrepancies, but I think these deviations are present even without the save-data feature of this PR. We will need to test further, but I feel that is outside the scope of this PR. mean_kinetic_and_master

PS: On my laptop it takes perhaps 25 minutes per 100-day run (epidemic_model + master equations). PPS: This looks identical without the closure too. PPPS: Here is the same experiment with a 6-hour static contacts window instead of 3. mean_kinetic_and_master
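For reference, the averaging behind a plot like this can be sketched as a pointwise mean (and variance) across ensemble members at each output time. The function names and the tiny placeholder trajectories below are made up for illustration; they are not the script used here and not results from these runs.

```python
from statistics import fmean, pvariance


def _check_runs(runs):
    # Every ensemble member must report the same number of time points.
    if not runs:
        raise ValueError("need at least one run")
    length = len(runs[0])
    if any(len(run) != length for run in runs):
        raise ValueError("all runs must have the same number of time points")


def ensemble_mean(runs):
    """Pointwise mean over ensemble members; runs[i][k] is run i at time index k."""
    _check_runs(runs)
    return [fmean(values) for values in zip(*runs)]


def ensemble_variance(runs):
    """Pointwise population variance over ensemble members."""
    _check_runs(runs)
    return [pvariance(values) for values in zip(*runs)]


# Placeholder trajectories (for example, infected counts at three output times).
placeholder_runs = [
    [1.0, 2.0, 4.0],
    [1.0, 4.0, 6.0],
]
mean_trajectory = ensemble_mean(placeholder_runs)
variance_trajectory = ensemble_variance(placeholder_runs)
```

Applying the same two functions to the kinetic ensemble and to the master equation ensemble would give directly comparable mean curves, and the variance indicates how noisy each ensemble is.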

odunbar commented 4 years ago

All requested changes are now resolved, as are the conflicts with #111. I shall merge now.