tapios / risk-networks

Code for risk networks: a blend of compartmental models, graphs, data assimilation and semi-supervised learning

10^5 network setup #190

Closed odunbar closed 3 years ago

odunbar commented 3 years ago

Adds features to enable the 10^5 network to run in feasible timescales

  1. Run script run_intervention_scenario_parallel.sh, for an epidemic with social distancing imposed at t = 18 days, a duration of 90 days, and 5-day DA windows.
  2. Epidemic data storage deletes any contact networks older than da_window + prediction_window days.
  3. Closure is evaluated with a (numba-boosted) loop over the nonzero elements of master_eqn_ensemble.L (a.k.a. w_ij), so no 10^5 x 10^5 (80 GB) matrices are built. NB: we use scipy.sparse.coo_matrix() for this construction and afterwards convert to scipy.sparse.csr_matrix(), as recommended by the scipy.sparse docs.
  4. Plotting (epidemic.png and epidemic_and_master_eqn.png), plus saving the ensemble mean state and the full kinetic state to .npy files at regular intervals.
  5. Run scripts for the paper's experiments: run_expt_userXXX_XXXsensors.sh
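
To illustrate item 2, here is a minimal sketch of a retention rule that drops contact networks older than da_window + prediction_window days. The class `ContactNetworkStore`, its methods, and the 1-day prediction window are hypothetical stand-ins, not the repository's actual storage code. The 5-day DA window and the 1/8-day stage length come from the description.

```python
class ContactNetworkStore:
    """Hypothetical store of contact networks keyed by start time (days)."""

    def __init__(self, da_window, prediction_window):
        # Networks older than this many days are no longer needed.
        self.retention = da_window + prediction_window
        self.networks = {}

    def save(self, time, network):
        self.networks[time] = network

    def prune(self, current_time):
        """Delete every network older than the retention period."""
        cutoff = current_time - self.retention
        stale = [t for t in self.networks if t < cutoff]
        for t in stale:
            del self.networks[t]
        return stale


# 10 simulated days in 1/8-day stages; prediction_window=1.0 is an assumption.
store = ContactNetworkStore(da_window=5.0, prediction_window=1.0)
for step in range(80):
    t = step * 0.125
    store.save(t, f"network@{t}")  # placeholder for a real network object
    store.prune(current_time=t)
```

Pruning after every save keeps the number of stored networks bounded by the retention period rather than by the length of the run.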
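
The idea in item 3 can be sketched as follows, assuming the closure takes the form w_ij times an ensemble average of S_i I_j. That form, the function `edge_closure`, and the arrays `S` and `I` are illustrative assumptions, not the PR's code, and the loop is plain Python rather than numba-compiled. The point is that work and memory scale with the number of stored edges, not with n^2.

```python
import numpy as np
from scipy import sparse


def edge_closure(L, S, I):
    """Evaluate w_ij * <S_i I_j> on the stored entries of L only.

    L    : scipy.sparse matrix (n x n) of weights w_ij
    S, I : arrays of shape (n_ensemble, n), hypothetical per-member states
    """
    coo = L.tocoo()  # exposes (row, col, data) triplets
    values = np.empty(coo.nnz)
    # One pass over the nonzero entries; no dense n x n array is formed.
    for k in range(coo.nnz):
        i, j = coo.row[k], coo.col[k]
        values[k] = coo.data[k] * np.mean(S[:, i] * I[:, j])
    # Assemble in COO format, then convert to CSR for fast arithmetic.
    closure = sparse.coo_matrix((values, (coo.row, coo.col)), shape=L.shape)
    return closure.tocsr()


# Tiny 4-node example with three weighted contacts.
rows = np.array([0, 1, 3])
cols = np.array([1, 2, 0])
w = np.array([0.5, 2.0, 1.0])
L = sparse.coo_matrix((w, (rows, cols)), shape=(4, 4)).tocsr()

rng = np.random.default_rng(0)
S = rng.random((10, 4))  # 10 ensemble members, 4 nodes
I = 1.0 - S

C = edge_closure(L, S, I)
```

COO suits incremental construction from (row, col, value) triplets, while CSR is the better format for the arithmetic and matrix-vector products that follow.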

Current times for a single stage of the contact network, 1/8th of a day (across 32 cores):

Other fixes:

co-authored with @glwagner

dburov190 commented 3 years ago

Hey guys, I reviewed the commit -- well, as much as I could... I didn't look at the various sbatch scripts, and didn't look at joint_epidemic_assimilation.py because, honestly, it's a mess now and it would take me a while to check everything carefully. Also, I didn't look into the ENKF file -- I'm trusting you guys!

So, only cosmetic changes from me, basically removing redundant comments and fixing style here and there. There was one thing that might warrant discussion: I changed the state_size in the EnsembleTimeSeries to n_vector; I didn't really like the state_size since we're not only using that container for states, we're also using it for other things, so it doesn't reflect the purpose. Given that Greg and Ollie didn't like n_array and n_data, I thought that maybe n_vector is okay; but this is open for discussion. Also, changed the update_batch in the same file to n_roll_at_once as that seems more descriptive (batches are used in a different context in assimilator, so I was a bit confused at first).
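
For readers unfamiliar with the container, here is a rough sketch of what the names n_vector and n_roll_at_once suggest: a fixed-length time series that, once full, discards several of its oldest entries in one shift. The class `RollingEnsembleSeries` and its layout are guesses for illustration only, not the actual EnsembleTimeSeries implementation.

```python
import numpy as np


class RollingEnsembleSeries:
    """Hypothetical fixed-length series of (n_ensemble, n_vector) snapshots."""

    def __init__(self, n_ensemble, n_vector, n_steps, n_roll_at_once=1):
        # Assumes 1 <= n_roll_at_once <= n_steps.
        self.container = np.empty((n_ensemble, n_vector, n_steps))
        self.n_steps = n_steps
        self.n_roll_at_once = n_roll_at_once
        self.end = 0  # number of filled time slots

    def push_back(self, snapshot):
        if self.end == self.n_steps:
            # Buffer full: drop the oldest n_roll_at_once slots in one shift.
            k = self.n_roll_at_once
            self.container[:, :, :-k] = self.container[:, :, k:].copy()
            self.end -= k
        self.container[:, :, self.end] = snapshot
        self.end += 1


series = RollingEnsembleSeries(n_ensemble=2, n_vector=3, n_steps=4,
                               n_roll_at_once=2)
for value in range(6):
    series.push_back(np.full((2, 3), float(value)))

history = series.container[0, 0, :series.end].tolist()
```

Rolling several slots at once means the shift happens once every n_roll_at_once pushes instead of on every push after the buffer fills.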

It's a go from me! Let's merge this one ugly... PR, and move on! :D