CDCgov / cfa-viral-lineage-model

Apache License 2.0

YAML-run pipeline, part 1 (modeling/eval) #25

Closed thanasibakis closed 2 months ago

thanasibakis commented 2 months ago

Major changes

linmod.models

NumPyro model functions are now contained in classes, alongside the corresponding model-specific data preprocessing and sample postprocessing. This builds on Scott's work in #14; it was easier to reimplement that work here than to rebase it onto main.

These model classes have three methods:

exploration/demo/ --> present_day_forecasting/

The demo has become fairly sophisticated over time and is now close enough to the tooling we are aiming for that it can move to the root of the repo.

Because of the refactoring done here for linmod.models, I think we're able to merge the fitting scripts for all models, as well as evaluation and visualization scripts, into one (not so large) main.py; feel free to disagree with me here, though!

This main.py naturally has some behaviors we want to configure, so it accepts a path to a YAML file as a command-line argument. I haven't added any handling of defaults, so every key must be present in this YAML file.
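For illustration only, here is a minimal sketch of what "a YAML path on the command line, with no defaults" could look like. This is not the code in main.py: the key names in REQUIRED_KEYS and the function names are hypothetical, and the sketch assumes PyYAML is used to read the file.

```python
import argparse
from typing import Any, Mapping, Sequence

# Hypothetical key names, for illustration only. The real keys are whatever
# present_day_forecasting/main.py reads from its YAML file.
REQUIRED_KEYS = ("data_path", "models", "output_dir")


def validate_config(
    config: Mapping[str, Any], required: Sequence[str] = REQUIRED_KEYS
) -> dict:
    """Return the config as a dict, raising if any required key is absent."""
    missing = [key for key in required if key not in config]
    if missing:
        raise KeyError(f"Config is missing required keys: {missing}")
    return dict(config)


def load_config(path: str) -> dict:
    """Read a YAML file and validate it (assumes PyYAML is installed)."""
    import yaml  # imported lazily so validate_config works without PyYAML

    with open(path) as f:
        parsed = yaml.safe_load(f)
    if not isinstance(parsed, dict):
        raise ValueError("The top level of the YAML file must be a mapping")
    return validate_config(parsed)


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the single positional argument: the path to the YAML file."""
    parser = argparse.ArgumentParser(description="Run the pipeline from a YAML config")
    parser.add_argument("config", help="Path to the YAML configuration file")
    return parser.parse_args(argv)


def main(argv=None) -> dict:
    """Entry point: parse the config path, then load and validate the file."""
    args = parse_args(argv)
    return load_config(args.config)
```

Failing fast on a missing key matches the "no defaults" behavior described above: a typo in the YAML surfaces at startup rather than partway through a model fit.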

See the present_day_forecasting/README.md for how to run it.

afmagee42 commented 2 months ago

One assumption that we've been hard-coding and need to make configurable is that we want to model all lineages. (I will accept this being ruled out of scope for this PR, but for the July data I'm playing with, the hierarchical model shows pretty clearly that most of the lineages have negligible proportions.)
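As a rough sketch of one way this could be made configurable (an assumption, not something already in the repo): keep lineages whose overall share of sequences is at least a threshold, and pool the rest under a single label. The function name, the threshold parameter, and the "other" label are all hypothetical.

```python
from collections import Counter
from typing import Mapping


def lump_rare_lineages(
    counts: Mapping[str, int],
    min_proportion: float = 0.01,  # hypothetical config value
    other_label: str = "other",    # hypothetical pooled-category name
) -> dict:
    """Pool lineages whose share of all sequences is below min_proportion."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must contain at least one sequence")

    lumped = Counter()
    for lineage, count in counts.items():
        # Keep the lineage's own name only if it clears the threshold
        key = lineage if count / total >= min_proportion else other_label
        lumped[key] += count
    return dict(lumped)
```

With placeholder lineage names, lump_rare_lineages({"A": 90, "B": 9, "C": 1}, min_proportion=0.05) keeps A and B and pools C into "other". The total count is unchanged, so the proportions still sum to one.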