damonbayer opened 1 month ago
Here is what I am thinking for the necessary scripts (gray rectangles) and the outputs (blue rounded rectangles). Open to any feedback on this design. Scoring is probably not really part of the production pipeline, but I've included it anyway.
Some questions I can predict and answer:
What file format will we use for the tabular data? csv or parquet
Why is there a separate output of "all MCMC draws" as a tabular file? I have not found the R packages for working with netCDF data to be very friendly. I have not looked into using zarr yet.
Why not directly use the posterior MCMC draws for ArviZ diagnostics? We could, but I think it is probably more user-friendly to import a netCDF, and we don't want to create an additional netCDF after fitting the model.
Why use R at all? Since we have to use R to use scoringutils, we might as well take advantage of the CFAEpiNow2Pipeline and ggdist packages.
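To make the netCDF point concrete, here is a minimal sketch of the round trip I have in mind, assuming the fit has already been converted to an ArviZ InferenceData object named `idata` (the file name is a placeholder):

```python
import arviz as az

# Fitting step: persist everything (all InferenceData groups) once,
# as netCDF, which is ArviZ's native serialization format.
idata.to_netcdf("pyrenew_fit.nc")

# Any later step (diagnostics, tidying) reads the same file back
# without refitting and without writing a second netCDF.
idata = az.from_netcdf("pyrenew_fit.nc")
```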
@dylanhmorris @AFg6K7h4fhy2 open to comments and questions
Current version has received a LGTM from @dylanhmorris in a Teams discussion.
@damonbayer This looks really good! Just a few questions.
Thanks @kaitejohnson. I confess I haven't thought much about these questions until you asked them.
Are you planning on implementing this via a Makefile?
This will all be done in Azure, which, I think, is not really related to make. Perhaps you mean something more general or I am misunderstanding how things work.
What format will the ArviZ diagnostics be in? Is this a PDF with the calibration and forecasts for each state + convergence diagnostics?
I think there will be a short automated report with all the forecasts and a more detailed one with the forecasts and typical MCMC diagnostics (R-hat, ESS, etc.). Adding calibration and other diagnostics specific to the task at hand would be good too. I think this will be an HTML file generated in Quarto?
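As a rough illustration, the diagnostics portion of that Quarto report could run a Python chunk along these lines (the file name is a placeholder for the fitting step's output):

```python
import arviz as az

idata = az.from_netcdf("pyrenew_fit.nc")  # placeholder for the fit output

# Detailed report material: per-parameter trace plots plus
# the usual convergence table (r_hat, ess_bulk, ess_tail, mcse).
az.plot_trace(idata)
diagnostics = az.summary(idata, kind="diagnostics")
print(diagnostics)
```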
Will you have any diagnostic flags implemented? Hadn't thought about this yet. It would be good to lean on your experience from last season.
Will this be easily extendable to fitting multiple model types e.g. pyrenew with and without wastewater? That's the plan, but I can't say much more until I have more experience running it for a single model.
Where do the Azure self-hosted runners come in (just in the Fit model section)? Unclear. I need to hear more about how NNH is using them.
> This will all be done in Azure, which, I think, is not really related to make. Perhaps you mean something more general or I am misunderstanding how things work.
I guess what I mean is: is there any plan to use a pipelining tool of some sort that will cache different steps? That way you can make adjustments (e.g. excluding certain data points) and rerun the pipeline with only the downstream pieces getting updated. (This is what we used targets for last year, and despite it not playing nicely with Azure, it was really convenient for automating pipeline outputs.)
Like the idea of an HTML file generated in Quarto. I found it really helpful to have a few things to review in one place as a post-model-run, pre-send-off step to spot check each location.
> Hadn't thought about this yet. It would be good to lean on your experience from last season.
Per usual, I took NNH's lead on this and used their thresholds for R-hat, divergences, E-BFMI, etc., which are now defaults in the wastewater package's model flags: https://github.com/CDCgov/ww-inference-model/blob/9dd766b8da3cd661f7daeb5f6f6127786e4db5ec/R/model_diagnostics.R#L51 I don't think you need these exactly, but having some flags is helpful to know where to look in real time.
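For illustration, a minimal Python sketch of what such flags could look like on an ArviZ InferenceData object; the thresholds here are placeholders, not the wastewater package's defaults:

```python
import arviz as az

def diagnostic_flags(idata, rhat_max=1.05, ebfmi_min=0.2, max_divergences=0):
    """Return simple pass/fail flags for an ArviZ InferenceData fit."""
    # Worst-case R-hat across all posterior variables
    rhat = max(float(da.max()) for da in az.rhat(idata).data_vars.values())
    # Total divergent transitions across all chains
    n_div = int(idata.sample_stats["diverging"].sum())
    # Lowest per-chain energy-based Bayesian fraction of missing information
    ebfmi = float(az.bfmi(idata).min())
    return {
        "high_rhat": rhat > rhat_max,
        "divergences": n_div > max_divergences,
        "low_ebfmi": ebfmi < ebfmi_min,
    }
```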
> I guess what I mean is: is there any plan to use a pipelining tool of some sort that will cache different steps? That way you can make adjustments (e.g. excluding certain data points) and rerun the pipeline with only the downstream pieces getting updated. (This is what we used targets for last year, and despite it not playing nicely with Azure, it was really convenient for automating pipeline outputs.)
Seems like a good question for @dylanhmorris. I think adding functionality to kick off one script from another is trivial (e.g. if you already fit the models but want to change the forecast horizon, you could kick off the pipeline starting at the forecasting step). Maybe there are more sophisticated concepts in Azure that could make this easy to implement.
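A hypothetical sketch of that "start at a given step" idea; none of these step names or the `pipeline` module exist yet, this is just to illustrate the shape of it:

```python
import argparse

# Hypothetical step functions; each reads the previous step's outputs from disk.
from pipeline import prep_data, fit_model, forecast, tidy, score

STEPS = [("prep_data", prep_data), ("fit_model", fit_model),
         ("forecast", forecast), ("tidy", tidy), ("score", score)]

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--start-at", choices=[name for name, _ in STEPS],
                        default="prep_data",
                        help="Skip earlier steps whose outputs already exist")
    args = parser.parse_args()
    names = [name for name, _ in STEPS]
    # Run only the requested step and everything downstream of it.
    for _, step in STEPS[names.index(args.start_at):]:
        step()

if __name__ == "__main__":
    main()
```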
@kaitejohnson I have updated the diagram with the quarto reports idea.
@dylanhmorris I have updated the diagram based on our conversation about a step to tidy ArviZ data into a tabular format.
Hey @damonbayer I think the quantities of interest box should flow into the scoring box rather than the MCMC draws, the reason being that the quantities of interest include our observables from the retro data.
To be fair, I can imagine also scoring all parameters (e.g. when fitting on generated data).
From the f2f discussion, it was pointed out that "quantities of interest" doesn't mean "generated quantities"; it means summary statistics.
@SamuelBrand1 @dylanhmorris @AFg6K7h4fhy2 I have updated the diagram based on our f2f discussion. Please thumbs up this comment if it appears accurate or comment if it does not.
It has come to my attention that there is some ambiguity around the "Tidy (Python using forecasttools)" step. My intention is that this is (probably) several parquet files (one for each InferenceData group) that adhere to the tidybayes::tidy_draws format:

> A data frame (actually, a tibble) with a .chain column, .iteration column, .draw column, and one column for every variable
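A minimal sketch of what that step might do, assuming the netCDF from the fitting step; the file name and column conventions here are my assumptions, not settled forecasttools behavior:

```python
import arviz as az

idata = az.from_netcdf("pyrenew_fit.nc")  # placeholder file name

for group in idata.groups():
    ds = idata[group]
    if "chain" not in ds.dims:  # e.g. observed_data has no MCMC dims
        continue
    n_iterations = ds.sizes["draw"]
    df = ds.to_dataframe().reset_index()
    # tidybayes::tidy_draws expects 1-based .chain/.iteration columns and
    # a global 1-based .draw index running across chains.
    df[".chain"] = df["chain"] + 1
    df[".iteration"] = df["draw"] + 1
    df[".draw"] = df["chain"] * n_iterations + df["draw"] + 1
    df = df.drop(columns=["chain", "draw"])
    # NOTE: variables with extra dims (e.g. time) come out long, one row per
    # coordinate, rather than one column per element; forecasttools may
    # settle on a different long/wide convention.
    df.to_parquet(f"{group}.parquet")  # requires pyarrow or fastparquet
```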
This will always contain the most up-to-date draft of the pipeline.