Open AFg6K7h4fhy2 opened 1 month ago
NOTE: Click `edited` above to see earlier versions of, or corrections to, the diagram below.
Possible pipeline:
%%{init: {"theme": "neutral", "themeVariables": { "fontFamily": "Iosevka", "fontSize": "25px", "lineColor": "#808b96", "arrowheadColor": "#808b96", "edgeStrokeWidth": "10px", "arrowheadLength": "20px"}}}%%
flowchart TD
A1[COVID-19 Data _from forecasttools_] --> A4[NumPyro Model]
A2[Influenza Data _from forecasttools_] --> A4[NumPyro Model]
A3[External Dataset] --> A4[NumPyro Model]
A4[NumPyro Model] -->|_arviz.from_numpyro_| A5[Forecast As InferenceData Object w/o Dates]
A5[Forecast As InferenceData Object w/o Dates] -->|_Add Dates To InferenceData_ - done| A6[InferenceData Object w/ Dates]
A6[InferenceData Object w/ Dates] -->|_Convert To Tidy-Like Dataframe_ - done| A7[Polars Forecast Dataframe w/ Draws]
A7[Polars Forecast Dataframe w/ Draws] -->|_Convert To Hubverse Formatted Dataframe_ - done| A8[FluSight Submission Dataframe]
A7[Polars Forecast Dataframe w/ Draws] -->|_Convert To ScoringUtils Formatted Dataframe_ - in progress| A9[ScoringUtils DataFrame]
A7[Polars Forecast Dataframe w/ Draws] -->|_Save_| A10[Parquet File]
A8[FluSight Submission Dataframe] -->|_Save_| A11[Parquet File]
A9[ScoringUtils DataFrame] -->|_Save_| A12[Parquet File]
A8[FluSight Submission Dataframe] -->|_Convert To ScoringUtils Formatted Dataframe_ - in progress| A9[ScoringUtils DataFrame]
A12[Parquet File] -->|_Get scores in R_| A13[Forecast Scores]
A11[Parquet File] -->|_Model Forecast Hypothesis Testing_| A14[Model Comparison Report]
B1[Pulled Parquet Hubverse Submissions] -->|_Model Forecast Hypothesis Testing_| A14[Model Comparison Report]
linkStyle default stroke: #808b96
linkStyle default stroke-width: 2.0px
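To make the "Add Dates" and "Convert To Tidy-Like Dataframe" steps concrete, here is a minimal sketch in plain Python. It is not the `forecasttools` API: the function name `add_dates_and_tidy`, the row keys (`draw`, `date`, `value`), and the example numbers are all hypothetical, and a list of dicts stands in for the Polars dataframe.

```python
from datetime import date, timedelta


def add_dates_and_tidy(draws, start_date, step_days=1):
    """Turn a draws-by-time matrix into long ("tidy") rows with real dates.

    draws: list of posterior draws, each a list with one value per time step.
    start_date: calendar date of the first time step.
    step_days: spacing between time steps, in days (hypothetical parameter).
    """
    rows = []
    for draw_idx, series in enumerate(draws):
        for t, value in enumerate(series):
            rows.append(
                {
                    "draw": draw_idx,
                    # Replace the integer time index with a calendar date.
                    "date": start_date + timedelta(days=t * step_days),
                    "value": value,
                }
            )
    return rows


# Made-up example: 2 draws of a 3-day forecast starting on 2024-11-11.
tidy = add_dates_and_tidy(
    [[10.0, 12.0, 15.0], [11.0, 13.0, 14.0]],
    date(2024, 11, 11),
)
```

Each row carries one draw at one date, which is the shape the later Hubverse and ScoringUtils conversions in the diagram both start from.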
@dylanhmorris Would appreciate feedback on this (possibly including your own mental model of the workflow as another diagram). Also, are the arrows visible with your GitHub appearance settings? They were visible for me on a high-contrast white background but not on another setting.
I can see how the Convert To ScoringUtils Ready DataFrame can come from some intermediate step involved in Convert To FluSight Submission.
@SamuelBrand1 Would appreciate a check in on this as well, Sam.
The author will flesh out this comment more during the sprint [November 11, November 22] and is simply adding what exists here as a placeholder, so as not to lose any writing.
Both comments https://github.com/CDCgov/forecasttools-py/issues/16#issuecomment-2432415729 and https://github.com/CDCgov/forecasttools-py/issues/16#issuecomment-2432550848 still stand unaddressed.
Some thoughts: I believe `forecasttools-py` could come to facilitate aspects of pre- and post-processing in the Real Time Monitoring (hereafter RTM) branch's pipelines. Presently, the utilities offered by `forecasttools-py` cover narrow needs of the Short Term Forecasts team's workflows. These workflows include formatting NumPyro forecast model output into Hubverse's submission format. At present, `pyrenew-hew` has utilities for formatting parts of `az.InferenceData` so that they are ready for `tidy_draws` (and `spread_draws`) in `tidybayes` and for use with R's `scoringutils`. Changes can be made in `forecasttools` so that the user has to write as little post-processing (forecast scoring) code as possible. #36 and #9 exist in this regard.
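As an illustration of the kind of post-processing a user would otherwise write by hand, the sketch below summarizes tidy draws into quantile rows in a Hubverse-like layout. It is an assumption-laden example, not `forecasttools` code: the function names, the input row keys, and the example location and target strings are hypothetical, the task ID columns shown are only a guess at what a given hub requires (a real hub may also need others, such as a horizon), and the quantile rule is simple linear interpolation.

```python
from datetime import date


def quantile(sorted_values, q):
    """Linear-interpolation quantile of pre-sorted values, for 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def draws_to_quantile_rows(tidy_rows, quantile_levels, reference_date, location, target):
    """Collapse per-draw rows into one row per (date, quantile level)."""
    # Group draw values by forecast date.
    by_date = {}
    for row in tidy_rows:
        by_date.setdefault(row["date"], []).append(row["value"])

    out = []
    for target_end_date in sorted(by_date):
        values = sorted(by_date[target_end_date])
        for q in quantile_levels:
            out.append(
                {
                    "reference_date": reference_date,
                    "target": target,
                    "target_end_date": target_end_date,
                    "location": location,
                    "output_type": "quantile",
                    "output_type_id": q,
                    "value": quantile(values, q),
                }
            )
    return out


# Made-up example: 5 draws for a single forecast date.
example_draws = [
    {"draw": i, "date": date(2024, 11, 16), "value": v}
    for i, v in enumerate([30.0, 10.0, 50.0, 20.0, 40.0])
]
quantile_rows = draws_to_quantile_rows(
    example_draws,
    quantile_levels=[0.25, 0.5, 0.75],
    reference_date=date(2024, 11, 16),
    location="US",  # illustrative value only
    target="wk inc flu hosp",  # illustrative value only
)
```

Wrapping steps like this behind one call is the sort of change that would keep user-written post-processing code to a minimum.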
Do something akin to the following for `forecasttools`: