CDCgov / wastewater-informed-covid-forecasting

Wastewater-informed COVID-19 forecasting models submitted to the COVID-19 Forecast Hub
https://cdcgov.github.io/wastewater-informed-covid-forecasting/
Apache License 2.0

Remove dependency on `pcr_target_flowpop_lin` #137

Closed kaitejohnson closed 3 months ago

kaitejohnson commented 3 months ago

This PR is a quick fix for the fact that our `wweval` pipeline still depends on some functions in `cfaforecastrenewalww` that were written during the early stages of development. One of them is `init_subset_nwss_data()`, which selects columns from the NWSS data and filters and transforms them as needed.

In practice, the downstream modeling pipeline relies only on the "raw" inputs that submitters send to NWSS, not on calculated, transformed variables such as `pcr_target_flowpop_lin`. During development, however, we were considering an option that fit to flow-population-normalized genome copies per person. We have since removed that functionality from the downstream model, and we plan to build a pipeline that depends only on data submitted to NWSS, for both our eventual production pipeline and the planned in-platform DCIPHER tooling.

This is therefore a quick fix to remove any dependency on a transformed variable in the dataset. @hannahcohen4 ran into this dependency as an issue when testing a model run on an example raw submitted dataset.
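To illustrate the intent, here is a minimal Python sketch of a subsetting step that requires only raw submitted columns and ignores derived ones. It is not the code in `init_subset_nwss_data()` or anywhere in this repository. The function name `subset_raw_columns`, the `RAW_COLUMNS` tuple, and every column name other than `pcr_target_flowpop_lin` are assumptions chosen for illustration.

```python
# Hypothetical list of raw, submitter-provided columns (illustrative names,
# not the actual set of NWSS fields the pipeline selects).
RAW_COLUMNS = (
    "sample_collect_date",
    "wwtp_name",
    "pcr_target_avg_conc",
    "population_served",
)


def subset_raw_columns(rows, raw_columns=RAW_COLUMNS):
    """Keep only the raw columns from each row (a dict).

    Derived columns such as pcr_target_flowpop_lin are never required:
    they are dropped if present and not missed if absent.
    """
    subset = []
    for i, row in enumerate(rows):
        missing = [col for col in raw_columns if col not in row]
        if missing:
            raise KeyError(f"row {i} is missing raw column(s): {missing}")
        subset.append({col: row[col] for col in raw_columns})
    return subset


# A raw submission that lacks the derived column entirely.
raw_only = [
    {
        "sample_collect_date": "2024-01-01",
        "wwtp_name": "site_a",
        "pcr_target_avg_conc": 1200.0,
        "population_served": 50000,
    }
]

# The same record with the derived column attached.
with_derived = [dict(raw_only[0], pcr_target_flowpop_lin=3.4e6)]

# Both inputs yield the same subset, so the derived column is not a dependency.
same_result = subset_raw_columns(raw_only) == subset_raw_columns(with_derived)
```

With this shape, a dataset containing only submitted fields passes through unchanged, while a dataset missing a raw field fails early with a message naming the column.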