stephenturner closed 3 years ago
This needs to be reopened after looking into #26. The first step in this pipeline gets data, then begins modeling. There are arguments that let you pass source and granularity to the get_* functions, but there's a built-in assumption that you have US-level data -- if you pass granularity=state, you'll now have a state column. The join will then need to use by = c("epiyear", "epiweek", "state") (similar for county), and I imagine everything else may fail downstream.
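To make the join problem concrete, here is a minimal sketch assuming the join is done with dplyr. The data frames `cases` and `deaths` and all their values are made up for illustration and are not taken from the package.

```r
library(dplyr)

# Hypothetical state-level data; column names follow the issue text,
# the values are invented for illustration.
cases <- data.frame(
  epiyear = c(2021, 2021, 2021, 2021),
  epiweek = c(1, 1, 2, 2),
  state   = c("VA", "MD", "VA", "MD"),
  icases  = c(100, 80, 120, 90)
)
deaths <- data.frame(
  epiyear = c(2021, 2021, 2021, 2021),
  epiweek = c(1, 1, 2, 2),
  state   = c("VA", "MD", "VA", "MD"),
  ideaths = c(5, 3, 6, 4)
)

# Joining on the national-level keys only: every row matches each state
# in the same week, so rows are duplicated and the state column is split
# into state.x / state.y. Recent dplyr versions also warn about this.
national_keys <- inner_join(cases, deaths, by = c("epiyear", "epiweek"))

# Including state in the keys keeps one row per state-week.
state_keys <- inner_join(cases, deaths, by = c("epiyear", "epiweek", "state"))

nrow(national_keys)
nrow(state_keys)
```

The row counts differ because the national-level keys no longer identify a unique row once a state column is present.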
Options/thoughts:
forecast_pipeline_usa, remove the arguments for granularity, and close this issue again for now
@stephenturner i might be oversimplifying ... but what if in the get_* functions we changed the name of the column that contains state (or county) to be "location" ... and added "location" as "US" for granularity = "national"?
i think that would make the whole pipeline more flexible. might even be able to do the forecasting with a group_by(location) even for national forecasts ... i think?
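A rough sketch of the "location" idea, assuming dplyr. The helper `add_location()`, the example data frames, and their values are all hypothetical; the real get_* functions may return different columns, and the per-location mean only stands in for an actual forecasting step.

```r
library(dplyr)

# Hypothetical outputs of the get_* functions at two granularities.
national <- data.frame(
  epiyear = c(2021, 2021),
  epiweek = c(1, 2),
  icases  = c(180, 210)
)
state <- data.frame(
  epiyear = c(2021, 2021, 2021, 2021),
  epiweek = c(1, 1, 2, 2),
  state   = c("VA", "MD", "VA", "MD"),
  icases  = c(100, 80, 120, 90)
)

# Hypothetical helper: give every granularity the same "location" column.
add_location <- function(dat, granularity) {
  if (granularity == "national") {
    # National data has no location column yet, so add a constant one.
    mutate(dat, location = "US")
  } else {
    # granularity is "state" or "county": rename that column to "location".
    rename(dat, location = all_of(granularity))
  }
}

combined <- bind_rows(
  add_location(national, "national"),
  add_location(state, "state")
)

# The same grouped code now runs for any granularity.
summary_by_location <- combined %>%
  group_by(location) %>%
  summarise(mean_icases = mean(icases), .groups = "drop")
```

With a shared "location" column, the join keys could also be the same at every granularity, e.g. c("epiyear", "epiweek", "location").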
so after thinking through this some more ... if we're adding other models / granularity this pipeline function could get unwieldy and not altogether that useful.
assigning this issue to myself to remove the function from the package. in its place i'll put together a pipeline script that will generate, prep, and validate a submission that (for now) includes national and state level TS
Let's write a single function that strings together everything needed for the forecast we want to produce. For now this will be usa, 1-4 wk ahead, for ideaths, icases, and cdeaths. Much will be hard-coded at first and will need to change as we develop.