This sounds like it could be due to the internal anomaly handling in EpiNow2, but I am not totally convinced, as all that does is set days with 0 cases to a local average.
I am not sure a separate package is needed just for data processing, though it might be useful for some of the processing tasks.
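To make the point concrete, here is a minimal sketch of a rule of that kind. It is not EpiNow2's code (EpiNow2 is an R package and its exact rule is at the link further down); the function name `fill_zero_days`, the 7-day window, and the use of a trailing mean are all assumptions for illustration. A rule like this only ever touches zero days, so non-zero values pass through unchanged.

```python
def fill_zero_days(cases, window=7):
    """Replace zero counts with the mean of up to `window` preceding days.

    Hypothetical helper for illustration only; not EpiNow2's implementation.
    """
    filled = list(cases)
    for i, value in enumerate(cases):
        if value == 0:
            # Use the raw preceding values as the local average.
            previous = cases[max(0, i - window):i]
            if previous:  # leave a leading zero alone: nothing to average
                filled[i] = round(sum(previous) / len(previous))
    return filled
```

For example, `fill_zero_days([10, 12, 0, 14], window=2)` fills only the third day, and a large spike in the input would be returned as-is.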
We need:
Raw data downloading and storage in a dated CSV file.
A single processing step that aggregates data from state level to US level, with the result then used everywhere.
Raw data processing with a basic anomaly correction (e.g. something like setting a value to the moving average if it falls outside 3 standard deviations in the data).
A flag for when anomaly correction has occurred, and some kind of reporting process for this.
Use of the anomaly-corrected, dated data for all downstream modelling work.
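As a rough sketch of the correction and flagging steps in the list above (an illustration under stated assumptions, not a proposed implementation): the function name `correct_anomalies`, the 7-day trailing window, and the choice to compute the window from already-corrected values are all hypothetical. The project itself is in R; Python is used here only to keep the sketch short and self-contained.

```python
from statistics import mean, stdev


def correct_anomalies(cases, window=7, threshold=3.0):
    """Return (corrected, flags) for a list of daily counts.

    A value is flagged when it lies more than `threshold` standard
    deviations from the mean of the preceding `window` days, and is
    then replaced by that mean. Hypothetical helper for illustration.
    """
    corrected = list(cases)
    flags = [False] * len(cases)
    for i in range(window, len(cases)):
        # Use corrected history so one spike does not distort later windows.
        history = corrected[i - window:i]
        centre = mean(history)
        spread = stdev(history)
        if spread > 0 and abs(cases[i] - centre) > threshold * spread:
            corrected[i] = round(centre)
            flags[i] = True
    return corrected, flags
```

The `flags` list is what a reporting step could consume, for example `[i for i, flagged in enumerate(flags) if flagged]` gives the positions of every corrected day, and `corrected` is the series that downstream modelling would read.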
Anomaly correction in EpiNow2: https://github.com/epiforecasts/EpiNow2/blob/d2b2aa6e76190000d5aad37e66f132f7c44d4644/R/create.R#L34
Originally posted by @seabbs in https://github.com/epiforecasts/covid-us-forecasts/issues/96#issuecomment-783311147