epiforecasts / covid-us-forecasts

Forecasting Covid-19 in the US
https://epiforecasts.io/covid-us-forecasts/submissions/report.html
MIT License

Data handling #97

Open seabbs opened 3 years ago

seabbs commented 3 years ago

This sounds like it could be due to the internal anomaly handling in EpiNow2, but I am not totally convinced, as all that does is set days with 0 cases to a local average.
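For reference, a minimal sketch of what "setting days with 0 cases to a local average" could look like. This is not the EpiNow2 code: the function name `fill_zero_days`, the 7-day trailing window, and the choice to average only the non-zero days are all assumptions made for illustration.

```python
from statistics import mean


def fill_zero_days(cases, window=7):
    """Replace zero-count days with a local average of recent non-zero counts.

    `window` (an assumed default of 7 days) is the length of the trailing
    window, ending on the zero day itself, used to compute the average.
    """
    filled = list(cases)
    for i, value in enumerate(cases):
        if value == 0:
            # Non-zero raw counts in the trailing window ending on day i.
            recent = [c for c in cases[max(0, i - window + 1): i + 1] if c > 0]
            if recent:
                filled[i] = round(mean(recent))
            # If every day in the window is zero, the value is left as 0.
    return filled
```

Because a rule like this only touches days that are exactly 0, it would not alter large spikes or partial-reporting dips, which is why it seems unlikely to be the full explanation.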

I am not sure a separate package is required just for data processing, though it could potentially be useful for some of the processing tasks.

We need:

  1. Raw data downloading and storage in a dated csv file.
  2. A single processing of data from state level to US level that is then used everywhere.
  3. Raw data processing with a basic anomaly correction (i.e. something like setting a value to the moving average if it falls outside 3 standard deviations in the data).
  4. A flag for when anomaly correction has occurred and some kind of reporting process for this.
  5. Use anomaly corrected dated data for all downstream modelling work.
  6. Use raw dated data for plotting.
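Steps 2 to 4 above could be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed implementation: the function names `aggregate_states` and `correct_anomalies`, the 7-day trailing window, and measuring the 3 standard deviations against the previous `window` corrected values are all choices made here for the example.

```python
from statistics import mean, stdev


def aggregate_states(state_cases):
    """Sum per-state daily counts into a single US-level series.

    `state_cases` maps a state name to a list of daily counts. All lists
    are assumed to be aligned on the same dates and of equal length.
    """
    return [sum(day) for day in zip(*state_cases.values())]


def correct_anomalies(raw, window=7, n_sd=3.0):
    """Return (corrected, flags) for a daily series.

    A value is replaced by the trailing moving average when it lies more
    than `n_sd` standard deviations from that average. The average and
    standard deviation use the previous `window` corrected values, so a
    single spike does not distort the checks on the days after it.
    """
    corrected, flags = [], []
    for value in raw:
        history = corrected[-window:]
        centre = None
        is_anomaly = False
        if len(history) == window:
            centre = mean(history)
            spread = stdev(history)
            is_anomaly = spread > 0 and abs(value - centre) > n_sd * spread
        corrected.append(centre if is_anomaly else value)
        flags.append(is_anomaly)
    return corrected, flags


# Hypothetical example data: one obvious reporting spike on day index 7.
us_raw = [100, 102, 98, 101, 99, 103, 97, 1000, 100]
us_corrected, us_flags = correct_anomalies(us_raw)
flagged_days = [i for i, flagged in enumerate(us_flags) if flagged]
```

Keeping `us_raw`, `us_corrected` and `us_flags` side by side would let the modelling code read the corrected series, the plots read the raw series, and a report list the entries in `flagged_days`.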

Anomaly correction in EpiNow2:

https://github.com/epiforecasts/EpiNow2/blob/d2b2aa6e76190000d5aad37e66f132f7c44d4644/R/create.R#L34

Originally posted by @seabbs in https://github.com/epiforecasts/covid-us-forecasts/issues/96#issuecomment-783311147

seabbs commented 3 years ago

@nikosbosse @kathsherratt @sbfnk