european-modelling-hubs / covid19-forecast-hub-europe

European Covid-19 Forecast Hub.
https://covid19forecasthub.eu

Update validation scripts #11

Closed sbfnk closed 3 years ago

sbfnk commented 3 years ago

living here - also taking note of https://github.com/reichlab/covid19-forecast-hub/issues/2594

sbfnk commented 3 years ago

For the Germany/Poland hub especially https://github.com/epiforecasts/covid19-forecast-hub-europe/blob/main/code/validation/validate_filenames.py and https://github.com/epiforecasts/covid19-forecast-hub-europe/blob/main/code/validation/test-formatting.py

sbfnk commented 3 years ago

See here for the US version: https://github.com/reichlab/covid19-forecast-hub/wiki/Data-Validation (Germany/Poland version diverged in June)

sbfnk commented 3 years ago

Should include a check of the filename to match the submission day.

kathsherratt commented 3 years ago

Copying over here from slack channel discussion

Only the following need to be changed in the validation scripts:

  • The FIPS_CODES dict in covid19-forecast-hub-de/code/validation/covid19.py
  • The COUNTRIES list in covid19-forecast-hub-de/code/validation/test-formatting.py

We will need to decide which country code standard to use for some of this. Currently TBC.

kathsherratt commented 3 years ago

@sbfnk, re. the comment above about including a filename check to match the submission day: the current validation already checks that the filename matches the forecast_date column (function filename_match_forecast_date() in test-formatting.py).

Is this what you meant, or did you intend a check that the filename matches the date the PR is made (or is within the Thu-Mon range of accepted dates each week)?
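To make the distinction concrete, here is a minimal sketch of the first kind of check (filename date versus the forecast_date column). It is not the code in test-formatting.py: the function names are hypothetical, and it assumes filenames start with a YYYY-MM-DD date followed by a hyphen and that the CSV has a forecast_date column in ISO format.

```python
import csv
import io
import re
from datetime import date

# Assumption: filenames look like YYYY-MM-DD-<team>-<model>.csv
FILENAME_DATE = re.compile(r"^(\d{4}-\d{2}-\d{2})-")


def date_from_filename(filename):
    """Return the leading date in the filename, or None if missing or invalid."""
    match = FILENAME_DATE.match(filename)
    if match is None:
        return None
    try:
        return date.fromisoformat(match.group(1))
    except ValueError:
        # Matches the pattern but is not a real calendar date.
        return None


def mismatched_forecast_dates(filename, csv_text):
    """Return the set of forecast_date values that differ from the filename date."""
    expected = date_from_filename(filename)
    if expected is None:
        raise ValueError(f"no valid date at start of filename: {filename}")
    rows = csv.DictReader(io.StringIO(csv_text))
    return {
        row["forecast_date"]
        for row in rows
        if row["forecast_date"] != expected.isoformat()
    }
```

An empty set means the file is consistent with its name. This says nothing about when the PR was opened, which is the second kind of check.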

sbfnk commented 3 years ago

In the US they are working on "a check for submissions that come in with a date in the filename (i.e. the forecast_date) not equal to today or today+1. This will throw a warning and in general [they] will ask teams to resubmit their files with an updated forecast_date and possibly updated targets in their file." - this seems like a good idea. I think @nikosbosse has some views on this, too.
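A rough sketch of what such a "today or today+1" check might look like, purely for illustration. This is not the US hub's implementation; the function name and the wording of the warning are made up here.

```python
from datetime import date, timedelta


def forecast_date_warning(forecast_date, today):
    """Return a warning string if forecast_date is not today or today + 1, else None."""
    allowed = {today, today + timedelta(days=1)}
    if forecast_date in allowed:
        return None
    return (
        f"forecast_date {forecast_date.isoformat()} is not today "
        f"({today.isoformat()}) or today + 1; consider resubmitting."
    )
```

Returning a message rather than raising keeps this a warning, in line with the quoted description, so a reviewer can still decide whether to ask for a resubmission.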

nikosbosse commented 3 years ago

I have a very strong preference for requiring the date in every submission's filename to be exactly the last day of the submission period. This makes it easy and unambiguous to find out which week something was submitted for, and we can use the filename as the submission date without any fiddling.

We could think about allowing people to submit other things as well (if people want to use the repo as a public verification or something like that), but for the weekly forecasts I think that saves us some trouble down the line. Then if people submitted something on Saturday and something on Sunday, we know exactly which file is the right one because the filename is equal to the last day of the submission period.
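As a sketch of the stricter rule, assuming the submission period is the Thu-Mon range mentioned earlier in this thread so that its last day is a Monday (the function names are hypothetical, not existing validation code):

```python
from datetime import date, timedelta

# datetime.date.weekday(): Monday is 0, Sunday is 6
MONDAY = 0
THURSDAY = 3


def window_end(submitted_on):
    """Return the Monday closing the Thu-Mon window containing submitted_on.

    Returns None for Tuesday and Wednesday, which fall outside the assumed window.
    """
    weekday = submitted_on.weekday()
    if weekday == MONDAY:
        return submitted_on
    if weekday >= THURSDAY:
        return submitted_on + timedelta(days=7 - weekday)
    return None


def filename_date_is_valid(filename_date, submitted_on):
    """True only if the filename date is the last day of the current window."""
    return window_end(submitted_on) == filename_date
```

With this rule a file submitted on Saturday and another on Sunday would both have to carry the same Monday date, so the later one simply replaces the earlier one.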

seabbs commented 3 years ago

I agree.