cmu-delphi / covid-19-forecast

Production code for Delphi's COVID-19 case and death forecasts.

Validate `target_end_date`s, `geo_value`s in `format_predictions_for_reichlab_submission` #126

Open brookslogan opened 2 years ago

brookslogan commented 2 years ago

Validate `target_end_date`s: check that each `target_end_date` matches the value we would calculate from the `forecast_date`, `ahead`, and `incidence_period` columns.
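A rough sketch of what this check could look like, in base R. Both helper names (`expected_target_end_date`, `validate_target_end_dates`) are hypothetical and not existing functions in this repo. The epiweek rule is an assumption based on the COVID-19 Forecast Hub convention: epiweeks run Sunday through Saturday, and a forecast made on a Sunday or Monday counts the current epiweek as "1 ahead", while later weekdays count the following epiweek.

```r
## Hypothetical helper: recompute the target end date from the other columns.
## Assumes the Forecast Hub convention described above; adjust if the package
## uses a different rule.
expected_target_end_date <- function(forecast_date, ahead, incidence_period) {
  forecast_date <- as.Date(forecast_date)
  ## Day of week as 0 (Sunday) through 6 (Saturday); locale-independent.
  dow <- as.integer(format(forecast_date, "%w"))
  sunday_of_week <- forecast_date - dow
  ## Sunday/Monday forecasts: "1 epiweek ahead" is the current epiweek.
  week_offset <- ifelse(dow <= 1, 0, 1)
  epiweek_end <- sunday_of_week + 7 * (week_offset + ahead - 1) + 6
  day_end <- forecast_date + ahead
  ## ifelse() drops the Date class, so convert back at the end.
  ## Unknown incidence periods yield NA.
  result <- ifelse(incidence_period == "epiweek", epiweek_end,
                   ifelse(incidence_period == "day", day_end, NA))
  as.Date(result, origin = "1970-01-01")
}

## Hypothetical validator: error if any row disagrees with the recomputed date.
validate_target_end_dates <- function(predictions_cards) {
  expected <- expected_target_end_date(predictions_cards$forecast_date,
                                       predictions_cards$ahead,
                                       predictions_cards$incidence_period)
  actual <- predictions_cards$target_end_date
  mismatched <- is.na(expected) | is.na(actual) | actual != expected
  if (any(mismatched)) {
    stop(sum(mismatched), " row(s) have a `target_end_date` that does not ",
         "match `forecast_date`, `ahead`, and `incidence_period`.",
         call. = FALSE)
  }
  invisible(predictions_cards)
}

## Thursday forecast date: 1 epiweek ahead ends on Saturday 2020-01-11.
expected_target_end_date(as.Date("2020-01-02"), 1, "epiweek")
```

Under this assumed rule, a Thursday forecast date of 2020-01-02 with `ahead = 1` and `incidence_period = "epiweek"` gives 2020-01-11, not `forecast_date + 7`, so cards built with the latter would be rejected.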

Validate `geo_value`s: specifically, fail on, or properly handle, this case:

library(dplyr)

## (This pcard1 is from the formatting tests)
pcard1 <- tibble(
  ahead = rep(c(1, 2, 3, 4), each = 5),
  geo_value = "pa",
  quantile = rep(c(0.1, 0.4, 0.5, 0.6, 0.9), 4),
  value = 1:20,
  forecaster = "a",
  forecast_date = as.Date("2020-01-02"),
  data_source = "source",
  signal = rep(c("confirmed_incidence_num", "confirmed_incidence_num",
                 "deaths_incidence_num", "confirmed_admissions_covid_1d"),
               each = 5),
  target_end_date = rep(as.Date(c("2020-01-09", "2020-01-16", "2020-01-23", "2020-01-06")), each = 5),
  incidence_period = c(rep("epiweek", 15), rep("day", 5))
)
class(pcard1) <- c("predictions_cards", class(pcard1))

## user tries to do part of the formatting operations (abbr -> fips) themselves
pcard2 <- pcard1 %>% mutate(geo_value = evalcast:::abbr_2_fips(geo_value))
## and gets NA locations as a result:
out2 <- format_predictions_for_reichlab_submission(pcard2)
count(out2, location)
## # A tibble: 1 × 2
##   location     n
##   <chr>    <int>
## 1 NA          20

## we instead would want an error, or output matching the following:
out1 <- format_predictions_for_reichlab_submission(pcard1)
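For the "fail on" option, one possibility is to reject any `geo_value` that is not a recognized lowercase abbreviation before the abbreviation-to-FIPS conversion runs. A minimal sketch: `validate_geo_values()` and its default `allowed` set are hypothetical, and the set of locations the formatter actually supports may be different.

```r
## Hypothetical helper: error on geo_values that are not lowercase abbreviations.
## The default `allowed` set (50 states plus "dc" and "us") is an assumption.
validate_geo_values <- function(geo_value,
                                allowed = c(tolower(datasets::state.abb),
                                            "dc", "us")) {
  bad <- unique(geo_value[is.na(geo_value) | !(geo_value %in% allowed)])
  if (length(bad) > 0) {
    stop("Unrecognized `geo_value`s (expected lowercase abbreviations): ",
         paste(bad, collapse = ", "), call. = FALSE)
  }
  invisible(geo_value)
}

validate_geo_values("pa")    # passes silently
## validate_geo_values("42") # would raise an error instead of producing NA
```

Called at the top of the formatting function, a check like this would turn the already-converted FIPS input above into an immediate error rather than `NA` locations in the submission.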