cmu-delphi / covidcast

R and Python packages supporting Delphi's COVIDcast effort.

`get_zoltar_predictions` fails using default arguments #589

Open nmdefries opened 1 year ago

nmdefries commented 1 year ago


get_zoltar_predictions fails on some very common calls, including calls that use the default arguments (incidence_period = "epiweek" and all possible target signals), e.g.

> get_zoltar_predictions("CMU-TimeSeries", forecast_dates = "2022-07-18")
get_token(): POST:
get_resource(): GET:
get_resource(): GET:
[1] "Grabbing forecasts from Zoltar..."
Error: POST status was not 200. status_code=400, json_response=Invalid query. error_messages='["target with name not found. 
+        name=1 wk ahead inc hosp, valid names=['17 day ahead inc hosp', '17 wk ahead cum death', '17 wk ahead inc death', 
+        '18 day ahead cum death', '18 day ahead inc death', '2 wk ahead inc case', '101 day ahead inc hosp', ...

See for context.

This error is caused by improper target construction. Hospitalizations are forecast on a daily basis (1 day ahead inc hosp). When the hosp signal is requested and the epiweek incidence_period is selected, we construct invalid weekly hospitalization targets. This happens both when incidence_period = "epiweek" and when incidence_period = c("epiweek", "day"), the default value, which the function reduces to incidence_period = "epiweek" via match.arg.
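To illustrate the default-argument behavior described above, here is a minimal sketch of how base R's match.arg collapses an untouched default vector to its first element. resolve_period is a hypothetical stand-in for illustration, not the package's actual code.

```r
# Hypothetical stand-in for the argument handling described above;
# this is not the package's actual implementation.
resolve_period <- function(incidence_period = c("epiweek", "day")) {
  # When the argument is identical to the full vector of choices,
  # match.arg() returns only the first choice.
  match.arg(incidence_period)
}

resolve_period()                     # default vector collapses to "epiweek"
resolve_period(c("epiweek", "day"))  # passing the default explicitly does the same
resolve_period("day")                # a single valid choice is returned unchanged
```

So a caller who leaves incidence_period at its default never requests daily targets, even though the default value lists both periods.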

The same error happens for cases and deaths if incidence_period is set to "day".
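One possible way to avoid building invalid targets is to record which time unit each signal is forecast in and skip mismatched combinations. The sketch below is only an illustration: target_spec and build_targets are hypothetical names, the hospitalization signal name is assumed, and the unit assigned to each signal is an assumption based on the description above that should be checked against the valid target names Zoltar reports.

```r
# Assumed lookup of the time unit and target suffix for each signal.
# The "confirmed_admissions_covid_1d" name and the unit assignments are
# assumptions for illustration, not taken from the package source.
target_spec <- list(
  confirmed_incidence_num       = list(unit = "wk",  suffix = "inc case"),
  deaths_incidence_num          = list(unit = "wk",  suffix = "inc death"),
  deaths_cumulative_num         = list(unit = "wk",  suffix = "cum death"),
  confirmed_admissions_covid_1d = list(unit = "day", suffix = "inc hosp")
)

# Hypothetical helper: build Zoltar-style target names, skipping any
# signal whose forecast unit does not match the requested period.
build_targets <- function(signals, incidence_period, aheads = 1:4) {
  requested_unit <- if (incidence_period == "epiweek") "wk" else "day"
  targets <- lapply(signals, function(s) {
    spec <- target_spec[[s]]
    if (is.null(spec)) stop("Unknown signal: ", s)
    if (spec$unit != requested_unit) {
      warning("Skipping ", s, ": not forecast per ", incidence_period)
      return(character(0))
    }
    paste(aheads, spec$unit, "ahead", spec$suffix)
  })
  # as.character() turns an empty result (NULL) into character(0).
  as.character(unlist(targets, use.names = FALSE))
}

build_targets(c("deaths_incidence_num", "confirmed_admissions_covid_1d"),
              incidence_period = "epiweek", aheads = 1:2)
```

With this approach a mismatched signal produces a warning and no targets, rather than a query that Zoltar rejects with a 400 status.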

Example calls

> get_zoltar_predictions("CMU-TimeSeries", forecast_dates = "2020-07-20", 
+        signal = c("confirmed_incidence_num", "deaths_incidence_num", "deaths_cumulative_num"), 
+        incidence_period = "epiweek")
get_token(): POST:
get_resource(): GET:
get_resource(): GET:
[1] "Grabbing forecasts from Zoltar..."
# A tibble: 4,992 × 10                                                                                                                              
   ahead geo_value quantile value forecaster     forecast_date data_source signal               target_end_date incidence_period
   <int> <chr>        <dbl> <dbl> <chr>          <date>        <chr>       <chr>                <date>          <chr>
 1     1 al          NA       172 CMU-TimeSeries 2020-07-20    jhu-csse    deaths_incidence_num 2020-07-25      epiweek         
 2     1 al           0.01     46 CMU-TimeSeries 2020-07-20    jhu-csse    deaths_incidence_num 2020-07-25      epiweek         
 3     1 al           0.025    67 CMU-TimeSeries 2020-07-20    jhu-csse    deaths_incidence_num 2020-07-25      epiweek         
 4     1 al           0.05     83 CMU-TimeSeries 2020-07-20    jhu-csse    deaths_incidence_num 2020-07-25      epiweek         
 5     1 al           0.1     104 CMU-TimeSeries 2020-07-20    jhu-csse    deaths_incidence_num 2020-07-25      epiweek         
 6     1 al           0.15    117 CMU-TimeSeries 2020-07-20    jhu-csse    deaths_incidence_num 2020-07-25      epiweek         
 7     1 al           0.2     128 CMU-TimeSeries 2020-07-20    jhu-csse    deaths_incidence_num 2020-07-25      epiweek         
 8     1 al           0.25    137 CMU-TimeSeries 2020-07-20    jhu-csse    deaths_incidence_num 2020-07-25      epiweek         
 9     1 al           0.3     143 CMU-TimeSeries 2020-07-20    jhu-csse    deaths_incidence_num 2020-07-25      epiweek         
10     1 al           0.35    153 CMU-TimeSeries 2020-07-20    jhu-csse    deaths_incidence_num 2020-07-25      epiweek         
# … with 4,982 more rows
> get_zoltar_predictions("CMU-TimeSeries", forecast_dates = "2020-07-20", 
+        signal = c("confirmed_incidence_num", "deaths_incidence_num", "deaths_cumulative_num"), 
+        incidence_period = "day")
get_token(): POST:
get_resource(): GET:
get_resource(): GET:
[1] "Grabbing forecasts from Zoltar..."
Error: POST status was not 200. status_code=400, json_response=Invalid query. error_messages='["target with name not found. 
+        name=1 day ahead inc case, valid names=['17 day ahead inc hosp', '17 wk ahead cum death', '17 wk ahead inc death', 
+        '18 day ahead cum death', '18 day ahead inc death', '2 wk ahead inc case', '101 day ahead inc hosp', '102 day ahead 
+        cum death', '102 day ahead inc death', '102 day ahead inc hosp', 

Comparison to get_covidhub_predictions

This differs from get_covidhub_predictions's behavior. get_covidhub_predictions interprets incidence_period = c("epiweek", "day"), the default setting, as-is (in contrast to the documentation) and fetches predictions for both period types. This means that the two functions are not interchangeable.
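For comparison, fetching both period types could be approximated by calling a fetch function once per period and combining the results. This is only a sketch of that idea: fetch_all_periods is a hypothetical wrapper, and fetch_fn stands for any function that accepts an incidence_period argument and returns a data frame.

```r
# Hypothetical wrapper: call `fetch_fn` once per incidence period and
# row-bind whatever succeeds. `fetch_fn` is assumed to accept an
# `incidence_period` argument and to return a data frame.
fetch_all_periods <- function(fetch_fn, ...,
                              incidence_period = c("epiweek", "day")) {
  args <- list(...)
  results <- lapply(incidence_period, function(p) {
    tryCatch(
      do.call(fetch_fn, c(args, list(incidence_period = p))),
      error = function(e) {
        # Report the failing period and continue with the others.
        message("No predictions for incidence_period = \"", p, "\": ",
                conditionMessage(e))
        NULL
      }
    )
  })
  # NULL entries are ignored by rbind(); the result is NULL if every
  # period failed.
  do.call(rbind, results)
}
```

A call such as fetch_all_periods(get_zoltar_predictions, "CMU-TimeSeries", forecast_dates = "2022-07-18") would then attempt both periods, assuming the function accepts those arguments as in the examples above.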

Expected behavior

dshemetov commented 1 year ago

Just want to link this issue with #99 and #586. The first one looks like a record of attempts to make Zoltar supersede our scraping functions, the second is the bug we had a few months back.