Closed (dajmcdon closed 3 years ago)
Tiny comment: we may as well expose get_covid_hub_forecast_names() as a public function; it would be useful to have. For consistency, I would change covid_hub in this name to covidhub, and I would make that change throughout (several argument names appear to contain covid_hub rather than covidhub, which you can catch by running grep "covid_hub" evalcast/R/*.R from the command line within the R-packages directory).
I am also confused by the output and documentation of this function in other respects: evalcast::get_covidhub_predictions("COVIDhub-baseline", as.Date("2020-09-07")) produces an unnamed list of 20 cards, some with 7-row forecast_distributions and some with 23-row forecast_distributions. Looking at the downloaded csv, there appear to be only 16 distinct targets though, or 12 "inc"-type targets; I am not sure why there are 20 cards or what these 20 cards are. I am also not sure whether my package versions are causing the unnamed list or other issues.
> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] evalcast_0.1.1 testthat_3.0.0 covidcast_0.3.0 tibble_3.0.4
[5] pipeR_0.6.1.3
loaded via a namespace (and not attached):
[1] zoo_1.8-8 tidyselect_1.1.0 remotes_2.2.0 purrr_0.3.4
[5] lattice_0.20-41 colorspace_1.4-1 vctrs_0.3.4 generics_0.1.0
[9] usethis_1.6.3 utf8_1.1.4 rlang_0.4.8 pkgbuild_1.1.0
[13] pillar_1.4.6 foreign_0.8-70 glue_1.4.2 withr_2.3.0
[17] sp_1.4-4 sessioninfo_1.1.1 lifecycle_0.2.0 stringr_1.4.0
[21] munsell_0.5.0 gtable_0.3.0 rvest_0.3.6 devtools_2.3.2
[25] memoise_1.1.0 labeling_0.4.2 callr_3.5.1 ps_1.4.0
[29] maptools_1.0-2 curl_4.3 fansi_0.4.1 Rcpp_1.0.5
[33] readr_1.4.0 backports_1.1.10 scales_1.1.1 desc_1.2.0
[37] pkgload_1.1.0 jsonlite_1.7.1 farver_2.0.3 MMWRweek_0.1.3
[41] fs_1.5.0 ggplot2_3.3.2 hms_0.5.3 digest_0.6.27
[45] stringi_1.5.3 processx_3.4.4 dplyr_1.0.2 rprojroot_1.3-2
[49] grid_3.5.0 cli_2.1.0 tools_3.5.0 magrittr_1.5
[53] crayon_1.3.4 tidyr_1.1.2 usmap_0.5.1 pkgconfig_2.0.3
[57] ellipsis_0.3.1 xml2_1.3.2 prettyunits_1.1.1 lubridate_1.7.9
[61] assertthat_0.2.1 httr_1.4.2 rstudioapi_0.11 R6_2.5.0
[65] compiler_3.5.0
> get_covidhub_predictions("COVIDhub-baseline", as.Date("2020-09-07"))
[[1]]
# A tibble: 3,142 x 2
location forecast_distribution
* <chr> <list>
1 01001 <tibble [7 × 2]>
2 01003 <tibble [7 × 2]>
3 01005 <tibble [7 × 2]>
4 01007 <tibble [7 × 2]>
5 01009 <tibble [7 × 2]>
6 01011 <tibble [7 × 2]>
7 01013 <tibble [7 × 2]>
8 01015 <tibble [7 × 2]>
9 01017 <tibble [7 × 2]>
10 01019 <tibble [7 × 2]>
# … with 3,132 more rows
[[2]]
# A tibble: 57 x 2
location forecast_distribution
* <chr> <list>
1 01 <tibble [7 × 2]>
2 02 <tibble [7 × 2]>
3 04 <tibble [7 × 2]>
4 05 <tibble [7 × 2]>
5 06 <tibble [7 × 2]>
6 08 <tibble [7 × 2]>
7 09 <tibble [7 × 2]>
8 10 <tibble [7 × 2]>
9 11 <tibble [7 × 2]>
10 12 <tibble [7 × 2]>
# … with 47 more rows
[[3]]
# A tibble: 57 x 2
location forecast_distribution
* <chr> <list>
1 01 <tibble [23 × 2]>
2 02 <tibble [23 × 2]>
3 04 <tibble [23 × 2]>
4 05 <tibble [23 × 2]>
5 06 <tibble [23 × 2]>
6 08 <tibble [23 × 2]>
7 09 <tibble [23 × 2]>
8 10 <tibble [23 × 2]>
9 11 <tibble [23 × 2]>
10 12 <tibble [23 × 2]>
# … with 47 more rows
[[4]]
# A tibble: 3,142 x 2
location forecast_distribution
* <chr> <list>
1 01001 <tibble [7 × 2]>
2 01003 <tibble [7 × 2]>
3 01005 <tibble [7 × 2]>
4 01007 <tibble [7 × 2]>
5 01009 <tibble [7 × 2]>
6 01011 <tibble [7 × 2]>
7 01013 <tibble [7 × 2]>
8 01015 <tibble [7 × 2]>
9 01017 <tibble [7 × 2]>
10 01019 <tibble [7 × 2]>
# … with 3,132 more rows
[[5]]
# A tibble: 57 x 2
location forecast_distribution
* <chr> <list>
1 01 <tibble [7 × 2]>
2 02 <tibble [7 × 2]>
3 04 <tibble [7 × 2]>
4 05 <tibble [7 × 2]>
5 06 <tibble [7 × 2]>
6 08 <tibble [7 × 2]>
7 09 <tibble [7 × 2]>
8 10 <tibble [7 × 2]>
9 11 <tibble [7 × 2]>
10 12 <tibble [7 × 2]>
# … with 47 more rows
[[6]]
# A tibble: 57 x 2
location forecast_distribution
* <chr> <list>
1 01 <tibble [23 × 2]>
2 02 <tibble [23 × 2]>
3 04 <tibble [23 × 2]>
4 05 <tibble [23 × 2]>
5 06 <tibble [23 × 2]>
6 08 <tibble [23 × 2]>
7 09 <tibble [23 × 2]>
8 10 <tibble [23 × 2]>
9 11 <tibble [23 × 2]>
10 12 <tibble [23 × 2]>
# … with 47 more rows
[[7]]
# A tibble: 3,142 x 2
location forecast_distribution
* <chr> <list>
1 01001 <tibble [7 × 2]>
2 01003 <tibble [7 × 2]>
3 01005 <tibble [7 × 2]>
4 01007 <tibble [7 × 2]>
5 01009 <tibble [7 × 2]>
6 01011 <tibble [7 × 2]>
7 01013 <tibble [7 × 2]>
8 01015 <tibble [7 × 2]>
9 01017 <tibble [7 × 2]>
10 01019 <tibble [7 × 2]>
# … with 3,132 more rows
[[8]]
# A tibble: 57 x 2
location forecast_distribution
* <chr> <list>
1 01 <tibble [7 × 2]>
2 02 <tibble [7 × 2]>
3 04 <tibble [7 × 2]>
4 05 <tibble [7 × 2]>
5 06 <tibble [7 × 2]>
6 08 <tibble [7 × 2]>
7 09 <tibble [7 × 2]>
8 10 <tibble [7 × 2]>
9 11 <tibble [7 × 2]>
10 12 <tibble [7 × 2]>
# … with 47 more rows
[[9]]
# A tibble: 57 x 2
location forecast_distribution
* <chr> <list>
1 01 <tibble [23 × 2]>
2 02 <tibble [23 × 2]>
3 04 <tibble [23 × 2]>
4 05 <tibble [23 × 2]>
5 06 <tibble [23 × 2]>
6 08 <tibble [23 × 2]>
7 09 <tibble [23 × 2]>
8 10 <tibble [23 × 2]>
9 11 <tibble [23 × 2]>
10 12 <tibble [23 × 2]>
# … with 47 more rows
[[10]]
# A tibble: 3,142 x 2
location forecast_distribution
* <chr> <list>
1 01001 <tibble [7 × 2]>
2 01003 <tibble [7 × 2]>
3 01005 <tibble [7 × 2]>
4 01007 <tibble [7 × 2]>
5 01009 <tibble [7 × 2]>
6 01011 <tibble [7 × 2]>
7 01013 <tibble [7 × 2]>
8 01015 <tibble [7 × 2]>
9 01017 <tibble [7 × 2]>
10 01019 <tibble [7 × 2]>
# … with 3,132 more rows
[[11]]
# A tibble: 57 x 2
location forecast_distribution
* <chr> <list>
1 01 <tibble [7 × 2]>
2 02 <tibble [7 × 2]>
3 04 <tibble [7 × 2]>
4 05 <tibble [7 × 2]>
5 06 <tibble [7 × 2]>
6 08 <tibble [7 × 2]>
7 09 <tibble [7 × 2]>
8 10 <tibble [7 × 2]>
9 11 <tibble [7 × 2]>
10 12 <tibble [7 × 2]>
# … with 47 more rows
[[12]]
# A tibble: 57 x 2
location forecast_distribution
* <chr> <list>
1 01 <tibble [23 × 2]>
2 02 <tibble [23 × 2]>
3 04 <tibble [23 × 2]>
4 05 <tibble [23 × 2]>
5 06 <tibble [23 × 2]>
6 08 <tibble [23 × 2]>
7 09 <tibble [23 × 2]>
8 10 <tibble [23 × 2]>
9 11 <tibble [23 × 2]>
10 12 <tibble [23 × 2]>
# … with 47 more rows
[[13]]
# A tibble: 3,142 x 2
location forecast_distribution
* <chr> <list>
1 01001 <tibble [7 × 2]>
2 01003 <tibble [7 × 2]>
3 01005 <tibble [7 × 2]>
4 01007 <tibble [7 × 2]>
5 01009 <tibble [7 × 2]>
6 01011 <tibble [7 × 2]>
7 01013 <tibble [7 × 2]>
8 01015 <tibble [7 × 2]>
9 01017 <tibble [7 × 2]>
10 01019 <tibble [7 × 2]>
# … with 3,132 more rows
[[14]]
# A tibble: 57 x 2
location forecast_distribution
* <chr> <list>
1 01 <tibble [7 × 2]>
2 02 <tibble [7 × 2]>
3 04 <tibble [7 × 2]>
4 05 <tibble [7 × 2]>
5 06 <tibble [7 × 2]>
6 08 <tibble [7 × 2]>
7 09 <tibble [7 × 2]>
8 10 <tibble [7 × 2]>
9 11 <tibble [7 × 2]>
10 12 <tibble [7 × 2]>
# … with 47 more rows
[[15]]
# A tibble: 3,142 x 2
location forecast_distribution
* <chr> <list>
1 01001 <tibble [7 × 2]>
2 01003 <tibble [7 × 2]>
3 01005 <tibble [7 × 2]>
4 01007 <tibble [7 × 2]>
5 01009 <tibble [7 × 2]>
6 01011 <tibble [7 × 2]>
7 01013 <tibble [7 × 2]>
8 01015 <tibble [7 × 2]>
9 01017 <tibble [7 × 2]>
10 01019 <tibble [7 × 2]>
# … with 3,132 more rows
[[16]]
# A tibble: 57 x 2
location forecast_distribution
* <chr> <list>
1 01 <tibble [7 × 2]>
2 02 <tibble [7 × 2]>
3 04 <tibble [7 × 2]>
4 05 <tibble [7 × 2]>
5 06 <tibble [7 × 2]>
6 08 <tibble [7 × 2]>
7 09 <tibble [7 × 2]>
8 10 <tibble [7 × 2]>
9 11 <tibble [7 × 2]>
10 12 <tibble [7 × 2]>
# … with 47 more rows
[[17]]
# A tibble: 3,142 x 2
location forecast_distribution
* <chr> <list>
1 01001 <tibble [7 × 2]>
2 01003 <tibble [7 × 2]>
3 01005 <tibble [7 × 2]>
4 01007 <tibble [7 × 2]>
5 01009 <tibble [7 × 2]>
6 01011 <tibble [7 × 2]>
7 01013 <tibble [7 × 2]>
8 01015 <tibble [7 × 2]>
9 01017 <tibble [7 × 2]>
10 01019 <tibble [7 × 2]>
# … with 3,132 more rows
[[18]]
# A tibble: 57 x 2
location forecast_distribution
* <chr> <list>
1 01 <tibble [7 × 2]>
2 02 <tibble [7 × 2]>
3 04 <tibble [7 × 2]>
4 05 <tibble [7 × 2]>
5 06 <tibble [7 × 2]>
6 08 <tibble [7 × 2]>
7 09 <tibble [7 × 2]>
8 10 <tibble [7 × 2]>
9 11 <tibble [7 × 2]>
10 12 <tibble [7 × 2]>
# … with 47 more rows
[[19]]
# A tibble: 3,142 x 2
location forecast_distribution
* <chr> <list>
1 01001 <tibble [7 × 2]>
2 01003 <tibble [7 × 2]>
3 01005 <tibble [7 × 2]>
4 01007 <tibble [7 × 2]>
5 01009 <tibble [7 × 2]>
6 01011 <tibble [7 × 2]>
7 01013 <tibble [7 × 2]>
8 01015 <tibble [7 × 2]>
9 01017 <tibble [7 × 2]>
10 01019 <tibble [7 × 2]>
# … with 3,132 more rows
[[20]]
# A tibble: 57 x 2
location forecast_distribution
* <chr> <list>
1 01 <tibble [7 × 2]>
2 02 <tibble [7 × 2]>
3 04 <tibble [7 × 2]>
4 05 <tibble [7 × 2]>
5 06 <tibble [7 × 2]>
6 08 <tibble [7 × 2]>
7 09 <tibble [7 × 2]>
8 10 <tibble [7 × 2]>
9 11 <tibble [7 × 2]>
10 12 <tibble [7 × 2]>
# … with 47 more rows
Yes I was confused by the dimension mismatch too but assumed it was just something I didn't understand about our code.
It's giving you cases/deaths/states/counties/ahead(1-5) (I think this accounts for the 20). You can check which is which with attributes() applied to one of the tibbles in the list. As to why some have 7 instead of 23 quantiles, I'm not sure. The expectation is that we submit 23, but I don't know why some combinations would result in only a subset of those. Is it filtering incorrectly from the csv?
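One way to see what each card holds is to tabulate, per card, the number of locations and the number of quantile rows next to its attributes. The following is a minimal base R sketch that uses small mock cards in place of the real download; the helper names make_card() and summarize_card() are hypothetical, and the attribute names geo_type and ahead on the mock objects are assumptions about how the cards are labelled.

```r
# Build a mock predictions card: a data frame with a list column of
# forecast distributions, plus attributes describing the card.
# Mock data only; attribute names are assumptions, not confirmed evalcast API.
make_card <- function(locations, n_quantiles, geo_type, ahead) {
  card <- data.frame(location = locations, stringsAsFactors = FALSE)
  card$forecast_distribution <- lapply(locations, function(loc) {
    data.frame(probs = seq_len(n_quantiles) / (n_quantiles + 1),
               quantiles = seq_len(n_quantiles))
  })
  attr(card, "geo_type") <- geo_type
  attr(card, "ahead") <- ahead
  card
}

cards <- list(
  make_card(c("01001", "01003", "01005"), 7, "county", 1),
  make_card(c("01", "02"), 7, "state", 1),
  make_card(c("01", "02"), 23, "state", 1)
)

# One summary row per card: its attributes, how many locations it covers,
# and how many rows the first forecast distribution has.
summarize_card <- function(card) {
  data.frame(
    geo_type = attr(card, "geo_type"),
    ahead = attr(card, "ahead"),
    n_locations = nrow(card),
    n_quantiles = nrow(card$forecast_distribution[[1]]),
    stringsAsFactors = FALSE
  )
}

card_summary <- do.call(rbind, lapply(cards, summarize_card))
print(card_summary)
```

With real cards, the same loop over attributes(card) would show which response, geo type, and ahead each of the 20 entries corresponds to.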
The plan is to remove this function and replace it with covidHubUtils. So if this one is buggy, it is better to fix it by importing from there and converting to a list of cards with appropriate attributes than to try to fix this function.
@brookslogan If you look at COVIDHub's technical README, they say that for the "N wk ahead inc case" target, 7 quantiles should be specified (actually they say 6, but this is a typo): in particular, c(0.025, 0.100, 0.250, 0.500, 0.750, 0.900, 0.975).
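To make the 7 versus 23 relationship concrete, here is a small sketch checking that the 7 case levels are a subset of a 23-level set. The 7 levels are the ones quoted above; the 23-level vector is an assumption (the commonly used set of 0.01, 0.025, 0.05 to 0.95 in steps of 0.05, 0.975, 0.99) and is not stated in this thread. The column names probs and quantiles in the mock distribution are likewise assumptions.

```r
# Quantile levels quoted above for the "N wk ahead inc case" target.
case_probs <- c(0.025, 0.100, 0.250, 0.500, 0.750, 0.900, 0.975)

# Assumed full set of 23 levels (not stated in this thread).
full_probs <- c(0.01, 0.025, seq(0.05, 0.95, by = 0.05), 0.975, 0.99)

# Compare on an integer scale (thousandths) to avoid floating-point mismatches.
to_key <- function(p) as.integer(round(p * 1000))

case_is_subset <- all(to_key(case_probs) %in% to_key(full_probs))

# Mock 23-row forecast distribution, then keep only the 7 case levels.
full_dist <- data.frame(probs = full_probs, quantiles = seq_along(full_probs))
case_dist <- full_dist[to_key(full_dist$probs) %in% to_key(case_probs), ]

print(c(n_full = nrow(full_dist), n_case = nrow(case_dist)))
```

Matching on rounded integers rather than on the raw doubles matters here because seq() can produce values that differ from the literal 0.9 or 0.75 in the last bits.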
A few follow-up comments:
If you want to select only the deaths target, you could use the following:
cards <- evalcast::get_covidhub_predictions("COVIDhub-baseline",
forecast_dates = as.Date("2020-09-07"),
response_data_source = "jhu-csse",
response_signal = "deaths_incidence_num")
This returns just 4 predictions cards. An alternative is to download everything as you've done and then use evalcast::filter_predictions() in a subsequent step.
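The download-then-filter route can also be sketched generically in base R, without relying on the signature of evalcast::filter_predictions() (not shown in this thread). The sketch below works on mock cards; the helper keep_cards() and the attribute names signal and ahead are hypothetical and only illustrate selecting cards by an attribute value.

```r
# Mock card with a few descriptive attributes.
# Attribute names here are assumptions for illustration only.
make_card <- function(signal, geo_type, ahead) {
  card <- data.frame(location = c("01", "02"), stringsAsFactors = FALSE)
  attr(card, "signal") <- signal
  attr(card, "geo_type") <- geo_type
  attr(card, "ahead") <- ahead
  card
}

cards <- list(
  make_card("deaths_incidence_num", "state", 1),
  make_card("deaths_incidence_num", "state", 2),
  make_card("confirmed_incidence_num", "state", 1),
  make_card("confirmed_incidence_num", "county", 1)
)

# Keep only the cards whose attribute `name` is identical to `value`.
keep_cards <- function(cards, name, value) {
  Filter(function(card) identical(attr(card, name), value), cards)
}

death_cards <- keep_cards(cards, "signal", "deaths_incidence_num")
length(death_cards)
```

Because identical() is strict about types, the value passed in should have the same type as the stored attribute (for example 1 rather than 1L if ahead is stored as a double).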
Your question about what these 20 cards represent is precisely the motivation for implementing a nice print function that displays the key attributes (such as ahead, etc.). Issue #222 will address this. As I describe in #98, we could see something like
forecaster name: CMU-TimeSeries; ahead: 2; incidence_period: "epiweek"; geo_type: "state"; forecast_date: 2020-10-01
Until the print method has been implemented, you can use the (not exported) function evalcast:::all_attr(). For example,
> evalcast:::all_attr(cards,"ahead")
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
[[4]]
[1] 4
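As a stopgap in the same spirit, the attributes of each card can be collected and pasted into a one-line description similar to the proposed print format. This is only a sketch on a mock card: collect_attr() and describe_card() are hypothetical helpers, not the implementation of all_attr() or of the planned print method, and the attribute names other than ahead are assumptions.

```r
# Mock card carrying the attributes mentioned in the proposed print format.
card <- data.frame(location = c("01", "02"), stringsAsFactors = FALSE)
attr(card, "name_of_forecaster") <- "CMU-TimeSeries"
attr(card, "ahead") <- 2
attr(card, "geo_type") <- "state"
attr(card, "forecast_date") <- as.Date("2020-10-01")
cards <- list(card)

# Pull one attribute out of every card in a list.
collect_attr <- function(cards, name) {
  lapply(cards, attr, which = name)
}

# Build a one-line description of a card from a chosen set of attributes.
# Attributes that are not set on the card are shown as "NA".
describe_card <- function(card,
                          fields = c("name_of_forecaster", "ahead",
                                     "geo_type", "forecast_date")) {
  values <- vapply(fields, function(f) {
    v <- attr(card, f)
    if (is.null(v)) "NA" else format(v)
  }, character(1))
  paste(paste0(fields, ": ", values), collapse = "; ")
}

aheads <- collect_attr(cards, "ahead")
description <- describe_card(cards[[1]])
print(description)
```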