Closed vpnagraj closed 3 years ago
@chulmelowe when you're ready to share some example code drop it in a comment like so:
library(tidyverse)
library(focustools)
## Get national data
national <- inner_join(
get_cases( source="jhu", granularity="national"),
get_deaths(source="jhu", granularity="national"),
by = c("location", "epiyear", "epiweek"))
## Get state data
state <- inner_join(
get_cases( source="jhu", granularity="state"),
get_deaths(source="jhu", granularity="state"),
by = c("location", "epiyear", "epiweek"))
## combine US and state data
usafull <-
bind_rows(national, state) %>%
filter(location %in% c("US", stringr::str_pad(1:56, width=2, pad="0"))) %>%
make_tsibble() %>%
filter(monday>"2020-03-09")
Here's some example code using the wis function to calculate the weighted interval and relative weighted interval scores:
library(tidyverse)
library(focustools)
library(covidHubUtils)
library(fabletools)
library(fable)
# Read in data for forecasting
usac <- focustools::get_cases()
usad <- focustools::get_deaths()
usa <- dplyr::inner_join(usac, usad, by = c("location", "epiyear", "epiweek")) %>%
focustools::make_tsibble(chop = TRUE)
# Chop off the last four weeks to use as a scoring data set
dates <- sort(unique(usa$monday), decreasing = TRUE)
train_set <- usa %>%
filter(monday %in% dates[5:length(dates)])
# Fit models for incident cases and incident deaths
fit_icases <- train_set %>%
model(arima = ARIMA(icases, stepwise = F, approximation = F))
fit_ideaths <- train_set %>%
model(linear_caselag3 = TSLM(ideaths ~ lag(icases, 3)))
# Forecast incident cases, incident deaths, and cumulative deaths
forecast_icases <- ts_forecast(fit_icases, outcome = "icases", horizon = 4)
future_cases <- ts_futurecases(train_set, forecast_icases, horizon = 4)
forecast_ideaths <- ts_forecast(fit_ideaths, outcome = "ideaths", new_data = future_cases)
forecast_cdeaths <- ts_forecast(outcome = "cdeaths", .data = train_set, inc_forecast = forecast_ideaths)
# Format the forecasts for submission to the COVID Hub
submission <- list(
focustools::format_for_submission(forecast_icases, target_name = "inc case"),
focustools::format_for_submission(forecast_ideaths, target_name = "inc death"),
focustools::format_for_submission(forecast_cdeaths, target_name = "cum death")
) %>%
reduce(bind_rows) %>%
arrange(target)
# Load the formatted truth data
td_ic <- covidHubUtils::load_truth("JHU", target_variable = "inc case")
td_id <- covidHubUtils::load_truth("JHU", target_variable = "inc death")
td_cd <- covidHubUtils::load_truth("JHU", target_variable = "cum death")
truth_set <- rbind(td_ic, td_id, td_cd)
# Score the forecasts for the last four weeks.
f_scores <- wis(submission, truth_set)
colnames(f_scores)[ncol(f_scores)] <- "sigsci_score"
# Import the baseline model forecasts
base_forecast <- covidHubUtils::load_forecasts(models = "COVIDhub-baseline", forecast_dates = "2021-02-15", locations = "US", source = "zoltar")
# Compute WIS for the baseline model
bl_scores <- wis(base_forecast, truth_set)
colnames(bl_scores)[ncol(bl_scores)] <- "baseline_score"
# Merge the datasets and compute the relative WIS
rel_wis <- f_scores %>%
left_join(bl_scores) %>%
mutate(relative_wis = sigsci_score / baseline_score)
rel_wis
@chulmelowe i was able to run through your example code. by the end i'm seeing rel_wis
:
target_variable | target_end_date | target | location | sigsci_score | baseline_score | relative_wis |
---|---|---|---|---|---|---|
cum death | 2021-02-20 | 1 wk ahead cum death | US | 975.3676 | 5604.321 | 0.1740385 |
cum death | 2021-02-27 | 2 wk ahead cum death | US | 943.7010 | 11421.868 | 0.0826223 |
cum death | 2021-03-06 | 3 wk ahead cum death | US | 1140.2562 | 18458.353 | 0.0617745 |
cum death | 2021-03-13 | 4 wk ahead cum death | US | 1485.3530 | 27528.306 | 0.0539573 |
inc case | 2021-02-20 | 1 wk ahead inc case | US | 9529.0611 | 67639.508 | 0.1408801 |
inc case | 2021-02-27 | 2 wk ahead inc case | US | 17338.8861 | 52569.713 | 0.3298265 |
inc case | 2021-03-06 | 3 wk ahead inc case | US | 21317.8083 | 65661.062 | 0.3246644 |
inc case | 2021-03-13 | 4 wk ahead inc case | US | 23499.7167 | 82243.543 | 0.2857333 |
inc death | 2021-02-20 | 1 wk ahead inc death | US | 975.3676 | 6881.281 | 0.1417422 |
inc death | 2021-02-27 | 2 wk ahead inc death | US | 354.9254 | 5588.525 | 0.0635097 |
inc death | 2021-03-06 | 3 wk ahead inc death | US | 368.0562 | 6727.787 | 0.0547069 |
inc death | 2021-03-13 | 4 wk ahead inc death | US | 353.4968 | 8714.365 | 0.0405648 |
that relative WIS for the SigSci-TS is really low. i think there might be an issue with how the WIS is being calculated ...
to investigate a little further, i tried calculating the WIS using code based on the C19FH scoring script (https://github.com/reichlab/covid19-forecast-evals/blob/main/reports/score_forecasts_eyc_update.R). here's the comparison of the WIS scores:
c19fh_wis | sigsci_wis | horizon | temporal_resolution | target_variable | target_end_date |
---|---|---|---|---|---|
1434.1904 | 975.3676 | 1 | wk | cum death | 2021-02-20 |
21136.0143 | 9529.0611 | 1 | wk | inc case | 2021-02-20 |
1434.1904 | 975.3676 | 1 | wk | inc death | 2021-02-20 |
1770.9587 | 943.7010 | 2 | wk | cum death | 2021-02-27 |
39617.7071 | 17338.8861 | 2 | wk | inc case | 2021-02-27 |
756.9683 | 354.9254 | 2 | wk | inc death | 2021-02-27 |
2354.5570 | 1140.2562 | 3 | wk | cum death | 2021-03-06 |
51375.5071 | 21317.8083 | 3 | wk | inc case | 2021-03-06 |
770.0352 | 368.0562 | 3 | wk | inc death | 2021-03-06 |
3099.5065 | 1485.3530 | 4 | wk | cum death | 2021-03-13 |
59398.2714 | 23499.7167 | 4 | wk | inc case | 2021-03-13 |
754.0800 | 353.4968 | 4 | wk | inc death | 2021-03-13 |
all of the code to reproduce is available below. wrote that fairly quickly, you may want to double check the joins (you too @stephenturner). note that the original code is largely unchanged, but i did add filter(monday>"2020-03-09")
when prepping the data. we're doing that in the submission script now because there are gaps in some locations prior to that date. i also noticed that without that filter there were warnings generated with ARIMA modeling.
last things ... i did have to modify the C19FH code to remove an argument for "count_median_twice" (deprecated? removed altogether from scoringutils
?). i also changed the na.rm
behavior in the WIS calculation because some of the inc case scores were returned as NA
library(tidyverse)
library(focustools)
library(covidHubUtils)
library(fabletools)
library(fable)
# Read in data for forecasting
usac <- focustools::get_cases()
usad <- focustools::get_deaths()
usa <- dplyr::inner_join(usac, usad, by = c("location", "epiyear", "epiweek")) %>%
focustools::make_tsibble(chop = TRUE) %>%
filter(monday>"2020-03-09")
# Chop off the last four weeks to use as a scoring data set
dates <- sort(unique(usa$monday), decreasing = TRUE)
train_set <- usa %>%
filter(monday %in% dates[5:length(dates)])
# Fit models for incident cases and incident deaths
fit_icases <- train_set %>%
model(arima = ARIMA(icases, stepwise = F, approximation = F))
fit_ideaths <- train_set %>%
model(linear_caselag3 = TSLM(ideaths ~ lag(icases, 3)))
# Forecast incident cases, incident deaths, and cumulative deaths
forecast_icases <- ts_forecast(fit_icases, outcome = "icases", horizon = 4)
future_cases <- ts_futurecases(train_set, forecast_icases, horizon = 4)
forecast_ideaths <- ts_forecast(fit_ideaths, outcome = "ideaths", new_data = future_cases)
forecast_cdeaths <- ts_forecast(outcome = "cdeaths", .data = train_set, inc_forecast = forecast_ideaths)
# Format the forecasts for submission to the COVID Hub
submission <- list(
focustools::format_for_submission(forecast_icases, target_name = "inc case"),
focustools::format_for_submission(forecast_ideaths, target_name = "inc death"),
focustools::format_for_submission(forecast_cdeaths, target_name = "cum death")
) %>%
reduce(bind_rows) %>%
arrange(target)
# Load the formatted truth data
td_ic <- covidHubUtils::load_truth("JHU", target_variable = "inc case")
td_id <- covidHubUtils::load_truth("JHU", target_variable = "inc death")
td_cd <- covidHubUtils::load_truth("JHU", target_variable = "cum death")
truth_set <- rbind(td_ic, td_id, td_cd)
# Score the forecasts for the last four weeks.
f_scores <- wis(submission, truth_set)
colnames(f_scores)[ncol(f_scores)] <- "sigsci_score"
# Import the baseline model forecasts
base_forecast <- covidHubUtils::load_forecasts(models = "COVIDhub-baseline", forecast_dates = "2021-02-15", locations = "US", source = "zoltar")
# Compute WIS for the baseline model
bl_scores <- wis(base_forecast, truth_set)
colnames(bl_scores)[ncol(bl_scores)] <- "baseline_score"
# Merge the datasets and compute the relative WIS
rel_wis <- f_scores %>%
left_join(bl_scores) %>%
mutate(relative_wis = sigsci_score / baseline_score)
rel_wis
##################################################
library(scoringutils)
## now try WIS calculation per C19FH methods
## use same forecast / truth_set inputs
submission <-
submission %>%
## prep the submission object per the scoringutils specs
## grab the target name (i.e. "inc case")
dplyr::mutate(target_variable = gsub(pattern = "[[:digit:]] wk ahead ", replacement = "", x = target)) %>%
## add a column for model name
mutate(model = "SigSci-TS") %>%
## get horizon and "wk" separated out
separate(target, into = c("horizon", "temporal_resolution", "tmp1", "tmp2", "tmp3"), sep = " ") %>%
select(-tmp1,-tmp2, -tmp3)
## join the submisison to truth_set (defined above)
## clean up columns afterwards
# get dataframe into scoringutil format
joint_df <- dplyr::left_join(x = submission, y = truth_set,
by = c("location", "target_variable", "target_end_date")) %>%
dplyr::select(-c("model.y")) %>%
dplyr::rename(model = model.x, prediction = value.x, true_value = value.y) %>%
dplyr::filter(!is.na(true_value))
# score using scoringutil
observation_cols <- c(
"model",
"location",
"horizon", "temporal_resolution", "target_variable",
"forecast_date", "target_end_date"
)
# creates placeholder variables to store the name of the column from scoringutils::eval_forecasts() to
# take values from (`abs_var`) and the column name to rename as "abs_error" (`abs_var_rename`)
abs_var <- "aem"
abs_var_rename <- "aem_0"
c19fh_scores <- tibble::tibble(scoringutils::eval_forecasts(data = joint_df,
by = observation_cols,
summarise_by = c(observation_cols, "range"),
## the below interval_score_arguments should ensure that WIS is computed correctly
interval_score_arguments = list(weigh = TRUE))) %>%
tidyr::pivot_wider(id_cols = observation_cols,
names_from = c("range"),
values_from = c("coverage", "interval_score", abs_var, "sharpness", "overprediction", "underprediction")) %>%
purrr::set_names(~sub(abs_var_rename, "abs_error", .x)) %>%
## need to remove all columns ending with NA to not affect WIS calculations
dplyr::select(
-dplyr::ends_with("_NA")
) %>%
## before next lines: do we need to check to ensure interval_score columns exist?
## the following lines ensure that we use denominator for the wis of
## (# of interval_scores)-0.5
## which is written in the paper and elsewhere as
## (# of interval_scores at level >0 ) + 0.5 or (K + 1/2)
## to make sure that the median only gets half the weight of the other
## intervals, multiply its value by 0.5
dplyr::mutate(
n_interval_scores = rowSums(!is.na(dplyr::select(., dplyr::starts_with("interval_score")))),
exists_interval_score_0 = "interval_score_0" %in% names(.),
interval_score_0 = ifelse(exists_interval_score_0, 0.5 * interval_score_0, NA_real_),
sharpness_0 = ifelse(exists_interval_score_0, 0.5 * sharpness_0, NA_real_),
underprediction_0 = ifelse(exists_interval_score_0, 0.5 * underprediction_0, NA_real_),
overprediction_0 = ifelse(exists_interval_score_0, 0.5 * overprediction_0, NA_real_)) %>%
dplyr::mutate(
wis = rowSums(dplyr::select(., dplyr::starts_with("interval_score")), na.rm = TRUE)/(n_interval_scores-0.5*(exists_interval_score_0)),
) %>%
dplyr::mutate(
sharpness = rowSums(dplyr::select(., dplyr::starts_with("sharpness")), na.rm = FALSE)/(n_interval_scores-0.5*(exists_interval_score_0)),
overprediction = rowSums(dplyr::select(., dplyr::starts_with("overprediction")), na.rm = FALSE)/(n_interval_scores-0.5*(exists_interval_score_0)),
underprediction = rowSums(dplyr::select(., dplyr::starts_with("underprediction")), na.rm = FALSE)/(n_interval_scores-0.5*(exists_interval_score_0))
) %>%
dplyr::select(
-dplyr::starts_with("aem_"),
-dplyr::starts_with("ae_point_"),
-dplyr::starts_with("interval_score"),
-dplyr::starts_with("sharpness_"),
-dplyr::starts_with("underprediction_"),
-dplyr::starts_with("overprediction_")
)
## compare the wis computed with wis() vs c19fh wis ...
## first some clean up to f_scores above
f_scores_clean <-
f_scores %>%
mutate(target = as.character(target)) %>%
separate(target, into = c("horizon", "temporal_resolution", "tmp1", "tmp2", "tmp3"), sep = " ") %>%
select(sigsci_wis = sigsci_score, horizon, temporal_resolution, target_variable, target_end_date)
## now join to cleaned up c19fh scores and situate side-by-side
c19fh_scores %>%
select(c19fh_wis = wis, horizon, temporal_resolution, target_variable, target_end_date) %>%
left_join(f_scores_clean) %>%
select(c19fh_wis, sigsci_wis, everything())
@chulmelowe given that we're seeing some discrepancies in the our WIS versus the C19FH im not sure it makes sense to maintain our wis()
in the package.
i'm going to close this PR without merging ... but the branch will still be up in case we want to revisit wis()
later
@chulmelowe i just pushed a few changes to the branch. see commits enumerated in the thread above
these are pkg dev fixes.
the first one (7f6f7ff) creates the .Rd documentation and adds the function to the NAMESPACE. always run
devtools::document()
prior to pushingthe second one gets rid of the
usethis::use_pipe()
. you dont need that in there ... you only need to run that once in your workspace when you're developing the package to add the%>%
operator to the NAMESPACE (https://github.com/signaturescience/focustools/blob/master/NAMESPACE#L3). once it's there the pipe is available to any package functions.the last two were fixes per
devtools::check()
. make sure you check the package before pushing. that was the first thing i did when i pulled down your code. here's what i saw:lots of solutions to the "no visible binding" message in google:
https://community.rstudio.com/t/how-to-solve-no-visible-binding-for-global-variable-note/28887
https://community.rstudio.com/t/when-programming-with-dplyr-what-is-the-correct-way-to-avoid-undefined-global-variables/55946
you can check the commit history and look at focustools.R on this branch to see how i resolved. after fixed we're looking good with
devtools::check()
:so the package dev stuff looks good. but before we merge it would be helpful to see a reproducible example of how you actually use this function. not necessarily for the package documentation (yet) but at the very least dropped in a comment on this thread so i can understand better what this is supposed to do.
i have some stylistic code comments too but i'll save those for after @stephenturner takes a look.