Closed stephenturner closed 3 years ago
Taking a look at this reveals lots of inflexibility throughout. This is going to reopen at least a few issues. Starting with #16, #6, #3 cc @vpnagraj
fair enough. stating the obvious, but make sure any edits go in a new branch so that we keep main clean for next weeks submission
i have the common "location" column in get_*
data functions now. seeing a couple of other places i need to make changes for to actually use that in the ts pipeline. will be pushing changes to https://github.com/signaturescience/focustools/tree/state-level-ts
so the "key" concept in fable
is 🔮
pull the state-level-ts
branch and try this:
## get data at the national scale from jhu source
# usac <- get_cases(source="jhu", granularity = "national")
# usad <- get_deaths(source="jhu", granularity = "national")
## get data at the state scale from jhu source
usac <- get_cases(source="jhu", granularity = "state")
usad <- get_deaths(source="jhu", granularity = "state")
## use the focustools helper to prep the tsibble format
usa <-
dplyr::inner_join(usac, usad, by = c("epiyear", "epiweek", "location")) %>%
make_tsibble(chop=TRUE)
fit.icases <- usa %>% model(arima = ARIMA(icases, stepwise=FALSE, approximation=FALSE))
fit.ideaths <- usa %>% model(linear_caselag3 = TSLM(ideaths ~ lag(icases, 3)))
## generate incident case forecast
icases_forecast <- ts_forecast(fit.icases, outcome = "icases", horizon = 4)
icases_forecast
## need to get future cases to pass to ideaths forecast
future_cases <- ts_futurecases(usa, icases_forecast, horizon = 4)
ideaths_forecast <- ts_forecast(fit.ideaths, outcome = "ideaths", new_data = future_cases)
ideaths_forecast
cdeaths_forecast <- ts_forecast(outcome = "cdeaths", .data = usa, inc_forecast = ideaths_forecast)
cdeaths_forecast
submission <-
list(format_for_submission(icases_forecast, target_name = "inc case"),
format_for_submission(ideaths_forecast, target_name = "inc death"),
format_for_submission(cdeaths_forecast, target_name = "cum death")) %>%
reduce(bind_rows) %>%
arrange(target)
submission
forecast_date | target | target_end_date | location | type | quantile | value |
---|---|---|---|---|---|---|
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Alabama | point | NA | 6346 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Alaska | point | NA | 319 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | American Samoa | point | NA | 0 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Arizona | point | NA | 11139 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Arkansas | point | NA | 4503 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | California | point | NA | 31579 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Colorado | point | NA | 5365 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Connecticut | point | NA | 6803 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Delaware | point | NA | 1154 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | District of Columbia | point | NA | 1396 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Florida | point | NA | 23745 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Georgia | point | NA | 12389 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Guam | point | NA | 420 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Hawaii | point | NA | 640 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Idaho | point | NA | 1788 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Illinois | point | NA | 19533 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Indiana | point | NA | 9044 |
2021-01-18 | 1 wk ahead cum death | 2021-01-23 | Iowa | point | NA | 4463 |
^ same exact API as our national forecast. this just works with state data now. thanks to a little elbow grease in some of the internals of our functions of course. haven't tried it with counties (i don't think we should ... ) but theoretically it would work for that scale too
still more to do there in terms of prepping the submission format. i think the "location" column needs to be the state codes instead of names? and i'm not sure which territories are supposed to be included. but this puts us pretty far down the path.
This is awesome. Running it for the first time now. After a little impromptu hacking to set the date to monday and the negatives to zero (#31) the thing actually validates!
library(tidyverse)
library(focustools)
# Run the forecast pipeline. See ?forecast_pipeline
myforecast <- forecast_pipeline(granularity="state", force=TRUE)
sub <- myforecast$submission
sub <-
sub %>%
mutate(value=ifelse(value<0, 0L, value)) %>%
mutate(forecast_date=this_monday())
fn <- myforecast$submission_filename
# Look at the submission and the suggested filename.
sub
fn
# Write submission to file
readr::write_csv(sub, file=fn)
# Validate submission
validate_forecast(fn)
* validating quantile_csv_file '/Users/sturner/sigsci/irad/focustools/submission/SigSci-TS/2021-01-20-SigSci-TS.csv'...
$valid
[1] TRUE
$message
[1] "no errors"
Looks like fable threw a few warnings. We should probably look at these sometime. The first one is from the pipeline script. A rank-deficient fit sounds like an insult.
Warning messages:
1: In forecast_pipeline(granularity = "state", force = TRUE) :
Forcing forecast on a non-Monday. This may not validate. Proceeding...
2: Problem with `mutate()` input `linear_caselag3`.
ℹ prediction from a rank-deficient fit may be misleading
ℹ Input `linear_caselag3` is `(function (object, ...) ...`.
3: prediction from a rank-deficient fit may be misleading
48 is TX, 51 is VA
are we good to close this issue out? The plot function - I think it's fine for a sanity check as originally intended in #21 but if we were to release or start to use elsewhere (eg #22) we might do some things like joining against location data to translate the fips to text, and allow the user to pass arguments or somehow adjust for scale. It makes sense to have free y scales for the targets, but would it make sense to have the same scale for each target as defined by the max on that particular scale for one particular location? I don't even know how to do this in ggplot2, may not be worth it.
@stephenturner fyi i realized when i merged the latest PR (#37) i forgot to restore the state-level-ts branch. it's back now for you to commit the hilo stuff:
https://github.com/signaturescience/focustools/tree/state-level-ts
Add specificity to this memo later. For now, some high level thoughts:
group_by
?