signaturescience / fiphde

Forecasting Influenza in Support of Public Health Decision Making
https://signaturescience.github.io/fiphde/
GNU General Public License v3.0

improved ILI forecasting #89

Closed vpnagraj closed 2 years ago

vpnagraj commented 2 years ago

the SigSci-CREG method uses count regression models to forecast flu hospitalizations. the approach first finds the best-fitting model from a list of candidate models, all of which currently include ILI as a predictor. the modeling step uses "observed" ILI (a mix of reported ILI via FluView, augmented with the ILI Nearby nowcast for the 2 most recent weeks). however, in order to forecast into the future we need predicted values of ILI 1-4 weeks ahead for each geographic unit.
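
to make the dependency concrete, here is a minimal sketch (not the actual SigSci-CREG code; data and column names are hypothetical) of a count regression of hospitalizations on ILI, where forecasting ahead requires forecasted ILI as input:

library(MASS)
# toy weekly data standing in for the real FluView / hospitalization data
dat <- data.frame(flu.admits = rpois(52, 20), weighted_ili = runif(52, 1, 5))
# sketch only: negative binomial count regression of admissions on observed ILI
fit <- MASS::glm.nb(flu.admits ~ weighted_ili, data = dat)
# forecasting 1-4 weeks ahead requires *predicted* ILI values as newdata
predict(fit, newdata = data.frame(weighted_ili = c(3.6, 3.6, 3.5, 3.4)),
        type = "response")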

given the importance of ILI as a predictor for SigSci-CREG, we need to make sure we optimize the ILI forecasting as best we can. the current method for forecasting ILI uses an ARIMA approach:

https://github.com/signaturescience/fiphde/blob/main/R/forecast.R#L106-L296
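
in fable terms, the shape of that approach is roughly the following (a sketch, assuming an ILI tsibble like the ilidat_us object built further down this thread; the PDQ/pdq constraints mirror the formulas printed later in this issue):

# sketch: constrained ARIMA search plus a 4 week forecast on an ILI tsibble
library(fabletools)
fit <- fabletools::model(ilidat_us,
                         arima = fable::ARIMA(weighted_ili ~ PDQ(0, 0, 0) + pdq(1:2, 0:2, 0)))
fabletools::forecast(fit, h = 4)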

note that up to now we have also operated under the assumption that it is better to fit the models on more recent data (because the dynamics of ILI activity are likely different during the COVID-19 pandemic):

https://github.com/signaturescience/fiphde/blob/main/submission/submission.R#L25-L29
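
in practice that just means filtering the training data before fitting, e.g. (sketch; assumes an ilidat tibble from get_cdc_ili(), which has a week_start column):

# sketch: train on post-2020 data only, per the assumption above
ilidat_recent <- dplyr::filter(ilidat, week_start >= as.Date("2021-01-01"))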

a few initial ideas to consider:

there are certainly lots of other ideas to consider. and because ILI was the target of previous FluSight challenges, there is plenty of literature on ILI forecasting methods.

one important thing to keep in mind is that we will need to think through how to evaluate what effect (if any) these alternate methods have on 1) ILI forecast performance and 2) downstream SigSci-CREG forecast performance. that evaluation will probably need to involve measuring performance across a range of time points (and geographic units) to see what is generally the most effective method.

stephenturner commented 2 years ago

I started looking at this, and the forecast_ili() function is extremely high on the hard-coded spectrum. You give it the ili data fetched as in the @examples and it fits models and forecasts ili in a pretty rigid manner, with ETS and ARIMA models nearly hardcoded.

A lot of the forecasting work for hospitalizations was split up into one function for getting data, another function for prepping data, another for forecasting, and another for formatting. Turns out, the same function used to more flexibly forecast hospitalization data can be used for forecasting ILI as well!

ilidat <- get_cdc_ili(region = c("national", "state", "hhs"),
                      years = 2010:lubridate::year(lubridate::today()))
ilidat_us <- 
  ilidat %>% 
  dplyr::filter(location=="US") %>% 
  replace_ili_nowcast(weeks_to_replace=1) %>% 
  make_tsibble(epiyear=epiyear, epiweek=epiweek)

ilifor <- ts_fit_forecast(ilidat_us, outcome="weighted_ili", covariates = NULL)
> str(ilifor, 2)
List of 4
 $ tsfit        : mdl_df [1 × 4] (S3: mdl_df/tbl_df/tbl/data.frame)
  ..- attr(*, ".drop")= logi TRUE
  ..- attr(*, "response")= chr "weighted_ili"
  ..- attr(*, "key")= tibble [1 × 2] (S3: tbl_df/tbl/data.frame)
  .. ..- attr(*, ".drop")= logi TRUE
  ..- attr(*, "model")= chr [1:3] "arima" "ets" "ensemble"
 $ tsfor        : fbl_ts [12 × 5] (S3: fbl_ts/tbl_ts/tbl_df/tbl/data.frame)
  ..- attr(*, "key")= tibble [3 × 3] (S3: tbl_df/tbl/data.frame)
  .. ..- attr(*, ".drop")= logi TRUE
  ..- attr(*, "index")= chr "yweek"
  .. ..- attr(*, "ordered")= logi TRUE
  ..- attr(*, "index2")= chr "yweek"
  ..- attr(*, "interval")= interval [1:1] 1W
  ..- attr(*, "response")= chr "weighted_ili"
  ..- attr(*, "dist")= chr "weighted_ili"
  ..- attr(*, "model_cn")= chr ".model"
 $ arima_formula:Class 'formula'  language weighted_ili ~ PDQ(0, 0, 0) + pdq(1:2, 0:2, 0)
  .. ..- attr(*, ".Environment")=<environment: 0x7fcb075a2a00> 
 $ ets_formula  :Class 'formula'  language weighted_ili ~ season(method = "N")
  .. ..- attr(*, ".Environment")=<environment: 0x7fcb075a2a00> 

The forecast is stored in the $tsfor object:

> ilifor$tsfor
# A fable: 12 x 5 [1W]
# Key:     location, .model [3]
   location .model      yweek  weighted_ili .mean
   <chr>    <chr>      <week>        <dist> <dbl>
 1 US       arima    2022 W03 N(3.7, 0.035)  3.65
 2 US       arima    2022 W04  N(3.7, 0.13)  3.72
 3 US       arima    2022 W05  N(3.8, 0.23)  3.80
 4 US       arima    2022 W06  N(3.8, 0.31)  3.83
 5 US       ets      2022 W03 N(3.6, 0.082)  3.60
 6 US       ets      2022 W04  N(3.4, 0.26)  3.41
 7 US       ets      2022 W05  N(3.2, 0.57)  3.22
 8 US       ets      2022 W06       N(3, 1)  3.03
 9 US       ensemble 2022 W03 N(3.6, 0.052)  3.63
10 US       ensemble 2022 W04  N(3.6, 0.17)  3.57
11 US       ensemble 2022 W05  N(3.5, 0.35)  3.51
12 US       ensemble 2022 W06  N(3.4, 0.57)  3.43

And it can even be pulled out into a formatted object with ts_format_for_submission() (ignore the target name column):

> ts_format_for_submission(ilifor$tsfor)$ensemble
# A tibble: 96 × 7
   forecast_date target                  target_end_date location type     quantile value
   <date>        <chr>                   <date>          <chr>    <chr>    <chr>    <chr>
 1 2022-01-18    1 wk ahead inc flu hosp 2022-01-22      US       point    NA       4    
 2 2022-01-18    2 wk ahead inc flu hosp 2022-01-29      US       point    NA       4    
 3 2022-01-18    3 wk ahead inc flu hosp 2022-02-05      US       point    NA       4    
 4 2022-01-18    4 wk ahead inc flu hosp 2022-02-12      US       point    NA       4    
 5 2022-01-18    1 wk ahead inc flu hosp 2022-01-22      US       quantile 0.010    4    
 6 2022-01-18    2 wk ahead inc flu hosp 2022-01-29      US       quantile 0.010    3    
 7 2022-01-18    3 wk ahead inc flu hosp 2022-02-05      US       quantile 0.010    3    
 8 2022-01-18    4 wk ahead inc flu hosp 2022-02-12      US       quantile 0.010    2    
 9 2022-01-18    1 wk ahead inc flu hosp 2022-01-22      US       quantile 0.025    4    
10 2022-01-18    2 wk ahead inc flu hosp 2022-01-29      US       quantile 0.025    3    
# … with 86 more rows

With a very minor tweak to plot_forecast() (flu.admits is hard-coded), the forecast can be visualized using the same function:

# requires a small mod to plot_forecast: dplyr::select(...point=weighted_ili)
plot_forecast(ilidat_us %>% dplyr::mutate(week_end=week_start+6), 
              ilifor$tsfor %>% ts_format_for_submission() %>% .$ensemble)

[image: ILI forecast plot]

All this is to say -- we've got slightly divergent methods for fitting time series forecasts:

  1. Hospitalizations: get_hdgov_hosp() %>% prep_hdgov_hosp() %>% ts_fit_forecast() %>% ts_format_for_submission()
  2. ILI: get_cdc_ili() %>% forecast_ili(), which rolls some of the prep, forecasting, and "formatting" into one function.

If one of the goals here is to:

that evaluation will probably need to involve measuring performance across a range of time points (and geographic units) to see what is generally the most effective method.

Then perhaps we might want to move ILI forecasting to method #1 above that we use for hospitalizations, and make the ts_fit_forecast() function more flexible (e.g. to allow for a neural net model with fable::NNETAR()). An advantage of this is that we can use infrastructure that we've kind of already got for computing WIS over an entire time series and different geographic regions, so that we're not just looking at the plots above and saying "eh, looks right."
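
For reference, WIS can be computed directly from the quantile forecasts we already generate. A minimal sketch (my own, not package code), using the standard approximation of WIS as twice the mean pinball loss across the submitted quantile levels:

# sketch (not package code): WIS approximated as 2x the mean pinball loss
# over quantile levels; y is the observed value for that week/location
wis <- function(y, q_levels, q_values) {
  pinball <- ifelse(y < q_values,
                    (1 - q_levels) * (q_values - y),
                    q_levels * (y - q_values))
  2 * mean(pinball)
}
wis(y = 3.1,
    q_levels = c(0.025, 0.25, 0.5, 0.75, 0.975),
    q_values = c(2.0, 2.8, 3.5, 4.2, 5.0))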

chulmelowe commented 2 years ago

I think Stephen has a good point. As I'm starting to look at fitting and assessing different models for ILI, having everything work with a common framework would be helpful. I can start looking at how to do this on Friday, unless one of you has already started on this effort.

vpnagraj commented 2 years ago

agree with @stephenturner that it would make sense to get a "one-size-fits-all" ts_fit_forecast() for different methods. actually probably wouldn't take too much engineering (at least for the types of models in fable)

but @chulmelowe rather than starting with that engineering bit ... it might make sense to start exploring what we can do with the fiphde functions as-is. for example, the forecast_ili() function does include parameters for the (S)ARIMA space and a "trim date" for training data:

https://github.com/signaturescience/fiphde/blob/main/R/forecast.R#L204

pretty sure it would be possible to use that to start exploring the first two ideas above.

another thought is that it would be very useful to actually see an example of fable::NNETAR() run on our ILI data ... even if it's outside of the package framework for now.

but yeah ultimately if these look promising we can adapt package functions so we have a friendlier API to do the evaluation at scale.

stephenturner commented 2 years ago

From the Kandula paper:

A simple neural network (McCulloch and Pitts, 1943; Rosenblatt, 1962; Rumelhart et al., 1985; Werbos, 1974) that connects input nodes (explanatory variables) to a single output node (response) can mimic linear regression, with the regression coefficients given by the weights of the connectors. When a layer of nodes is added between the input and output layers, the regression becomes two-step as the network can extract linear combinations of the inputs as derived features, and through the use of an appropriate activation function (such as the sigmoid) can model the response as a nonlinear function of these features. As the intermediate layers are unobserved they are commonly referred to as hidden layers (Hyndman and Athanasopoulos, 2014; Hastie et al., 2009). Additional intermediate layers would allow the network to model more complex non-linear functions, but would also make finding a solution computationally more expensive. By using lagged values of the response as the inputs, the neural network can be implemented as an autoregressive model. To model a non-linear autoregressive function, as is required here, a neural network with a single hidden layer will suffice. Such a network is specified by three parameters - p, the number of lagged inputs; k, the number of nodes in the hidden layer; and for seasonal data, P, the number of previous seasons to consider - and can be denoted by NN(p, P, k). For example, a specification of NN(6, 2, 3), denoting values of p, P and k respectively, for a monthly time series with annual seasonality implies that the model has 3 nodes in the hidden layer and uses observations of the previous 6 months and of the same month in the previous 2 years as input.

As in the above two methods, we used the forecast package's implementation of an autoregressive neural network with the following parameters: P = 1; p is chosen so as to minimize AIC; and k = (p + P + 1)/2. To avoid overfitting through assignment of excessive weights on some of the connectors a decay parameter of 0.5 was used.

@chulmelowe

https://github.com/signaturescience/fiphde/blob/89978499e1fddd5ff78b1730ad94921b39374502/scratch/ilipredimprov.R#L18-L19
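
For a quick out-of-package experiment, that configuration might look something like this with forecast::nnetar() (untested sketch; assumes the ilidat_us object built earlier in this thread, and relies on decay being passed through to nnet::nnet()):

# sketch: Kandula-style autoregressive neural net. P = 1 seasonal lag,
# p chosen automatically by AIC of a linear AR(p) model, hidden layer size
# left at its default (roughly half the input nodes plus one), decay = 0.5
library(forecast)
ili_ts <- ts(ilidat_us$weighted_ili, frequency = 52)
fit <- nnetar(ili_ts, P = 1, decay = 0.5)
forecast(fit, h = 4)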

stephenturner commented 2 years ago

Hey @chulmelowe just checking in here, seeing if you've made any headway on improving the ILI forecasts and/or evaluating those improvements?

chulmelowe commented 2 years ago

I am making some headway, yes, and I'm planning to work on this more today, so watch this space.

chulmelowe commented 2 years ago

@vpnagraj, what is the locations object in https://github.com/signaturescience/fiphde/blob/main/R/forecast.R#L229? I can't find where that object is made or what it contains.

stephenturner commented 2 years ago

That's the internal data object created in the data-raw directory. You can access it interactively with fiphde:::locations, or better yet, run devtools::load_all() in the package directory (Cmd-Shift-L in RStudio on a Mac) and it'll be available as just locations without the namespace.

stephenturner commented 2 years ago

Should also add that you don't want to put a package namespace into code in the package itself. If you reference this as fiphde:::locations somewhere in one of the package's functions, rcmdcheck will throw a warning.

stephenturner commented 2 years ago

See https://r-pkgs.org/data.html#data-sysdata
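
i.e., locations lives in R/sysdata.rda, generated from a script in data-raw/. The save step presumably looks something like this (an assumption about this repo's setup, not verified):

# in data-raw/ (sketch): save `locations` as internal package data
usethis::use_data(locations, internal = TRUE, overwrite = TRUE)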

chulmelowe commented 2 years ago

It isn't referenced with the package namespace anywhere in the package code. I was working through the clean up steps for the state data in the forecast_ili function so I could duplicate them for the NNETAR data, and got stuck because the scratch code I was writing couldn't figure out what locations was. I'm still learning package development, so this is good information to have. Thanks.

stephenturner commented 2 years ago

Hey @chulmelowe I saw a few commits on your ilipred branch a couple days back. Anything interesting to share here w.r.t. improvements to the ili forecast?

chulmelowe commented 2 years ago

It's on my to do list to pick it back up as soon as I can.

stephenturner commented 2 years ago

Thanks @chulmelowe. I made a couple changes just now on the forecast-api branch. I'm pretty sure everything in the submission directory will work, but not opening a PR just yet.

f1d3ba0: makes the ts_fit_forecast() function more flexible. You can specify which models you want (defaults to ARIMA + ETS; optionally you can get a neural net with nnetar).

6b77993: makes the plot_forecast function more flexible. Default .outcome is flu.admits. You can change this to weighted_ili if you want to plot an ILI forecast. Demo.

ilidat <- get_cdc_ili(region = c("national", "state", "hhs"),
                      years = 2017:lubridate::year(lubridate::today()))
ilidat_us <- 
  ilidat %>% 
  dplyr::filter(location %in% c("US", "12", "48", "51")) %>% 
  replace_ili_nowcast(weeks_to_replace=1) %>% 
  state_replace_ili_nowcast_all(state="FL") %>% 
  dplyr::mutate(week_end=week_start+6) %>% 
  make_tsibble(epiyear=epiyear, epiweek=epiweek)

ilifor <- ts_fit_forecast(ilidat_us, 
                          outcome="weighted_ili", 
                          covariates = NULL, 
                          constrained = FALSE,
                          trim_date = NULL,
                          horizon = 4,
                          models = c("arima", "ets"))

ilifor_formatted <- ts_format_for_submission(ilifor$tsfor)

plot_forecast(ilidat_us, ilifor_formatted$arima, .outcome="weighted_ili")
plot_forecast(ilidat_us %>% dplyr::filter(epiyear>=2021), ilifor_formatted$arima, .outcome="weighted_ili") 

[image: ARIMA ILI forecast plot, full history]

[image: ARIMA ILI forecast plot, 2021 onward]

Some to-dos here

chulmelowe commented 2 years ago

I will work on adapting the code I wrote for FOCUS to look at the accuracy of historical predictions (a bit of an oxymoron there). I'm working on this between other tasks, so my progress is in fits and starts. I appreciate all the help.

chulmelowe commented 2 years ago

@stephenturner, the output from ts_format_for_submission still shows the target as hospitalizations when the target is given as weighted ILI.

vpnagraj commented 2 years ago

library(fiphde)

ilidat <- get_cdc_ili(region = c("national", "state", "hhs"),
                      years = 2017:lubridate::year(lubridate::today()))
ilidat_us <- 
  ilidat %>% 
  dplyr::filter(location %in% c("US", "12", "48", "51")) %>% 
  replace_ili_nowcast(weeks_to_replace=1) %>% 
  state_replace_ili_nowcast_all(state="FL") %>% 
  dplyr::mutate(week_end=week_start+6) %>% 
  make_tsibble(epiyear=epiyear, epiweek=epiweek)

ilifor <- ts_fit_forecast(ilidat_us, 
                          outcome="weighted_ili", 
                          covariates = NULL, 
                          constrained = FALSE,
                          trim_date = NULL,
                          horizon = 4,
                          models = c("arima", "ets"))

ilifor_formatted <- ts_format_for_submission(ilifor$tsfor)
ilifor_formatted$arima %>%
  head() %>%
  knitr::kable()
|forecast_date |target                  |target_end_date |location |type     |quantile |value |
|:-------------|:-----------------------|:---------------|:--------|:--------|:--------|:-----|
|2022-02-07    |1 wk ahead inc flu hosp |2022-02-12      |12       |point    |NA       |3     |
|2022-02-07    |2 wk ahead inc flu hosp |2022-02-19      |12       |point    |NA       |3     |
|2022-02-07    |3 wk ahead inc flu hosp |2022-02-26      |12       |point    |NA       |3     |
|2022-02-07    |4 wk ahead inc flu hosp |2022-03-05      |12       |point    |NA       |3     |
|2022-02-07    |1 wk ahead inc flu hosp |2022-02-12      |12       |quantile |0.010    |2     |
|2022-02-07    |2 wk ahead inc flu hosp |2022-02-19      |12       |quantile |0.010    |2     |

the name in the target column shouldn't matter. as long as it has the horizons in it we should be fine. not submitting these anywhere for now. we can add the plumbing to conditionally change names or some kind of case_when() later by editing this part of the function:

https://github.com/signaturescience/fiphde/blob/forecast-api/R/submit.R#L139

a much more significant thing we need to fix is that the formatting function as-is rounds to whole numbers (because it was originally written for counts of hospitalizations):

https://github.com/signaturescience/fiphde/blob/forecast-api/R/submit.R#L150

ILI is a percentage. that rounding is likely to really throw off any kind of evaluation we do.
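
something along these lines could work (sketch only; final naming tbd):

# sketch: ceiling-round count targets (hospital admissions) but leave
# percentage targets (ILI) untouched
format_value <- function(value, counts = TRUE) {
  if (counts) ceiling(value) else value
}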

stephenturner commented 2 years ago

These are both easy fixes.

stephenturner commented 2 years ago

Keeping it as character is actually important IIRC. I believe a2c8d58e26f7a3af33df9b49f0a8d32db2abd6c6 fixes both issues. You can now specify a .target (default is wk ahead inc flu hosp, so existing code shouldn't break), and a .counts argument (default TRUE to maintain existing behavior). If .counts=FALSE it won't round up. @vpnagraj, take a quick peek at a2c8d58e26f7a3af33df9b49f0a8d32db2abd6c6 just to make sure this looks good to you.

Demo.

library(fiphde)

ilidat <- get_cdc_ili(region = c("national", "state", "hhs"),
                      years = 2017:lubridate::year(lubridate::today()))
#> Latest week_start / year / epiweek available:
#> 2022-01-23 / 2022 / 4
ilidat_us <- 
  ilidat %>% 
  dplyr::filter(location %in% c("US", "12", "48", "51")) %>% 
  replace_ili_nowcast(weeks_to_replace=1) %>% 
  state_replace_ili_nowcast_all(state="FL") %>% 
  dplyr::mutate(week_end=week_start+6) %>% 
  make_tsibble(epiyear=epiyear, epiweek=epiweek)
#> Replacing weighted_ili with nowcast weighted_ili on dates: 2022-01-31, 2022-01-24

ilifor <- ts_fit_forecast(ilidat_us, 
                          outcome="weighted_ili", 
                          covariates = NULL, 
                          constrained = TRUE,
                          horizon = 4,
                          models = c("arima", "ets"))
#> Trimming to 2021-01-01
#> ARIMA  formula: weighted_ili ~ PDQ(0, 0, 0) + pdq(1:2, 0:2, 0)
#> ETS    formula: weighted_ili ~ season(method = "N")

ts_format_for_submission(ilifor$tsfor)$arima %>% 
  head() %>%
  knitr::kable()
|forecast_date |target                  |target_end_date |location |type     |quantile |value |
|:-------------|:-----------------------|:---------------|:--------|:--------|:--------|:-----|
|2022-02-07    |1 wk ahead inc flu hosp |2022-02-12      |12       |point    |NA       |3     |
|2022-02-07    |2 wk ahead inc flu hosp |2022-02-19      |12       |point    |NA       |3     |
|2022-02-07    |3 wk ahead inc flu hosp |2022-02-26      |12       |point    |NA       |3     |
|2022-02-07    |4 wk ahead inc flu hosp |2022-03-05      |12       |point    |NA       |3     |
|2022-02-07    |1 wk ahead inc flu hosp |2022-02-12      |12       |quantile |0.010    |3     |
|2022-02-07    |2 wk ahead inc flu hosp |2022-02-19      |12       |quantile |0.010    |3     |
ts_format_for_submission(ilifor$tsfor, .counts=FALSE, .target="wk ahead weighted ili")$arima %>% 
  head() %>%
  knitr::kable()
|forecast_date |target                  |target_end_date |location |type     |quantile |value            |
|:-------------|:-----------------------|:---------------|:--------|:--------|:--------|:----------------|
|2022-02-07    |1 wk ahead weighted ili |2022-02-12      |12       |point    |NA       |2.32747341535207 |
|2022-02-07    |2 wk ahead weighted ili |2022-02-19      |12       |point    |NA       |2.32747341535207 |
|2022-02-07    |3 wk ahead weighted ili |2022-02-26      |12       |point    |NA       |2.32747341535207 |
|2022-02-07    |4 wk ahead weighted ili |2022-03-05      |12       |point    |NA       |2.32747341535207 |
|2022-02-07    |1 wk ahead weighted ili |2022-02-12      |12       |quantile |0.010    |2.32747341535207 |
|2022-02-07    |2 wk ahead weighted ili |2022-02-19      |12       |quantile |0.010    |2.32747341535207 |
vpnagraj commented 2 years ago

@stephenturner can you double check your edits? all the values are coming out exactly the same after this change.

vpnagraj commented 2 years ago

my hunch is something is haywire in the vectorization here:

https://github.com/signaturescience/fiphde/blob/forecast-api/R/submit.R#L153

stephenturner commented 2 years ago

Oh jeez. I honestly don't know why that doesn't work. I believe I fixed this in f171236. Also, I moved those mutates up above the split so we aren't mapping those mutates over the list of tibbles, and we're mutating the combined tibble directly before the split.

library(fiphde)

ilidat <- get_cdc_ili(region = c("national", "state", "hhs"),
                      years = 2017:lubridate::year(lubridate::today()))
#> Latest week_start / year / epiweek available:
#> 2022-01-23 / 2022 / 4
ilidat_us <- 
  ilidat %>% 
  dplyr::filter(location %in% c("US", "12", "48", "51")) %>% 
  replace_ili_nowcast(weeks_to_replace=1) %>% 
  state_replace_ili_nowcast_all(state="FL") %>% 
  dplyr::mutate(week_end=week_start+6) %>% 
  make_tsibble(epiyear=epiyear, epiweek=epiweek)
#> Replacing weighted_ili with nowcast weighted_ili on dates: 2022-01-31, 2022-01-24

ilifor <- ts_fit_forecast(ilidat_us, 
                          outcome="weighted_ili", 
                          covariates = NULL, 
                          constrained = TRUE,
                          horizon = 4,
                          models = c("arima", "ets"))
#> Trimming to 2021-01-01
#> ARIMA  formula: weighted_ili ~ PDQ(0, 0, 0) + pdq(1:2, 0:2, 0)
#> ETS    formula: weighted_ili ~ season(method = "N")

ts_format_for_submission(ilifor$tsfor)$arima %>% 
  head() %>%
  knitr::kable()
|forecast_date |target                  |target_end_date |location |type     |quantile |value |
|:-------------|:-----------------------|:---------------|:--------|:--------|:--------|:-----|
|2022-02-07    |1 wk ahead inc flu hosp |2022-02-12      |12       |point    |NA       |3     |
|2022-02-07    |2 wk ahead inc flu hosp |2022-02-19      |12       |point    |NA       |3     |
|2022-02-07    |3 wk ahead inc flu hosp |2022-02-26      |12       |point    |NA       |3     |
|2022-02-07    |4 wk ahead inc flu hosp |2022-03-05      |12       |point    |NA       |3     |
|2022-02-07    |1 wk ahead inc flu hosp |2022-02-12      |12       |quantile |0.010    |2     |
|2022-02-07    |2 wk ahead inc flu hosp |2022-02-19      |12       |quantile |0.010    |1     |
ts_format_for_submission(ilifor$tsfor, .counts=FALSE, .target="wk ahead weighted ili")$arima %>% 
  head() %>%
  knitr::kable()
|forecast_date |target                  |target_end_date |location |type     |quantile |value             |
|:-------------|:-----------------------|:---------------|:--------|:--------|:--------|:-----------------|
|2022-02-07    |1 wk ahead weighted ili |2022-02-12      |12       |point    |NA       |2.32747341535207  |
|2022-02-07    |2 wk ahead weighted ili |2022-02-19      |12       |point    |NA       |2.1946860728647   |
|2022-02-07    |3 wk ahead weighted ili |2022-02-26      |12       |point    |NA       |2.28923576457636  |
|2022-02-07    |4 wk ahead weighted ili |2022-03-05      |12       |point    |NA       |2.41699298635913  |
|2022-02-07    |1 wk ahead weighted ili |2022-02-12      |12       |quantile |0.010    |1.46756237093079  |
|2022-02-07    |2 wk ahead weighted ili |2022-02-19      |12       |quantile |0.010    |0.431724096232226 |
vpnagraj commented 2 years ago

nice. thanks for fixing that.

related: now i'm wondering if that "bound at zero" bit at the very end is even doing anything?!

https://github.com/signaturescience/fiphde/blob/forecast-api/R/submit.R#L165

move it up ahead of the split() with the other mutate() calls?

chulmelowe commented 2 years ago

I think I've got working code for computing the counterfactuals and assessing the WIS. I'm running it to test it now. Might not have results for the meeting today, but should have them soon. At the moment I'm only looking at the ARIMA, ETS, ensemble, and NNETAR models, but the modeling is being done through the ts_fit_forecast function, so we can easily play with parameter spaces to vary these models.

stephenturner commented 2 years ago

nice. thanks for fixing that.

related: now i'm wondering if that "bound at zero" bit at the very end is even doing anything?!

https://github.com/signaturescience/fiphde/blob/forecast-api/R/submit.R#L165

move it up ahead of the split() with the other mutate() calls?

Good call on that one. I think it worked, but for clarity I moved it up above the split in 87732ac
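
In other words, roughly this shape (a simplified sketch with a hypothetical submission_tbl object, not the verbatim submit.R code):

# simplified sketch: do all value-level mutates on the combined tibble,
# then split into one tibble per model, instead of mapping over the list
library(dplyr)
formatted <- submission_tbl %>%
  dplyr::mutate(value = pmax(value, 0)) %>%       # bound at zero
  dplyr::mutate(value = as.character(value)) %>%  # keep value as character
  split(.$.model)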

chulmelowe commented 2 years ago

I've been doing some work on the testing matrix this afternoon, and a thought occurred to me regarding the data sets we want to look at. The two data sets we've discussed so far (2010 to the Present and 2020 to the Present) both include the COVID-19 pandemic. I think it might also be interesting to look at the models' comparative performance on a data set that doesn't include the pandemic, 2009 to 2019, for example. Do you think this would be worth examining?

stephenturner commented 2 years ago

I think the answer is "yes", but I'm not sure how. We don't yet know the full picture of the 21-22 flu season, and we won't know what the 22-23 and 23-24 seasons will look like, but I imagine they'll look much more like the pre-COVID seasons than they'll ever look like the 20-21 season.

The easy thing to do would be to compare model performance, e.g., from 2009-present or 2009-2019. I imagine the latter will have better retrospective performance, because a model trained on 2009-X is going to have garbage performance on the 20-21 season.

If you limit a TS model to 2020-present, you're not going to learn anything about historical seasonality patterns because the last two years have been so atypical.

If you assume that 21-22 will look more like pre-2020 seasons, I'd almost imagine that the best performance might come from a model that was trained on 200x-2019 data, skipping 20-21. I don't know. I'm also not sure mechanically how to implement this.

vpnagraj commented 2 years ago

yeah @chulmelowe i think that is a good thought.

agreed with @stephenturner that there will be some nuance here ... especially in how we evaluate the trained models (i.e., do we evaluate a 2009-2019 trained model on all weeks from, say, 2020-21? or, picking another range out of a hat, 2018-2021?)

but that ^ to me is an evaluation question that we'll need to tackle either way.

i think the most complete course of action would be to include models trained on just pre-pandemic ILI (2009-2019) as a row (or rows, depending on whether we want to test this for multiple approaches) in the test matrix.

chulmelowe commented 2 years ago

After our meeting yesterday, I gave some more thought to what data we should evaluate the models' performance over. From a pragmatic standpoint, what we're interested in is model performance on this season's data, at least for now. (I still think there's an outstanding question about whether this season will be "normal" and, if it isn't, whether models that work well this season will still perform well in a normal season, but we can possibly address that later.) Therefore, at least for the initial analysis, I would like to propose we use data from 10/4/2021 through 2/7/2022. That will give us 15 weeks over which to forecast results.

I also think it would be wise to extract the data sets we're going to use and save them somewhere to make sure the results are easily reproducible. I don't know how much week-to-week revision happens to previous ILI data, but unless we're certain there's none, we should curate a dataset for testing. I know Git isn't necessarily a great place to host data sets, so if anyone has another idea of where I could put the curated testing data, I'm all ears.

vpnagraj commented 2 years ago

@chulmelowe that mostly makes sense to me.

one important thing to consider is that the 10/04/2021 - 02/07/2022 range won't include any spring dates.

i do see what you're saying with evaluating using only data from this season.

but in my opinion, since we'll practically be using these models through 05/2022, i think we should evaluate ILI models / forecasts on all of the weeks of the season ... even if that evaluation uses data from a previous season.

two possible ways of incorporating all season weeks in the evaluation:

a) 02/07/2021 - 02/07/2022
b) 02/07/2021 - 05/16/2021, and 10/04/2021 - 02/07/2022 (effectively masking non-flu-season summer weeks)

i would lean towards option a since the CDC actually does report ILI during those "non season" weeks (https://gis.cdc.gov/grasp/fluview/fluportaldashboard.html)
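
in code terms option a would just be (sketch; week_start as returned by get_cdc_ili()):

# sketch: option a evaluation window, 02/07/2021 - 02/07/2022
eval_dat <- dplyr::filter(ilidat,
                          week_start >= as.Date("2021-02-07"),
                          week_start <= as.Date("2022-02-07"))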

as to the question about storing data for this analysis: it should be MBs (maybe even KBs) of data, which will be fine in git. that said, i don't think this repository is the best home for all of this analysis / intermediate data, etc. i'll set up another repo for us to use.

stephenturner commented 2 years ago

see discussion in #104 and #103 and signaturescience/iliforceval#1

chulmelowe commented 2 years ago

@vpnagraj that's a good point, I hadn't thought about the need to include the spring months in the evaluation. I'd also vote for option a. I feel like that's a more naturalistic model for the data we're trying to predict. It might be a bit wonky because of COVID, but I think it'll work.

stephenturner commented 2 years ago

What's the resolution here wrt https://github.com/signaturescience/iliforceval/issues/1? Based on the last data I saw, are we closing this and continuing to use the same ILI forecasts we've been using?

I might want to split this off into another issue, either here or on fiphde-auto/fct. I think it'd be useful to save a plot of the ILI forecasts in the submission artifacts, if not to show on the FCT somewhere.

vpnagraj commented 2 years ago

@stephenturner heads up, i'm tracking that ILI forecast visualization feature for the FCT already. can show you how that's handled in the app on another thread ...

stephenturner commented 2 years ago

I added some code in https://github.com/signaturescience/fiphde-auto/commit/23a604673e35534d0232b1e1de42692266b6c855 to look at how our ILI forecasts looked since we've been saving those artifacts. Plots show each week's forecasts against the truth data as pulled today.

US: ili-forc-us.pdf


States: ili-forc-states.pdf


The national ILI forecasts are actually pretty good! States are a mixed bag. Aside from a few notable egregious errors (KS early, NJ early, NM early, NY, OR, OK, TX early), most aren't too bad! I think it's pretty clear, at least from these plots, that there are very few cases of the ILI forecasts severely under-forecasting ILI, so that can't be the explanation for low hospitalization forecasts from the CREG model.

stephenturner commented 2 years ago

Closing this for now, I don't think there's much to be done here.