signaturescience / fiphde

Forecasting Influenza in Support of Public Health Decision Making
https://signaturescience.github.io/fiphde/
GNU General Public License v3.0

forecast ili data #7

Closed stephenturner closed 2 years ago

stephenturner commented 2 years ago

The functions implemented in #3 bring in ILI data from ILINET. Note that this data always lags behind the most recent weeks. We'll need to forecast ILI at least a few weeks ahead if we want to use it as a predictor in any kind of GLM for hospitalization.

See some of the focustools utils and forecasting functions.
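To make the reporting lag above concrete, here is a minimal base R sketch of how many weeks of ILI would need forecasting. The latest `week_start` is the one reported by `get_cdc_ili()` further down this thread; the "as of" date and the number of weeks ahead are assumptions chosen purely for illustration.

```r
# Latest ILI week_start, as printed by get_cdc_ili() later in this thread.
latest_week_start <- as.Date("2021-12-05")

# ASSUMPTION: hypothetical date on which the forecast is produced.
as_of <- as.Date("2021-12-20")

# ASSUMPTION: hypothetical number of future weeks the hospitalization GLM needs.
weeks_ahead <- 4

# Whole weeks between the last reported ILI week and the "as of" date.
lag_days  <- as.numeric(difftime(as_of, latest_week_start, units = "days"))
lag_weeks <- lag_days %/% 7

# ILI has to be forecast across the reporting gap plus the target weeks.
horizon <- lag_weeks + weeks_ahead

cat("lag in weeks:", lag_weeks, "\n")
cat("ILI forecast horizon:", horizon, "\n")
```

With these assumed dates the gap is 2 weeks, so the ILI forecast would need to cover 6 weeks in total.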

vpnagraj commented 2 years ago

heads up i started working on this over at fluforce-init before i realized there was a scratch dir in this repo.

here's where i am so far:

https://github.com/signaturescience/fluforce-init/blob/main/fiphde.R

can use auto ARIMA to model => forecast the ILI variable, then pass that in a list of models to use in the glm framework.
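A rough sketch of that idea in base R, not the code in fiphde.R. Assumptions: a fixed-order ARIMA(1,1,0) from `stats::arima` stands in for the auto ARIMA, the hospitalization counts are simulated placeholders, and the Poisson family is just one plausible choice. The ILI values are the ten `weighted_ili` rows shown in the `make_tsibble()` output later in this thread.

```r
# Weekly national weighted ILI (%), 2019 W40 to W49, copied from the
# make_tsibble() output in this thread.
ili <- c(1.49, 1.59, 1.73, 1.83, 2.04, 2.39, 2.63, 2.90, 3.42, 3.26)

# ASSUMPTION: placeholder hospitalization counts, simulated for illustration
# only. These are NOT real data.
set.seed(1)
hosp <- rpois(length(ili), lambda = exp(4 + 0.5 * ili))
train <- data.frame(hosp = hosp, ili = ili)

# Step 1: model ILI. A fixed ARIMA(1,1,0) is used here in place of automatic
# order selection, to keep the sketch free of extra packages.
ili_fit <- arima(ili, order = c(1, 1, 0), method = "ML")

# Step 2: forecast ILI a few weeks ahead.
horizon <- 4
ili_fc <- as.numeric(predict(ili_fit, n.ahead = horizon)$pred)

# Step 3: fit a GLM of hospitalizations on observed ILI.
hosp_fit <- glm(hosp ~ ili, family = poisson(), data = train)

# Step 4: use the forecasted ILI as the predictor for the future weeks.
hosp_fc <- predict(hosp_fit, newdata = data.frame(ili = ili_fc),
                   type = "response")

print(data.frame(week_ahead = seq_len(horizon), ili_fc = ili_fc,
                 hosp_fc = hosp_fc))
```

Note that this propagates only the point forecast of ILI; uncertainty in the ILI forecast is ignored in the hospitalization prediction.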

code is very thinly documented. some values are hardcoded in there. a lot more work to do. but wanted you to be aware so we don't duplicate effort.

stephenturner commented 2 years ago

I added some of the utils from focustools in #8, updated with more flexibility. make_tsibble used to require hard-coded epiyear and epiweek variable names; it now takes those column names as arguments.

https://github.com/signaturescience/fiphde/blob/0e4635e204da0c8a9c75f0d386ceaceb6ef64091/R/utils.R#L16-L22

You no longer need to mutate in the epiyear/week (https://github.com/signaturescience/fluforce-init/blob/f8b985c0c6eb5b96bcf32f9f7a568e0265ec50c6/fiphde.R#L17-L18) because the cdcfluview query brings those in (as year and week):

> library(fiphde)
> ilidat <- get_cdc_ili(region="national", years=2019:2021)
Latest week_start / year / epiweek available:
2021-12-05 / 2021 / 49
> ilidat %>% make_tsibble(epiyear=year, epiweek=week, chop=FALSE)
# A tsibble: 115 x 15 [1W]
# Key:       location [1]
   location region_type abbreviation region  year  week monday        yweek week_start weighted_ili unweighted_ili ilitotal
   <chr>    <chr>       <chr>        <chr>  <int> <int> <date>       <week> <date>            <dbl>          <dbl>    <dbl>
 1 US       National    US           US      2019    40 2019-09-30 2019 W40 2019-09-29         1.49           1.50    21916
 2 US       National    US           US      2019    41 2019-10-07 2019 W41 2019-10-06         1.59           1.60    22954
 3 US       National    US           US      2019    42 2019-10-14 2019 W42 2019-10-13         1.73           1.74    24886
 4 US       National    US           US      2019    43 2019-10-21 2019 W43 2019-10-20         1.83           1.86    27419
 5 US       National    US           US      2019    44 2019-10-28 2019 W44 2019-10-27         2.04           2.01    28910
 6 US       National    US           US      2019    45 2019-11-04 2019 W45 2019-11-03         2.39           2.36    34586
 7 US       National    US           US      2019    46 2019-11-11 2019 W46 2019-11-10         2.63           2.64    37732
 8 US       National    US           US      2019    47 2019-11-18 2019 W47 2019-11-17         2.90           2.97    44161
 9 US       National    US           US      2019    48 2019-11-25 2019 W48 2019-11-24         3.42           3.45    44044
10 US       National    US           US      2019    49 2019-12-02 2019 W49 2019-12-01         3.26           3.31    48904
# … with 105 more rows, and 3 more variables: num_of_providers <dbl>, total_patients <dbl>, population <dbl>

stephenturner commented 2 years ago

Running through that code, I hit some of the "weird this works need fix in focustools" bits. Pete, you and I should chat about whether we want to try to fix focustools and import that package, or (my preference, I think) copy over the functionality we need and fix it here.