cmu-delphi / epiforecast-R

R package to implement and visualize several epidemiological forecasting methods.
GNU General Public License v2.0

Adaptive Weighting Function #8

Open yijunwang0805 opened 3 years ago

yijunwang0805 commented 3 years ago

Hi,

Thank you for your wonderful package!

In Brooks' 2018 paper "Nonmechanistic forecasts of seasonal influenza with iterative one-week-ahead distributions" and the 2020 work "Pancasting: forecasting epidemics from provisional data", the author mentions using an adaptive weighting/stacking approach to ensemble the forecast models.

I looked through the epiforecast 0.0.1 manual but could not find the function for adaptive weighting. Could you please shed some light on it?

Best regards, Yijun

brookslogan commented 3 years ago

natreg-forecasts.R provides an (involved) example of forming adaptively weighted ensembles like those in the paper. In natreg-config.R you can see references to a few types that are compared:

    e.ensemble.partial.weighting.scheme.wgt.indexer.lists = list(
      ## "constant-weights" = list(all=NULL, all=NULL, all=NULL),
      "target-based" = list(all=NULL, all=NULL, each=NULL),
      "target-3time-based" = list(smear=-1:1, all=NULL, each=NULL),
      "target-9time-based" = list(smear=-4:4, all=NULL, each=NULL)
    ) %>>% with_dimnamesnames("Ensemble weighting scheme")

The wgt part reflects the fact that the elements of these partial CV indexer lists refer, in order, to the week of the season (w), the "epigroup" (g; in this setup, the location), and the target being modeled (t). Here, target-based fits an ensemble using all weeks of the season from all locations, for each target separately. target-9time-based instead considers only data from up to 4 weeks away (smear=-4:4), from all locations (all=NULL), for each target separately (each=NULL); this is the RelevanceWeight scheme used in the paper. (In these scripts, weights are always fit separately for each metric; the each=NULL for that is added later.)
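To make the indexer semantics concrete, here is a minimal base-R sketch of how the three indexer types described above might pick training indices along a single dimension. It does not call cv_apply or reproduce its implementation: select_indices is a hypothetical helper, the dimension sizes are made up, and clipping the smear window at the ends of the season (rather than wrapping around) is an assumption.

```r
# Hypothetical helper (not part of epiforecast): which indices along one
# dimension are used when fitting weights for the instance at position `i`?
select_indices <- function(type, arg, i, n) {
  switch(type,
    all   = seq_len(n),             # pool every index in this dimension
    each  = i,                      # use only the matching index
    smear = {
      idx <- i + arg                # window of relative offsets around i
      idx[idx >= 1L & idx <= n]     # assumption: clip at the boundaries
    },
    stop("unknown indexer type: ", type)
  )
}

# Assumed toy dimensions: 30 weeks, 11 epigroups (locations), 4 targets.
dims <- c(week = 30L, epigroup = 11L, target = 4L)

# The "target-9time-based" scheme, written as in the config above.
scheme <- list(smear = -4:4, all = NULL, each = NULL)

# Training indices for the instance at week 2, epigroup 5, target 3.
at <- c(2L, 5L, 3L)
selected <- Map(select_indices, names(scheme), scheme, at, dims)
names(selected) <- names(dims)
str(selected)
```

Under these assumptions, the instance at week 2 draws on weeks 1 through 6 (the window is cut off at the start of the season), all 11 epigroups, and only target 3.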

As a warning, cv_apply is currently a bit rigid and awkward to use (and needs examples), so it may not be the best tool for experimenting with new weighting schemes. In particular, assigning a gradation of weights rather than just picking out certain instances to train on, as well as dealing with potentially missing training instances, will require extra coding effort.
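As a rough illustration of what a gradation of weights could look like, the base-R sketch below gives each training week a relevance weight that decays with its distance from the test week and assigns zero weight to missing instances. This is not something cv_apply provides: graded_week_weights, the triangular kernel, and the half_width parameter are all hypothetical choices made for the example.

```r
# Hypothetical sketch (not part of epiforecast): graded relevance weights over
# weeks, using a triangular kernel instead of a hard smear window.
graded_week_weights <- function(test_week, train_weeks, available, half_width = 4) {
  dist <- abs(train_weeks - test_week)
  w <- pmax(0, 1 - dist / (half_width + 1))  # 1 at the test week, 0 beyond the window
  w[!available] <- 0                         # missing training instances get no weight
  if (sum(w) == 0) stop("no usable training instances")
  w / sum(w)                                 # normalize so the weights sum to 1
}

weeks <- 1:10
available <- rep(TRUE, 10)
available[4] <- FALSE                        # pretend week 4 is missing
w <- graded_week_weights(test_week = 5, train_weeks = weeks, available = available)
round(w, 3)
```

Weights like these could then multiply each training instance's contribution when fitting the ensemble, in place of the all-or-nothing selection that smear performs.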