epiforecasts / stackr

Model stacking for predictive ensembles
http://epiforecasts.io/stackr/
MIT License
crps ensembles forecasting stacking

stackr package


Overview

The stackr package provides an easy way to combine predictions from individual time series or panel data models into an ensemble. stackr stacks models according to the Continuous Ranked Probability Score (CRPS) over k-step-ahead predictions, which makes it especially suited to time series and panel data. A function for leave-one-out CRPS may be added in the future. Predictions must be predictive distributions represented by predictive samples. Usually, these will be sets of posterior predictive simulation draws generated by an MCMC algorithm.
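
For readers unfamiliar with the score, the standard definition of the CRPS for a predictive distribution $F$ and an observation $y$ is given below, together with the usual plug-in estimate from $S$ predictive samples $x_1, \dots, x_S$. This is the textbook form and is included only as background; the exact estimator used inside stackr is not stated here and may differ.

$$
\mathrm{CRPS}(F, y) = \int_{-\infty}^{\infty} \bigl(F(x) - \mathbf{1}\{x \ge y\}\bigr)^2 \, dx
= \mathbb{E}_F\lvert X - y\rvert - \tfrac{1}{2}\,\mathbb{E}_F\lvert X - X'\rvert
$$

$$
\widehat{\mathrm{CRPS}}(x_{1:S}, y) = \frac{1}{S}\sum_{s=1}^{S}\lvert x_s - y\rvert - \frac{1}{2S^2}\sum_{s=1}^{S}\sum_{t=1}^{S}\lvert x_s - x_t\rvert
$$

Here $X$ and $X'$ are independent draws from $F$. Lower values indicate better forecasts.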

Installation

Install using

devtools::install_github("epiforecasts/stackr")

CRPS Stacking

Given training data with observed values and predictive samples generated by different models, stackr finds the weights that are optimal, in the sense of minimizing expected cross-validation predictive error, for forming an ensemble of these models. Using these weights, stackr can then provide samples from the optimal model mixture by drawing from each model's predictive samples in proportion to its weight. The resulting mixture model is based solely on predictive samples and is in this regard superior to other ensembling techniques such as Bayesian Model Averaging. More information can be found in the package vignette.
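
To make the idea of CRPS-optimal weights concrete, here is a minimal base R sketch on simulated data. It is not stackr's implementation: the two models, the softmax parameterisation and the use of `optim` are all assumptions chosen for illustration. It relies on the fact that the CRPS of a mixture with weights w can be written as sum_k w_k E|X_k - y| - 0.5 * sum_{k,j} w_k w_j E|X_k - X_j'|.

```r
# Toy illustration in base R; an assumption-laden sketch, not stackr's code.
set.seed(1)
n_obs <- 50     # number of training observations
n_draws <- 200  # predictive samples per model and observation
y <- rnorm(n_obs)  # simulated "true" observations

# Two hypothetical models: one matching the data, one biased upwards.
# Rows are observations, columns are predictive samples.
samples <- list(
  good = matrix(rnorm(n_obs * n_draws, mean = 0), n_obs, n_draws),
  biased = matrix(rnorm(n_obs * n_draws, mean = 2), n_obs, n_draws)
)
K <- length(samples)

# A[k]: E|X_k - y|, averaged over observations
A <- vapply(samples, function(s) mean(abs(s - y)), numeric(1))

# B[k, j]: E|X_k - X_j'|, averaged over observations (plug-in estimate)
B <- matrix(0, K, K)
for (k in seq_len(K)) {
  for (j in seq_len(K)) {
    B[k, j] <- mean(vapply(seq_len(n_obs), function(i) {
      mean(abs(outer(samples[[k]][i, ], samples[[j]][i, ], "-")))
    }, numeric(1)))
  }
}

# Mean CRPS of the mixture with weights w
mean_crps <- function(w) sum(w * A) - 0.5 * drop(t(w) %*% B %*% w)

# Map unconstrained parameters to weights on the simplex (softmax)
to_weights <- function(theta) {
  z <- c(0, theta)
  e <- exp(z - max(z))
  e / sum(e)
}

fit <- optim(
  par = rep(0, K - 1),
  fn = function(theta) mean_crps(to_weights(theta)),
  method = "BFGS"
)
w_hat <- to_weights(fit$par)
names(w_hat) <- names(samples)
round(w_hat, 2)
```

In this simulation most of the weight should go to the model labelled `good`, since its predictive distribution matches the process that generated `y`.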

Weights are generated using the crps_weights function. With these weights and predictive samples, the mixture_from_samples function can be used to obtain predictive samples from the optimal mixture model.
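
The following base R sketch illustrates what drawing from a mixture "in proportion to the weights" can look like for a single forecast target. The model names, weights and sampling scheme are hypothetical; `mixture_from_samples` may work differently internally (for example, by taking fixed numbers of samples per model rather than sampling at random).

```r
# Toy illustration in base R; names and weights are assumptions.
set.seed(42)
n_draws <- 1000

# Hypothetical predictive samples from two models for one forecast target
model_samples <- list(
  model_a = rnorm(n_draws, mean = 10, sd = 1),
  model_b = rnorm(n_draws, mean = 14, sd = 2)
)

# Assumed stacking weights (must sum to 1)
weights <- c(model_a = 0.7, model_b = 0.3)

# For each mixture draw, pick a model with probability equal to its weight,
# then take one of that model's predictive samples at random.
picked <- sample(names(weights), size = n_draws, replace = TRUE, prob = weights)
mixture <- vapply(
  picked,
  function(m) sample(model_samples[[m]], 1),
  numeric(1),
  USE.NAMES = FALSE
)

# Share of mixture draws that came from model_a (close to its weight)
mean(picked == "model_a")
```

The resulting `mixture` vector can be treated like the predictive samples of any single model, for example when scoring it against observations.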

Usage

Load example data and split into train and test data

library("stackr")
library("data.table") # example_data is subset below with data.table syntax

splitdate <- as.Date("2020-03-28")
traindata <- example_data[date <= splitdate]
testdata <- example_data[date > splitdate]

Get weights and create mixture

weights <- crps_weights(traindata)
test_mixture <- mixture_from_samples(testdata, weights = weights)

Score predictions

library("scoringutils")

# combine the mixture predictions with the predictions from the individual models
score_df <- rbindlist(list(testdata, test_mixture), fill = TRUE)

# score all predictions using crps() from github.com/epiforecasts/scoringutils
score_df[, crps := crps(unique(observed), t(predicted)),
  by = .(geography, model, date)
]

# summarise scores
score_df[, .(CRPS = mean(crps)), by = model]

Contributors

All contributions to this project are gratefully acknowledged using the allcontributors package following the all-contributors specification. Contributions of any kind are welcome!

Code

nikosbosse, sbfnk, seabbs

Issues

jonathonmellor