cmu-delphi / exploration-tooling

tools for evaluating and exploring forecasters

Exploration Tooling

This repo is for exploring forecasting methods and tools for both COVID and Flu. It is structured as a targets project, which makes it easy to run steps in parallel and to cache results. It is also structured as an R package, which makes it easy to share code between different targets.

Usage

Define run parameters in your .Renviron file:

EPIDATR_USE_CACHE=true
# Choose a cache timeout for yourself. We want a long cache time, since we work with historical data.
EPIDATR_CACHE_MAX_AGE_DAYS=42
DEBUG_MODE=false
USE_SHINY=false
TAR_PROJECT=covid_hosp_explore
EXTERNAL_SCORES_PATH=legacy-exploration-scorecards.qs
AWS_S3_PREFIX=exploration
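
R loads .Renviron at startup, so these values can be read with Sys.getenv(). As a minimal sketch of how the boolean-style settings could be parsed (the helper env_flag is hypothetical and not part of this repo):

```r
# Hypothetical helper, not part of this repo: parse a "true"/"false"
# environment variable into an R logical, with a fallback default.
env_flag <- function(name, default = FALSE) {
  value <- Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    return(default)
  }
  # as.logical() understands "true"/"false"; anything else counts as FALSE.
  isTRUE(as.logical(value))
}

debug_mode <- env_flag("DEBUG_MODE")
use_shiny <- env_flag("USE_SHINY")
tar_project <- Sys.getenv("TAR_PROJECT")
```

Restart R after editing .Renviron, since the file is only read when a session starts.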

Run the pipeline using:

# Install renv and R dependencies
make install

# Pull pre-scored forecasts from the AWS bucket
make pull
# or
make download

# Run only the dashboard, to display results run on other machines
make dashboard

# Run the pipeline using the helper script `run.R`
make run
# or in the background
make run-nohup

# Push complete or partial results to the AWS bucket
make push
# or
make upload

Development

Directory Layout

Debugging

Running targets in parallel mode conflicts with debugging, because browser() statements are ignored in that mode. To debug a target named yourTarget:

  1. Set DEBUG_MODE=true in .Renviron.
  2. Insert a browser() call in the relevant function.
  3. Start an R session and call tar_make(yourTarget).
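
The last step can be sketched as below. This assumes the targets package is installed and the session was started in the project root. The wrapper debug_target is hypothetical. Passing callr_function = NULL is the targets option that runs the pipeline in the current R session instead of a background process, which is what allows browser() to take effect. Whether this repo already arranges that when DEBUG_MODE=true is an assumption, so it is passed explicitly here.

```r
# Hypothetical wrapper, not part of this repo.
debug_target <- function(target_name) {
  # Assumption: DEBUG_MODE=true was set in .Renviron and R was restarted.
  if (!isTRUE(as.logical(Sys.getenv("DEBUG_MODE", "false")))) {
    warning("DEBUG_MODE is not true; the pipeline may still run in parallel.")
  }
  # callr_function = NULL runs the pipeline in this session, so browser()
  # calls inside target functions are honored.
  targets::tar_make(
    names = tidyselect::any_of(target_name),
    callr_function = NULL
  )
}

# Usage, from an interactive session in the project root:
# debug_target("yourTarget")
```

Remember to remove the browser() call and reset DEBUG_MODE once you are done.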

Pipeline Design

See this diagram. Double diamond objects represent "plates" (to evoke plate notation, but don't take the comparison too literally), which are used to represent multiple objects of the same type (e.g. different forecasters).

Notes on Forecaster Types

Basic

The basic forecaster takes in an epi_df, pre-processes it, runs an epipredict workflow, and then post-processes the result.
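
The three stages can be illustrated with a small base R skeleton. This is only a sketch of the shape, not this repo's implementation: a plain data frame stands in for an epi_df, a last-value model stands in for the epipredict workflow, and the function name and column names (geo_value, time_value, value, target_date) are assumptions.

```r
# Hypothetical skeleton, not this repo's implementation.
basic_forecaster_sketch <- function(df, ahead = 7) {
  # Pre-processing: drop missing values and order by location and time.
  df <- df[!is.na(df$value), ]
  df <- df[order(df$geo_value, df$time_value), ]

  # Model step (placeholder for an epipredict workflow): carry forward
  # the most recent value for each location.
  last_rows <- df[!duplicated(df$geo_value, fromLast = TRUE), ]
  preds <- data.frame(
    geo_value = last_rows$geo_value,
    target_date = last_rows$time_value + ahead,
    value = last_rows$value
  )

  # Post-processing: clip predictions at zero.
  preds$value <- pmax(preds$value, 0)
  preds
}

toy <- data.frame(
  geo_value = c("ca", "ca", "ny", "ny"),
  time_value = as.Date("2023-01-01") + c(0, 7, 0, 7),
  value = c(10, 12, NA, -1)
)
preds <- basic_forecaster_sketch(toy)
```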

Ensemble

This kind of forecaster has two components: a list of existing forecasters it depends on, and a function that aggregates those forecasters.
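
As a minimal sketch of those two components in base R (the names make_ensemble and forecast_with, the column names, and the choice of the median as aggregator are all assumptions for illustration, not this repo's API):

```r
# Hypothetical sketch: an ensemble is a list of forecaster functions plus
# a function that aggregates their predictions.
make_ensemble <- function(forecasters, aggregate_fn = median) {
  function(df) {
    # Run every component forecaster and stack the results.
    all_preds <- do.call(rbind, lapply(forecasters, function(f) f(df)))
    # Combine predictions that share a location and target date.
    aggregate(
      value ~ geo_value + target_date,
      data = all_preds,
      FUN = aggregate_fn
    )
  }
}

# Toy component forecasters: summarize each location's history with
# summary_fn and use that as the prediction one week ahead.
forecast_with <- function(summary_fn) {
  function(df) {
    out <- aggregate(value ~ geo_value, data = df, FUN = summary_fn)
    out$target_date <- max(df$time_value) + 7
    out
  }
}

toy <- data.frame(
  geo_value = c("ca", "ca", "ca", "ny", "ny"),
  time_value = as.Date("2023-01-01") + c(0, 7, 14, 7, 14),
  value = c(10, 12, 20, 1, 3)
)

ensemble <- make_ensemble(
  list(forecast_with(mean), forecast_with(max), forecast_with(min))
)
ensemble_preds <- ensemble(toy)
```

Keeping the aggregation function as a parameter means the same component list can be reused with a mean, a median, or a weighted combination.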

(to be named)

This category covers any forecaster that requires a pre-trained component, for example a forecaster with a sophisticated imputation method. Evaluating these raises some thorny issues around splitting data into training and testing sets. This type may turn out to be foldable into the basic variety.