The forecast evaluation dashboard provides a robust set of tools for evaluating the performance of epidemic forecasts. The project's goal is to help epidemiological researchers gain insight into the performance of their forecasts and, ultimately, to enable more accurate epidemic forecasting.
This app collects and scores COVID-19 forecasts submitted to the CDC. The dashboard was developed by CMU Delphi in collaboration with the Reich Lab and US COVID-19 Forecast Hub from UMass-Amherst, as part of the Forecast Evaluation Research Collaborative.
The Reich Lab created and maintains the COVID-19 Forecast Hub, a collaborative effort with over 80 groups submitting forecasts to be part of the official CDC COVID-19 ensemble forecast. All Forecast Hub forecasters that are designated "primary" or "secondary" are scored and included in the dashboard.
The Delphi Group created and maintains COVIDcast, a platform for epidemiological surveillance data. COVIDcast provides the ground truth data against which forecasts are scored.
The public version of the dashboard runs off of the `main` branch. The version on the `dev` branch appears on the staging website. The username and password are included in the meeting notes doc and on Slack.
The dashboard is backed by the forecast evaluation pipeline. The pipeline runs three times a week, on Sunday, Monday, and Tuesday, using the code on the `dev` branch. It collects and scores forecasts from the Forecast Hub, and posts the resulting files to a publicly-accessible AWS S3 bucket.
See the "About" writeup for more information about the data and processing steps.
`main` is the production branch and shouldn't be directly modified. Pull requests should be based on and merged into `dev`. When enough changes have accumulated on `dev`, a release will be made to sync `main` with it.
This project requires a recent version of GNU `make` and Docker.
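Before building, it can be handy to confirm both tools are on your `PATH` — a minimal sketch:

```shell
# Check for the required tools; reports each one as found or missing.
check=$(for tool in make docker; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done)
echo "$check"
```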
The easiest way to view and develop this project locally is to run the Shiny app from RStudio. This is the same as running `shiny::runApp("<directory>")` in R. However, dashboard behavior can differ when running locally versus in a container (due to package versioning, packages that haven't been properly added to the container environment, etc.), so the dashboard should also be tested in a container.
The scoring pipeline uses a containerized R environment. See the `docker_build` directory for more details.
The pipeline can be run locally with the `Report/create_reports.R` script, or in a container via
> make score_forecast
See notes in the Makefile for workarounds if you don't have image repository access.
The dashboard can be run in a Docker container using
> make start_dashboard
See notes in the Makefile for workarounds if you don't have image repository access.
`main` is the production branch and contains the code that the public dashboard uses. Code changes accumulate on the `dev` branch, and when we want to make a release, `dev` is merged into `main` via the "Create Release" workflow. The version bump type (major, minor, etc.) is specified manually when running the action.
If there's some issue with the workflow-based release process, a release can be done manually with:

```sh
git checkout dev
git pull origin dev
git checkout -b release_v<major>.<minor>.<patch> origin/dev
```

Update the version number in the DESCRIPTION file and in the dashboard, then:

```sh
git add .
git commit -m "Version <major>.<minor>.<patch> updates"
git tag -a v<major>.<minor>.<patch> -m "Version <major>.<minor>.<patch>"
git push origin release_v<major>.<minor>.<patch>
git push origin v<major>.<minor>.<patch>
```
Create a PR into `main`. After the branch is merged to `main`, perform cleanup by merging `main` into `dev` so that `dev` stays up to date.
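A sketch of that cleanup merge. To keep the example self-contained it builds a throwaway repository first (the version strings are illustrative); against the real repository, only the final checkout-and-merge steps apply, followed by a push to `origin`:

```shell
# Demonstrate the post-release cleanup: merge main back into dev.
# A throwaway local repo stands in for the real remote here.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git checkout -qb main
git config user.email "dev@example.com"
git config user.name "Dev"
echo "Version: 1.0.0" > DESCRIPTION
git add DESCRIPTION
git commit -qm "Version 1.0.0 updates"   # state after a past release on main
git branch dev                            # dev was in sync at that release
echo "Version: 1.1.0" > DESCRIPTION
git commit -qam "Version 1.1.0 updates"  # a new release merged into main
# The cleanup: sync dev with main so the branches don't diverge
git checkout -q dev
git merge -q --no-edit main
cat DESCRIPTION                           # prints "Version: 1.1.0"
```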
The scoring pipeline runs in a Docker container built from `docker_build/Dockerfile`, which is a straight copy of the `covidcast-docker` image. The dashboard runs in a Docker container built from `devops/Dockerfile`.
When updates are made to the `evalcast` package, the behavior of the scoring script can be affected and the `covidcast` Docker image must be rebuilt. The workflow in the `covidcast-docker` repository that does this needs to be triggered manually. Before building the new image, ensure that the changes in `evalcast` are compatible with the scoring pipeline.
Currently, the scoring pipeline uses the `evalcast` package from the `evalcast` branch of the `covidcast` repository. However, if we need to make forecast-eval-specific changes to the `evalcast` package that would conflict with other use cases, we have in the past created a dedicated `forecast-eval` branch of `evalcast`.
This should only be performed if absolutely necessary.

To pin the dashboard image version:

1. Change the `forecasteval` line to point to the desired (most recently working) sha256 hash rather than the `latest` tag. The hashes can be found in the Delphi ghcr.io image repository -- these require special permissions to view. Ask Brian for permissions, ask Nat for hash info.
2. Create a PR into `main`. Tag Brian as reviewer and let him know over Slack. Changes will automatically propagate to production once merged.
3. Note that new images will no longer automatically propagate via the `latest` image to the public dashboard; the tag in the `ansible` settings file must be manually changed back to `latest`.

To pin the scoring pipeline image version:

1. Change the `FROM` line in the `docker_build` Dockerfile to point to the most recently working sha256 hash rather than the `latest` tag. The hashes can be found in the Delphi ghcr.io image repository -- these require special permissions to view. Ask Brian for permissions, ask Nat for hash info.
2. Create a PR into `dev`. Tag Katie or Nat as reviewer and let them know over Slack. Changes will automatically propagate to production once merged.
3. Note that after pinning the `covidcast` Docker image, changes will no longer automatically propagate via the `latest` `covidcast` image to the local pipeline image; the tag in `docker_build/Dockerfile` must be manually changed back to `latest`.
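As an illustration, pinning the scoring image in `docker_build/Dockerfile` amounts to swapping the floating tag for a digest on the `FROM` line. The image name and digest below are placeholders, not real values:

```dockerfile
# Before: floating tag, picks up every new build
# FROM ghcr.io/cmu-delphi/covidcast:latest

# After: pinned to a known-good build (replace <digest> with the
# hash from the Delphi ghcr.io image repository)
FROM ghcr.io/cmu-delphi/covidcast@sha256:<digest>
```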
- `.github`
  - `workflows` contains GitHub Actions workflow files:
    - `ci.yml` runs linting on branch merge. Also builds new Docker images and pushes them to the image repo for the `main` and `dev` branches.
    - `create_release.yml` is triggered manually to merge `dev` into `main`. Increments the app version number, creates a PR into `main`, and tags a reviewer (currently Katie).
    - `release_main.yml` runs on merge of a release branch. Creates a tagged release using `release-drafter.yml` and merges the updated `main` back into `dev` to keep them in sync.
    - `s3_upload_ec2.yml` runs the weekly self-hosted data pipeline workflow action (preceded by `s3_upload.yml`, which ran the pipeline on a GitHub-provided VM).
    - `release-drafter.yml` creates a release.
- `Report` contains the code for fetching, scoring, and uploading forecasts. Runs three times a week.
- `app` contains all the code for the Shiny dashboard:
  - `R` contains supporting R functions:
    - `data.R` defines data-fetching functions.
    - `data_manipulation.R` defines various filter functions.
    - `delphiLayout.R` defines the dashboard main and sub-UIs.
    - `exportScores.R` contains tools to support the score CSV download tool included in the dashboard.
  - `assets` contains supporting Markdown text. `about.md` contains the code for the "About" tab in the dashboard; the other `.md` files contain explanations of the scores and other text that appears in the app.
  - `www` contains CSS stylesheets and the logo images.
  - `ui.R` sets up the UI for the dashboard and defines starting values for selectors.
  - `server.R` defines dashboard behavior. This is where the logic for the dashboard lives.
  - `global.R` defines constants and helper functions.
- `docker_build` contains the Docker build configuration for the scoring pipeline.
- `devops` contains the Docker build configuration for the Shiny dashboard.
- `DESCRIPTION` summarizes package information, such as contributors, version, and dependencies.
- `Makefile` contains commands to build and run the dashboard, and to score and upload the data.