eco4cast / unconf-2023

Brainstorming repo to propose and discuss unconference project ideas!

Interpretable forecast uncertainty #9

Open noamross opened 1 year ago

noamross commented 1 year ago

I am interested in discussing and working on ways to quantify and attribute sources of uncertainty in forecasts and ways to communicate these components. There are a wide variety of both model-agnostic and model-specific methods to break down which variables influence a prediction and how. I'd like to have a similar toolbox to unpack why forecast intervals are narrow or wide or why expected performance varies. Consider these non-exclusive statements one might make to explain a range of forecast values:

1) "We have good predictions of a typical outcome but individual outcomes may vary greatly." Prediction intervals much greater than confidence intervals. 2) "Historically these conditions have produced a wide variety of outcomes" - Uncertainty that can measured or attributed to the data, such as high uncertainty predicted in location-scale models. 3) "We are uncertain forecasting that far out." - Uncertainty due to long forecast horizons (autocorrelative models, in a way another location-scale type) 4) "These conditions are outside of our experience" - Uncertainty due to extrapolation outside the training data. Distance Indices or Area of Applicability (paper, R implementations) measures provide one way to match current conditions to training data. 5) "These conditions are rare so we have less certainty in the outcome." Data may be within the historical bounds but in a sparse range. A high-dimensional density estimate (like those produced with vine copulas), might be good for quantifying how much similar training data was available. 6) "We are uncertain because of uncertainty in a data (or other input) source." Training or prediction data may be uncertain itself and propagation of that uncertainty can be calculated by analytic or monte carlo techniques. 7) "These conditions are challenging for our models." - Structural model uncertainty, such as conditions not accounted for in model selection. Limitations in parametric or mechanistic models. This may not be expressed in model predictions by may be detected or quantified by poor performance metrics, possibly within certain ranges of data.

Can we quantitatively or qualitatively break down which statements would be most relevant for a particular forecast? Can we automatically generate explanation values along with forecasts? It would be interesting to discuss, review, or work on ways to convey the dominant uncertainty sources. Are there model-agnostic methods, analogous to prediction attribution, that one might use for an uncertainty breakdown? How might one present a breakdown or highlight the underlying conditions driving uncertainty? One need not necessarily have a quantitative estimate of how these elements compose uncertainty. For instance, where would it make sense to show the prediction data point alongside an Area of Applicability measure to convey whether the forecast should be trusted? Or historical model performance within a similar range of data (e.g., "Predictions under similar conditions were correct 70% of the time")?
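As one hedged illustration of the distance-to-training-data and conditional-performance ideas, here is a deliberately crude stand-in in base R: a nearest-neighbor distance in scaled predictor space as a "how far outside our experience" index, and a historical hit rate computed only over similar past conditions. The real Dissimilarity Index / Area of Applicability methods (e.g. in the CAST R package) are more principled; everything below, including the `correct` column, is simulated purely for illustration.

```r
# Crude stand-in for a Dissimilarity Index plus conditional historical skill.
set.seed(1)

train  <- data.frame(temp = rnorm(500, 15, 3), precip = rgamma(500, 2, 0.1))
new_pt <- data.frame(temp = 22, precip = 2)   # the forecast conditions

# Scale predictors by the training data, then take the distance to the
# nearest training point as a simple "how unusual are these conditions" index.
mu      <- sapply(train, mean)
sdv     <- sapply(train, sd)
z_train <- scale(train, center = mu, scale = sdv)
z_new   <- (unlist(new_pt) - mu) / sdv
di <- min(sqrt(rowSums((z_train - matrix(z_new, nrow(z_train), 2, byrow = TRUE))^2)))
di   # large values suggest extrapolation beyond the training data

# Conditional historical skill: "Predictions under similar conditions were
# correct X% of the time." `correct` is a hypothetical logical column of
# past forecast hits/misses, simulated here.
past <- data.frame(
  temp    = train$temp,
  correct = runif(500) < plogis(2 - 0.2 * abs(train$temp - 15))
)
similar <- abs(past$temp - new_pt$temp) < 2
mean(past$correct[similar])
```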

Some angles to consider:

Some possible activities or outcomes:

Some things it would help to know:

noamross commented 1 year ago

Slightly related to #6 and the uncertainty section on p171 of the NASA Biological Diversity and Ecological Forecasting state of knowledge report.

mdietze commented 1 year ago

The EFI Theory working group is definitely thinking about this issue (e.g., Lewis et al 2023. “The power of forecasts to advance ecological theory” Methods in Ecology and Evolution 14(3):746-756 http://doi.org/10.1111/2041-210X.13955 ) and there's also a larger literature out there on uncertainty quantification. My own perspective (Dietze, M. 2017. Prediction in ecology: a first-principles framework. Ecological Applications 27: 2048–2060. http://doi.org/10.1002/eap.1589) has definitely had an influence on both the NASA report and EFI (e.g. in addition to the Theory group, it also shows up in the EFI output standards).

In terms of the "how", I initially approached this using simple one-at-a-time approaches to uncertainty partitioning, but have been exploring more "global" approaches more recently (e.g. Sobol-style designs).

As for the question "What are some data sets with (multiple) already-implemented forecasts that we could use to explore this topic?" that one's easy -- the EFI-RCN NEON forecast challenge https://ecoforecast.org/efi-rcn-forecast-challenges/

ChrisJones687 commented 1 year ago

This is a great topic. @EliHorner has been working on using Sobol methods for our pathogen forecast for Sudden Oak Death. We would be happy to contribute to this topic. Ideally, we could do a vignette on how to do Sobol for one of the forecast challenges and another on doing it for a spatially explicit model.

noamross commented 1 year ago

@ChrisJones687 I'd be happy to work on this and return to SOD, which was my dissertation topic!

robbinscalebj commented 1 year ago

In line with Mike's comment, the Theory group has automated some machine learning forecasts across the NEON Challenges (using and extending the forecast template referenced elsewhere by @cboettig; https://github.com/eco4cast/neon4cast-example; Issue #18) that don't have much uncertainty propagation beyond running trained models on NOAA ensembles. I think there's an opportunity to try out a few routes for incorporating more uncertainty sources. Not sure how much I'm adding to the above discussion, but within the framework of our existing forecasts it could be an easy extension to add better/more diverse types of uncertainty quantification and then have those forecasts to analyze for the questions above.
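For context, a minimal sketch of the one propagation route these forecasts already have, driver uncertainty via NOAA ensemble members, might look like the following. `get_noaa_member()`, `fitted_model()`, and the column name are hypothetical placeholders rather than real neon4cast functions; the point is just the shape of the loop that additional uncertainty sources could be layered into.

```r
# Driver-uncertainty-only propagation: push each NOAA ensemble member
# through a trained model. All names and values are placeholders.
horizon   <- 35
n_members <- 31

# Pretend each NOAA member is a data.frame of drivers over the horizon
get_noaa_member <- function(i) {
  data.frame(air_temperature = 15 + cumsum(rnorm(horizon, sd = 0.3)) + i * 0.05)
}
fitted_model <- function(drivers) 2 + 0.6 * drivers$air_temperature  # stand-in

# One predicted trajectory per driver member (matrix: horizon x members)
preds <- sapply(seq_len(n_members), function(i) fitted_model(get_noaa_member(i)))

# Summarize the driver-only predictive distribution at each lead time
apply(preds, 1, quantile, probs = c(0.05, 0.5, 0.95))
# Parameter, process, or initial-condition uncertainty could be added by
# also resampling those quantities inside the loop.
```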

noamross commented 1 year ago

@mdietze One area we might be able to discuss is standard schema/formats for forecast outputs that allow uncertainty partitioning along the lines of your 2017 paper.

rqthomas commented 1 year ago

@noamross This is included in the EFI standards described in our paper accepted in Ecosphere. The pre-print is here: https://doi.org/10.32942/osf.io/9dgtq.

noamross commented 1 year ago

@rqthomas Thanks! I'd be keen to work through an example of how to represent Sobol-type or one-at-a-time uncertainty breakdowns in this format.
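As a starting point for that discussion, here is a purely illustrative sketch (base R) of what a partitioned forecast could look like in a long, EFI-standards-like table. The core columns are only my approximation of the standard's ensemble representation, and the `uncertainty_source` column is an assumption I'm adding for the sake of the example, not something the published standard defines; exact naming would need to be worked out against the standard itself.

```r
# Hypothetical long-format table: a forecast ensemble repeated once per
# uncertainty source that was "switched on" when it was generated.
breakdown <- expand.grid(
  parameter          = 1:50,                                   # ensemble member id
  uncertainty_source = c("driver", "parameter", "process", "total"),  # hypothetical column
  datetime           = as.Date("2023-06-01") + 1:7,
  site_id            = "BART",
  variable           = "temperature",
  stringsAsFactors   = FALSE
)
breakdown$reference_datetime <- as.Date("2023-06-01")

# prediction would come from re-running the forecast with only the named
# source of uncertainty active (or all of them, for "total"); random here.
breakdown$prediction <- rnorm(nrow(breakdown))
head(breakdown)
```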

cboettig commented 1 year ago

@robbinscalebj Cool! It would be fun to think more about uncertainty in ML model examples too (though that may diverge from the task of making uncertainty 'interpretable'). It may be interesting to look at the NOAA models as an analogy, actually -- my understanding is that those are essentially deterministic process-based simulations, so all uncertainty in the ensembles comes only from perturbing initial conditions; something similar is perhaps plausible with the ML techniques you mention.

Also @mlap will be at the unconf, and had a recent paper looking at some ML forecasting methods which are already designed to provide probabilistic predictions, and can be enticed to give quite broad uncertainty predictions. These rely on using likelihood functions as the targets, I think -- e.g. see the example notebook and this list of ML models which support probabilistic prediction. (These examples all use the darts framework in Python, which we've found remarkably user-friendly.)

noamross commented 1 year ago

We've got a number of people interested in this topic and a bunch of potential projects/outputs! Let me summarize some of the ideas we could work on next week:

Given the interest, there's probably more than one working group to form! I do think it will be useful to spend some initial time mapping out different parts of the problem space all together.