awslabs / gluonts

Probabilistic time series modeling in Python
https://ts.gluon.ai
Apache License 2.0

More flexible Forecast types #289

Closed lostella closed 3 years ago

lostella commented 5 years ago

Models can either produce point forecasts or probabilistic forecasts. While the former has an obvious representation in terms of arrays, the latter can be implemented in many ways.

Currently, probabilistic forecasts are implemented through samples in the SampleForecast class. However, some models will probably want to output parametric distributions (with no sampling involved).

Therefore we could have:

AbstractForecast
|
+--- PointForecast
|
+--- ProbabilisticForecast

PointForecast will just wrap an array of the appropriate size, a timestamp and a frequency. ProbabilisticForecast will just wrap a Distribution, a timestamp (the point in time where the prediction starts) and a frequency.
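To make the proposal concrete, here is a minimal sketch of the hierarchy. It is only an illustration under assumptions: the field names (`start`, `freq`, `values`, `distribution`) and the `mean`/`quantile` methods are hypothetical, timestamps are plain strings, and the standard library's `statistics.NormalDist` stands in for a real Distribution class (one object per time step).

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from statistics import NormalDist
from typing import List, Sequence


@dataclass
class AbstractForecast(ABC):
    start: str  # timestamp of the first predicted step (plain string here)
    freq: str   # frequency label, e.g. "D" for daily

    @abstractmethod
    def mean(self) -> List[float]:
        ...


@dataclass
class PointForecast(AbstractForecast):
    values: Sequence[float]  # one value per predicted time step

    def mean(self) -> List[float]:
        return list(self.values)


@dataclass
class ProbabilisticForecast(AbstractForecast):
    # Stand-in for a Distribution: one NormalDist per predicted time step.
    distribution: Sequence[NormalDist]

    def mean(self) -> List[float]:
        return [d.mean for d in self.distribution]

    def quantile(self, q: float) -> List[float]:
        # Computed in closed form, no sampling involved.
        return [d.inv_cdf(q) for d in self.distribution]
```

Both forecast types share the timestamp and frequency, and only differ in what they wrap.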

I believe that structuring different forecasts like this is rather natural, and avoids growing redundant, parallel hierarchies (forecasts vs. distributions).

FAQ

What are the benefits of this?

Sampling can be expensive, especially if many samples are needed (e.g. to accurately estimate the tails of the distribution). If a model produces a parametric distribution as its forecast, forcing it to give samples instead can limit its applicability.
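As a rough illustration of the tail-estimation point, the sketch below compares a closed-form 0.99 quantile of a standard normal with estimates obtained from samples. It uses `statistics.NormalDist` from the standard library as a stand-in for a parametric forecast distribution; `empirical_quantile` is a hypothetical helper, not part of any library.

```python
from statistics import NormalDist


def empirical_quantile(samples, q):
    """Simple order-statistic estimate of the q-quantile of a sample."""
    ordered = sorted(samples)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]


dist = NormalDist(mu=0.0, sigma=1.0)

# Closed form: a single function evaluation, no sampling.
exact = dist.inv_cdf(0.99)

# Sample-based: the estimate depends on how many samples are drawn.
few = empirical_quantile(dist.samples(20, seed=0), 0.99)
many = empirical_quantile(dist.samples(20_000, seed=0), 0.99)

print(f"exact={exact:.3f} few={few:.3f} many={many:.3f}")
```

With only a handful of samples the estimate is simply the largest draw, so getting close to the closed-form value requires many more samples.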

What are the downsides to this?

The Distribution classes handle mxnet arrays, while some predictors and other components work directly on numpy arrays. For example, the SampleForecast class handles both. We probably need to take this into account when coming up with the design.

Where does the current SampleForecast type fit in this hierarchy?

A sample population is just an empirical distribution, for which one can compute the usual suspects: mean, standard deviation, quantiles, and more. We can think of fitting samples into the Distribution hierarchy with an EmpiricalDistribution class: what is currently SampleForecast will then be a ProbabilisticForecast wrapping an EmpiricalDistribution.
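A minimal sketch of what such an EmpiricalDistribution could look like, assuming samples are stored as a list of sample paths (`samples[i][t]` is path `i` at time step `t`). The method names and the order-statistic quantile rule are assumptions for illustration, and plain Python lists are used instead of mxnet or numpy arrays.

```python
from statistics import fmean, stdev
from typing import List, Sequence


class EmpiricalDistribution:
    """Distribution defined by a finite set of sample paths."""

    def __init__(self, samples: Sequence[Sequence[float]]):
        self.samples = [list(path) for path in samples]
        # Regroup by time step and sort once, so quantiles are cheap lookups.
        self._by_step = [sorted(column) for column in zip(*self.samples)]

    @property
    def mean(self) -> List[float]:
        return [fmean(column) for column in self._by_step]

    @property
    def stddev(self) -> List[float]:
        # Sample standard deviation per time step (needs at least 2 paths).
        return [stdev(column) for column in self._by_step]

    def quantile(self, q: float) -> List[float]:
        # Order-statistic estimate, one value per time step.
        result = []
        for column in self._by_step:
            index = min(int(q * len(column)), len(column) - 1)
            result.append(column[index])
        return result


paths = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
empirical = EmpiricalDistribution(paths)
print(empirical.mean, empirical.quantile(0.5))
```

A ProbabilisticForecast would then only need to hold this object together with the start timestamp and the frequency.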

lostella commented 5 years ago

This was partially addressed in #316. @vafl do you think it's realistic to eventually have just one type of probabilistic forecast based on distributions? That is, get rid of QuantileForecast and SampleForecast in favor of having corresponding Distribution types, and only relying on DistributionForecast?

vafl commented 5 years ago

Yes, why not. Some of the methods such as log_prob will not be available for these kinds of distributions, but that's not a big deal.
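One way this could be handled, sketched under assumptions: the base class raises `NotImplementedError` for `log_prob`, and only distributions with a density override it. The class names below are hypothetical stand-ins, not the actual Distribution classes.

```python
import math
from statistics import NormalDist


class Distribution:
    def log_prob(self, x: float) -> float:
        # Default: no density available for this kind of distribution.
        raise NotImplementedError(
            f"{type(self).__name__} does not support log_prob"
        )


class Gaussian(Distribution):
    def __init__(self, mu: float, sigma: float):
        self._dist = NormalDist(mu, sigma)

    def log_prob(self, x: float) -> float:
        return math.log(self._dist.pdf(x))


class Empirical(Distribution):
    # Defined by samples only, so log_prob is inherited and raises.
    def __init__(self, samples):
        self.samples = list(samples)
```

Callers that need `log_prob` would then fail loudly on sample-based or quantile-based distributions, while mean and quantiles stay available everywhere.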