Probabilistic Forecasts for Gaussian Processes

Hello everyone :)

I'm currently working on a project where I need to generate probabilistic forecasts for my time series data using the darts library. Specifically, I'm comparing the performance of several models, including GaussianProcessRegressor from scikit-learn, and I need to obtain estimates for quantiles from these models as we need really conservative estimates, so we'd like to consider the 1%-percentile instead of the expected value of the model.

However, I am facing challenges with generating probabilistic forecasts using GaussianProcessRegressor within the Darts framework. While scikit-learn's GaussianProcessRegressor has the return_std parameter in its predict method to obtain standard deviations, this functionality isn't directly accessible when using it via Darts' RegressionModel class.

What I Have Tried So Far

Direct Use of RegressionModel Class to wrap the GaussianProcessRegressor, but as expected, it doesn't support probabilistic forecasting out of the box.
manually construct the covariance matrix using the L and kernel attributes of the trained model. However, this approach was complicated by the structure of TimeSeriesobjects in Darts, and I couldn't get it to work correctly.
GaussianProcessFilter: I looked into Darts' GaussianProcessFilter, but it seems to be designed for filtering rather than forecasting.
Custom GaussianProcessForecasterclass, similar to Darts' KalmanFilter and KalmanForecaster, but I'm not sure how to proceed with this.

I am restricted to using Gaussian Processes for this project and the estimated quantiles/standard deviations for the predictions made by GaussianProcessRegressor. I am indifferent to calculating them analytically or via resampling etc.

Is there an existing way to generate probabilistic forecasts with GaussianProcessRegressorin Darts that I might have missed or can anyone provide guidance or an example on how to create a GaussianProcessForecasterclass in Darts? Are there alternative approaches or libraries that could help me achieve my goals while still fitting within the constraints of my project? I tried sklearn and skforecast but could not get them to work with GPs. Any help or suggestions would be greatly appreciated. Thank you very much in advance!

Best regards :)

Hi @dzbanker. Yes, it's possible to do this but requires some additional code.

What is required:

Subclass from RegressionModel for regression model logic, and use the _LikelihoodMixin for probabilistic forecasting support
Make use of "gaussian" likelihood (we have some existing code that works for the CatBoostModel, we can use parts of this plus some dedicated logic for the GaussianProcessRegressor (GPR))
Overwrite relevant private methods from _LikelihoodMixin, to add support for the GPR
this will allow to use GPR the same way as any other regression model that uses a likelihood (e.g. sampling with num_samples>1, or direct likelihood parameter predictions (mu, std) with predict_likelihood_parameters).

So below is an example of how you can achieve this.

First, how you can create a custom Darts GPRegressor that has support for both probabilistic forecasts using sampling and direct parameter predictions. It is pretty much copy-pasted code from other models, just with some minor adaptions to make it work with the GPR. I added some comments here and there to clarify some things.

from typing import List, Optional
import numpy as np

from darts import TimeSeries
from darts.models.forecasting.regression_model import (
    FUTURE_LAGS_TYPE,
    LAGS_TYPE,
    RegressionModel,
    _LikelihoodMixin,
)

class GPRegressor(RegressionModel, _LikelihoodMixin):
    def __init__(
        self,
        lags: Optional[LAGS_TYPE] = None,
        lags_past_covariates: Optional[LAGS_TYPE] = None,
        lags_future_covariates: Optional[FUTURE_LAGS_TYPE] = None,
        output_chunk_length: int = 1,
        output_chunk_shift: int = 0,
        add_encoders: Optional[dict] = None,
        random_state: Optional[int] = None,
        multi_models: Optional[bool] = True,
        use_static_covariates: bool = True,
        **kwargs,
    ):
        self.kwargs = kwargs

        # gaussian process can use parts of our pre-defined `"gaussian"` / `"normal"` likelihood
        self._likelihood = "gaussian"

        # parse likelihood
        self._rng = np.random.default_rng(seed=random_state)

        super().__init__(
            lags=lags,
            lags_past_covariates=lags_past_covariates,
            lags_future_covariates=lags_future_covariates,
            output_chunk_length=output_chunk_length,
            output_chunk_shift=output_chunk_shift,
            add_encoders=add_encoders,
            model=GaussianProcessRegressor(**kwargs),
            multi_models=multi_models,
            use_static_covariates=use_static_covariates,
        )

    # defines logic how to predict assuming the normal / gaussian likelihood
    # (all copy-pasted, only parts withing ===> START / END of adapted code <=== is new
    def _predict_normal(
        self,
        x: np.ndarray,
        num_samples: int,
        predict_likelihood_parameters: bool,
        **kwargs,
    ) -> np.ndarray:
        k = x.shape[0]

        # ====> START OF ADAPTED CODE <====
        # here we add `return_std` to `predict()` to return the standard deviation

        # model_output shape:
        # if univariate & output_chunk_length = 1: (num_samples, 2)
        # else: (2, num_samples, n_components * output_chunk_length)
        # where the axis with 2 dims is mu, sigma
        model_output = self.model.predict(x, return_std=True, **kwargs)
        model_output = np.concatenate([
            np.expand_dims(model_output[0], 0),
            np.expand_dims(model_output[1], 0)
        ], axis=0)
        output_dim = len(model_output.shape)
        if output_dim == 2:
            model_output = model_output.T
        # ====> END OF ADAPTED CODE <====

        # deterministic case: we return the mean only
        if num_samples == 1 and not predict_likelihood_parameters:
            # univariate & single-chunk output
            if output_dim <= 2:
                output_slice = model_output[:, 0]
            else:
                output_slice = model_output[0, :, :]
            return output_slice.reshape(k, self.pred_dim, -1)

        # probabilistic case
        # univariate & single-chunk output
        if output_dim <= 2:
            # embedding well shaped 2D output into 3D
            model_output = np.expand_dims(model_output, axis=0)

        else:
            # we transpose to get mu, sigma couples on last axis
            # shape becomes: (n_components * output_chunk_length, num_samples, 2)
            model_output = model_output.transpose()

        # shape (n_components * output_chunk_length, num_samples, 2)
        return model_output

    # use pre-defined `normal/gaussian` likelihood for sampling / parameters
    def _predict_and_sample(
        self,
        x: np.ndarray,
        num_samples: int,
        predict_likelihood_parameters: bool,
        **kwargs,
    ) -> np.ndarray:
        return self._predict_and_sample_likelihood(
            x, num_samples, "normal", predict_likelihood_parameters, **kwargs
        )

    # get parameters in case of `predict_likelihood_parameters=True`
    def _params_normal(self, model_output: np.ndarray) -> np.ndarray:
        """[mu, sigma] on the last dimension, grouped by component"""
        n_samples = model_output.shape[1]
        # (n_components * output_chunk_length, num_samples, 2) -> (num_samples, n_components * output_chunk_length, 2)
        params_transposed = np.swapaxes(model_output, 0, 1)
        # (num_samples, n_components * output_chunk_length, 2) -> (num_samples, output_chunk_length, n_components * 2)
        return params_transposed.reshape(n_samples, self.pred_dim, -1)

    # required to add probabilistic support
    @property
    def supports_probabilistic_prediction(self) -> bool:
        return True

    # add the component names for direct parameter predictions
    def _likelihood_components_names(
            self, input_series: TimeSeries
    ) -> Optional[List[str]]:
        """Override of RegressionModel's method to support the gaussian/normal likelihood"""
        return self._likelihood_generate_components_names(
            input_series, ["mu", "sigma"]
        )

And then we can train the model and generate some forecasts:

import matplotlib.pyplot as plt
import darts.utils.timeseries_generation as tg
from darts.datasets import AirPassengersDataset
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import DotProduct, WhiteKernel

series = AirPassengersDataset().load()

# define a Darts GPRegressor that uses a kernel
kernel = DotProduct() + WhiteKernel()
model = GPRegressor(
    lags=12,
    output_chunk_length=12,
    kernel=kernel  # GaussianProcessRegressor params
)
model.fit(series)

# forecasting example with sampling
preds = model.predict(n=12, num_samples=1000)

series[-48:].plot()
preds.plot()
plt.show()

# forecasting example with direct parameter (mu, sigma) prediction
preds = model.predict(n=12, num_samples=1, predict_likelihood_parameters=True)
series[-48:].plot()
preds.plot()
plt.show()

Results:

Hope it helps and good luck with the project!

unit8co / darts

Probabilistic Forecasts for Gaussian Processes #2470