davide-burba opened this issue 3 weeks ago
Hi @davide-burba, and thanks for writing.
There are a few design choices that we made in Darts:

- Local models are fit on a single series, and the series forecast by `predict()` exactly matches the one used when calling `fit()`.
- Extending local models to multiple series would be in contradiction with our global models, which can forecast any series -> the unified API would not hold in this case either.

For these reasons, and to avoid any unexpected behavior, we believe it's best to not extend the local models to work on multiple series.
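To make the local-model contract concrete, here is a toy sketch in plain Python (not actual Darts code; the class name and its naive-mean logic are purely illustrative): `predict()` takes no series argument, so it can only continue the one series passed to `fit()`.

```python
# Toy illustration (NOT Darts code): a "local" model is bound to the single
# series it was fit on, so predict() can only extend that exact series.
class ToyLocalModel:
    def fit(self, series):
        # All learned state (here, just the mean) refers to this one series.
        self.mean = sum(series) / len(series)
        return self

    def predict(self, n):
        # No series argument: the forecast always continues the fitted series.
        return [self.mean] * n


model = ToyLocalModel().fit([1.0, 2.0, 3.0])
print(model.predict(2))  # [2.0, 2.0]
```

A global model, by contrast, accepts a `series` argument at prediction time, which is why it can forecast series it has never seen.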
> Ensembles support a mixture of local and global models when calling the `historical_forecasts` method, but not when calling the `fit` method.

With `retrain != False`, the models are trained separately on single series; this is why it works for historical forecasts. We have it in our backlog (#1538) to add support for letting global models be trained on multiple series in historical forecasts. It's not a trivial one, as discussed here.

> It's not clear if global models are effectively trained on multiple time-series when using the `historical_forecasts` method, especially when using ensembles.
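A rough sketch of why retraining makes this work even for local models (plain Python, not the actual Darts implementation; the function name and expanding-window details are simplifications):

```python
# Sketch: with retraining, a fresh model is re-fit from scratch on each
# expanding window of a single series, so no multi-series training is needed.
def historical_forecasts_retrained(model_factory, series, start, horizon=1):
    forecasts = []
    for t in range(start, len(series) - horizon + 1):
        model = model_factory()   # fresh model instance for each window
        model.fit(series[:t])     # trained on this one series only
        forecasts.append(model.predict(horizon))
    return forecasts


class NaiveLast:
    """Toy local model: repeats the last observed value."""
    def fit(self, series):
        self.last = series[-1]
        return self

    def predict(self, n):
        return [self.last] * n


print(historical_forecasts_retrained(NaiveLast, [1, 2, 3, 4, 5], start=3))
# [[3], [4]]
```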
- `NaiveEnsembleModel`: the model's support is given by the underlying forecasting models. If all of them are global, then it supports multiple series; if at least one is local, it only supports a single series. `fit()` will train only the forecasting models (the same way as calling `model.fit()` for each model). Since we only take the average of all predictions, the ensemble model itself is not trained at all.
- `RegressionEnsembleModel`: calling `fit()` will (optionally) train the forecasting models and the regression ensemble model. The ensemble model is trained jointly on the forecasts from all forecasting models. There are two ways the forecasts can be generated under the hood: using `historical_forecasts`, or direct predictions (auto-regressive for `n > output_chunk_length`).
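The difference between the two ensembles can be sketched in plain Python (illustrative only, not Darts internals; the tiny gradient-descent fit stands in for whatever regression model the ensemble actually uses):

```python
# NaiveEnsembleModel analogue: just average the base forecasts, no training.
def naive_ensemble(forecasts):
    return [sum(vals) / len(vals) for vals in zip(*forecasts)]


# RegressionEnsembleModel analogue: learn weights that combine base
# forecasts of held-out points (here via a tiny least-squares SGD).
def fit_ensemble_weights(base_forecasts, actuals, lr=0.01, steps=2000):
    w = [0.0] * len(base_forecasts)
    for _ in range(steps):
        for t, y in enumerate(actuals):
            pred = sum(w[i] * base_forecasts[i][t] for i in range(len(w)))
            err = pred - y
            for i in range(len(w)):
                w[i] -= lr * err * base_forecasts[i][t]
    return w


base = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
print(naive_ensemble(base))                         # [2.0, 3.0, 4.0]
print(fit_ensemble_weights(base, [2.0, 3.0, 4.0]))  # weights near [0.5, 0.5]
```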
  - `train_using_historical_forecasts=True` and `train_forecasting_models=True`: since we train it with historical forecasts (acting on single series with `retrain=True`), `fit()` supports a mixture of local and global models, and multiple series.
  - `train_using_historical_forecasts=True` and `train_forecasting_models=False`: this only works with pre-trained global models. The ensemble supports multiple series.
  - `train_using_historical_forecasts=False` and `train_forecasting_models=True`: if all models are global, then it supports multiple series; if at least one is local, it only supports a single series.

---

Hi @dennisbader, thank you for the explanations :)
I see your point about having a unified API that allows predicting unseen time-series with global models, which indeed wouldn't work for local models.
However, I still think that a wrapper storing one local model per time-series is a valid alternative: there are already several consistency checks in the "pipeline".
For this reason, I think that an additional consistency check, verifying that the local-models wrapper is only asked to predict a time-series it was trained on, would not be a big issue, and it might even simplify things down the line for ensembling.
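For what it's worth, the wrapper idea could look roughly like this (a hypothetical sketch, not an existing Darts class; the class name, the index-based series identification, and the toy `NaiveMean` model are all my own assumptions):

```python
class NaiveMean:
    """Toy local model: forecasts the mean of its training series."""
    def fit(self, series):
        self.mean = sum(series) / len(series)
        return self

    def predict(self, n):
        return [self.mean] * n


class MultiLocalModelWrapper:
    """Hypothetical wrapper: one local model per training series, plus a
    consistency check that predict() only targets series seen in fit()."""

    def __init__(self, model_factory):
        self.model_factory = model_factory
        self.models = {}  # one fitted model per training series

    def fit(self, series_list):
        for idx, series in enumerate(series_list):
            self.models[idx] = self.model_factory().fit(series)
        return self

    def predict(self, n, series_index):
        # The consistency check discussed above: local models can only
        # forecast a series they were trained on.
        if series_index not in self.models:
            raise ValueError(f"Series {series_index} was not seen during fit().")
        return self.models[series_index].predict(n)


wrapper = MultiLocalModelWrapper(NaiveMean).fit([[1.0, 3.0], [10.0, 20.0]])
print(wrapper.predict(2, series_index=0))  # [2.0, 2.0]
```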
Anyway, this is just my impression/feedback, I haven't checked the code and the implementation might not be trivial.
**Is your feature request related to a current problem? Please describe.**
When using a mixture of local and global models, the user needs to distinguish the model types.
Here's a list of practical examples:
- When calling the `fit` method, local models don't support lists of `TimeSeries`.
- Ensembles support a mixture of local and global models when calling the `historical_forecasts` method, but not when calling the `fit` method.
- It's not clear if global models are effectively trained on multiple time-series when using the `historical_forecasts` method, especially when using ensembles.

**Describe proposed solution**
- Support `fit` and `predict` for ensembles of mixtures of local/global models.
- Provide a unified interface to call `fit`, `predict`, and `historical_forecasts` on any kind of model. Under the hood, the interface should assign the correct args to fit each model, raise an error if some args are missing, and possibly raise a warning if some args are unused.
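A minimal sketch of what such an interface could do for `fit` (hypothetical, not existing Darts code; `unified_fit` and the dummy model classes are invented for illustration), using signature inspection to route args, warn on unused ones, and fail on missing ones:

```python
import inspect
import warnings


def unified_fit(model, **kwargs):
    """Hypothetical dispatcher: pass each model only the args its fit()
    accepts, raise on missing required args, warn on unused ones."""
    sig = inspect.signature(model.fit)
    accepted = {k: v for k, v in kwargs.items() if k in sig.parameters}
    unused = sorted(set(kwargs) - set(accepted))
    if unused:
        warnings.warn(f"Unused args for {type(model).__name__}: {unused}")
    missing = [
        name for name, p in sig.parameters.items()
        if p.default is inspect.Parameter.empty
        and p.kind in (p.POSITIONAL_OR_KEYWORD, p.KEYWORD_ONLY)
        and name not in accepted
    ]
    if missing:
        raise TypeError(f"Missing args for {type(model).__name__}: {missing}")
    return model.fit(**accepted)


class DummyLocalModel:
    def fit(self, series):          # no covariate support
        self.fitted = series
        return self


class DummyGlobalModel:
    def fit(self, series, past_covariates=None):
        self.fitted = (series, past_covariates)
        return self


m = DummyGlobalModel()
unified_fit(m, series=[1, 2], past_covariates=[3, 4])
print(m.fitted)  # ([1, 2], [3, 4])
```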