DeepAR and Deep State Space Model

AIAficionado commented 4 years ago

Hi all,

Firstly, thanks for providing this library for experimental purposes! I am familiar with DL/ML models but was only recently introduced to time series forecasting.

I have used various techniques in other libraries, such as Hierarchical Bayesian Structural Time Series, which specifically allows sharing statistical strength across multiple time series in a global model. I now want to use DeepAR and Deep State Space models.

However, I am stuck on implementing this with GluonTS:

Am I supposed to put the other time series in the target field of the ListDataset entry, or pass them as a dynamic feature?

1. Place the multiple time series as a list in the target field? training_data = ListDataset([{"start": df.index[0], "target": [df.ts0[:69], df.ts1[:69], df.ts2[:69]]}], freq="5min")
2. Place the other time series as a dynamic feature? training_data = ListDataset([{"start": df.index[0], "target": df.ts0[:69], "dynamical_feat": [df.ts1[:69], df.ts2[:69]]}], freq="5min")
3. How can I use multiple time series with the Deep State model?

Technical Questions

DeepAR: besides training a global model, does it require the other time series to include future time steps?
Deep State Space: it does not require the other time series to include future time steps, right?

jaschau commented 4 years ago

Hi @AlAficionado, at least for DeepAR, if you have a multivariate time series with N time series of length T, target should have shape [N, T]. During training time, feat_dynamic_real can also have shape [N, T]. During test time, however, target should have shape [N, T] whereas feat_dynamic_real should have shape [N, T + prediction_length].

The difference stems from the fact that during test-time, future values of target are assumed to be unknown whereas future values of feat_dynamic_real are assumed to be known.
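The shape requirements described above can be sketched with plain NumPy arrays. The values of N, T and prediction_length, the random data, and the start timestamp are illustrative assumptions; only the field names target and feat_dynamic_real and the shapes come from the comment above.

```python
import numpy as np

# Illustrative sizes (assumptions, not taken from the thread).
N, T, prediction_length = 3, 69, 12

rng = np.random.default_rng(0)
target = rng.normal(size=(N, T))                        # observed values only
feat_all = rng.normal(size=(N, T + prediction_length))  # features known into the future

# Training entry: target and features both cover the T observed steps.
train_entry = {
    "start": "2019-01-01 00:00:00",
    "target": target,
    "feat_dynamic_real": feat_all[:, :T],
}

# Test entry: same target, but the features extend over the forecast horizon.
test_entry = {
    "start": "2019-01-01 00:00:00",
    "target": target,
    "feat_dynamic_real": feat_all,
}

print(train_entry["feat_dynamic_real"].shape, test_entry["feat_dynamic_real"].shape)
```

The extra prediction_length columns in the test entry are exactly the future feature values that are assumed to be known at prediction time.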

AIAficionado commented 4 years ago

Thanks for the clarification.

If you had multiple time series with dependencies across them that you would like to model, and you needed predictions for all of them, would you put the other time series in target as a list, since you can't pass them as feat_dynamic_real when you don't know their future values?

jaschau commented 4 years ago

Yes, if you require predictions of all time series then including all of them in target as [ts1,..., tsN] (which, as a numpy array has shape [N, T]) would be the correct approach.
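As a minimal sketch of that stacking step, assuming three equal-length series held in hypothetical arrays ts1, ts2 and ts3 (the names and values are made up for illustration):

```python
import numpy as np

T = 69  # illustrative series length

# Hypothetical series; in practice these would be columns of your data.
ts1 = np.arange(T, dtype=float)
ts2 = ts1 * 2.0
ts3 = ts1 + 10.0

# Stack along a new first axis so each row is one series: shape [N, T].
target = np.stack([ts1, ts2, ts3])

print(target.shape)
```

np.stack requires all series to have the same length, so series with different lengths would need to be aligned or truncated first.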

wasimahmadpk commented 4 years ago

I am working on a similar problem. I tried to forecast multiple related time series by training a global model. When I combined all the time series inside target, it reported a dimension problem.

Error: "raise GluonTSDataError( gluonts.core.exception.GluonTSDataError: Array 'target' has bad shape - expected 1 dimensions, got 2."

l-cdc commented 4 years ago

I am trying as well to get DeepState to work with a multidimensional dataset.

@wasimahmadpk Try using ListDataset(..., one_dim_target=False) when you create your dataset

I am using an N x T shape as recommended, but training fails; the stacktrace is here. It works with a unidimensional target, everything else unchanged. Any advice?

EDIT: Just to follow up on this, it seems that DeepState in GluonTS 0.5.2 does not work with multidimensional targets. (The same dataset works with GPVAR, so I assume this is a model issue)
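To illustrate why the one_dim_target flag matters for the error reported earlier in the thread, here is a rough sketch of a dimensionality check. The helper check_target_ndim is hypothetical and is not GluonTS's actual implementation; it only assumes that the flag switches the expected number of target dimensions between 1 and 2.

```python
import numpy as np


def check_target_ndim(target, one_dim_target=True):
    """Hypothetical check: expect 1 dimension by default, 2 when one_dim_target is False."""
    expected = 1 if one_dim_target else 2
    arr = np.asarray(target, dtype=float)
    if arr.ndim != expected:
        raise ValueError(
            f"Array 'target' has bad shape - expected {expected} dimensions, got {arr.ndim}."
        )
    return arr


multi = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # two series of length 3

# Accepted when a two-dimensional target is declared.
ok = check_target_ndim(multi, one_dim_target=False)

# Rejected under the default, mirroring the error message quoted in the thread.
try:
    check_target_ndim(multi)
except ValueError as err:
    message = str(err)
    print(message)
```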

wasimahmadpk commented 4 years ago

@l-cdc I am doing it like this and it works fine now.

    train_dataset = ListDataset(
        data_iter=[
            {
                "start": "2019-01-01 00:00:00",
                "target": [timeseries1[start:train_stop], timeseries2[start:train_stop]],
            },
        ],
        freq=freq,
        one_dim_target=False,
    )

    estimator = DeepAREstimator(
        'H',
        prediction_length=prediction_length,
        trainer=Trainer(epochs=epochs, hybridize=False),
        distr_output=MultivariateGaussianOutput(dim=2),
    )

awslabs / gluonts

DeepAR and Deep State Space Model #694