awslabs / gluonts

Probabilistic time series modeling in Python
https://ts.gluon.ai
Apache License 2.0
4.55k stars 747 forks

DeepAR for multivariate time series #303

Closed FadhelA closed 4 years ago

FadhelA commented 5 years ago

Hello all,

I am trying to use DeepAR for multivariate time series forecasting. I am generating a bivariate time series in which each coordinate is independently sampled from a normal distribution: the first coordinate has mean 0 and variance 1, the second has mean 0 and variance 25. The algorithm seems to restrict the covariance matrix to lambda*Identity. I also tried the same code with the SimpleFeedForward model, and it works fine.
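
The data-generating setup described above can be sanity-checked in isolation. The following is a minimal sketch using NumPy only (it does not touch GluonTS); the seed and variable names are illustrative assumptions. It builds one series the same way as the reproduction code and shows that its empirical covariance is close to diag(1, 25), not a multiple of the identity.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

dim, T = 2, 1000
x = rng.normal(size=(dim, T))   # both coordinates start as N(0, 1)
x[1, :] *= 5                    # second coordinate: std 5, i.e. variance 25

# np.cov treats each row as one variable, matching the (dim, T) layout
emp_cov = np.cov(x)
print(np.round(emp_cov, 2))
```

If the forecast samples for both coordinates show the same spread, the mismatch comes from the model or the data pipeline, not from the generated data.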

import pandas as pd
import numpy as np

from gluonts.dataset.common import ListDataset
from gluonts.model.deepar import DeepAREstimator
from gluonts.trainer import Trainer

from gluonts.distribution.multivariate_gaussian import MultivariateGaussianOutput

from gluonts.evaluation.backtest import make_evaluation_predictions

# Generate data 

N = 20  # number of time series
T = 1000  # number of timesteps
dim = 2 # dimension of the observations
prediction_length = 25
freq = '1H'

custom_datasetx = np.random.normal(size=(N, dim, T))
custom_datasetx[:,1,:] = 5*custom_datasetx[:,1,:]
start = pd.Timestamp("01-01-2019", freq=freq)

train_ds = ListDataset(
    [
        {'target': x, 'start': start}
        for x in custom_datasetx[:, :, :-prediction_length]
    ],
    freq=freq,
    one_dim_target=False,
)

test_ds = ListDataset(
    [
        {'target': x, 'start': start}
        for x in custom_datasetx[:, :, :]
    ],
    freq=freq,
    one_dim_target=False,
)

# Deep AR 

# Trainer parameters
epochs = 10
learning_rate = 1E-3
batch_size = 5
num_batches_per_epoch = 100

# create estimator
estimator = DeepAREstimator(
    prediction_length=prediction_length,
    context_length=prediction_length,
    freq=freq,
    trainer=Trainer(
        ctx="cpu",
        epochs=epochs,
        learning_rate=learning_rate,
        hybridize=True,
        batch_size=batch_size,
        num_batches_per_epoch=num_batches_per_epoch,
    ),
    distr_output=MultivariateGaussianOutput(dim=dim)
)

predictor = estimator.train(train_ds)

forecast_it, ts_it = make_evaluation_predictions(
    dataset=test_ds,  # test dataset
    predictor=predictor,  # predictor
    num_eval_samples=100,  # number of sample paths we want for evaluation
)

forecasts = list(forecast_it)
tss = list(ts_it)

Here are the forecasts for each coordinate

Coordinate 1 (Normal(0,1)) image

Coordinate 2 (Normal(0,5)) image

geoalgo commented 5 years ago

Hi @FadhelA, multivariate forecasting is currently not supported for DeepAR. We aim to release soon the code for "High-dimensional multivariate forecasting with low-rank Gaussian Copula Processes", which will enable multivariate forecasting (dependent or independent).

lostella commented 4 years ago

This looks like a shape issue: along the time axis, forecasts alternate between having variance 1 and 5, so some axes must be inverted somewhere.
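
To illustrate how an axis mix-up could produce that alternating pattern, here is a small NumPy sketch. It is a hypothetical example of the symptom, not a diagnosis of the GluonTS code: it assumes that somewhere a time-major array of shape (T, dim) is reshaped to (dim, T) instead of being transposed.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed

dim, T = 2, 1000
x = rng.normal(size=(dim, T))
x[1, :] *= 5  # coordinate 0 has std 1, coordinate 1 has std 5

# Correct way to go from (T, dim) back to (dim, T): transpose.
time_major = x.T          # shape (T, dim)
restored = time_major.T   # shape (dim, T), identical to x

# Hypothetical mistake: reshape instead of transpose. The shape is right,
# but values from the two coordinates are interleaved along the time axis.
mixed = time_major.reshape(dim, T)

even_std = mixed[0, 0::2].std()  # positions holding coordinate-0 values
odd_std = mixed[0, 1::2].std()   # positions holding coordinate-1 values
print(f"even steps std ~ {even_std:.2f}, odd steps std ~ {odd_std:.2f}")
```

Both arrays have the expected shape, so the error raises no exception; only the statistics along the time axis reveal it.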

nicolasignaciopinocea commented 4 years ago

Hello, I am studying the DeepAR algorithm and would like to understand the output of the model. Are the percentile values confidence intervals or credible intervals? I ask because the first is a frequentist approach and the second a Bayesian approach. My second question is whether I can get the predicted value of the variable, or whether I can only see information about the probability distribution.
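
On the second question, here is a minimal sketch, assuming the forecast is available as sample paths. The array `samples` below is a hypothetical stand-in of shape (num_samples, prediction_length), filled with synthetic draws, not real model output. Percentiles computed this way summarise the spread of the sampled predictive distribution, while the mean or median gives a single predicted value per time step.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed

num_samples, prediction_length = 100, 25
# Synthetic stand-in for forecast sample paths (assumption, not model output)
samples = rng.normal(loc=10.0, scale=2.0, size=(num_samples, prediction_length))

# Point predictions: one value per future time step
point_mean = samples.mean(axis=0)
point_median = np.median(samples, axis=0)

# Percentile band of the sampled predictive distribution
p10, p90 = np.percentile(samples, [10, 90], axis=0)

print(point_mean[:3], p10[:3], p90[:3])
```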

xiaoyaoyang commented 4 years ago

@geoalgo Hello! I just read the paper "High-dimensional multivariate forecasting with low-rank Gaussian Copula Processes". Are there plans to release this?

mbohlkeschneider commented 4 years ago

Hi @xiaoyaoyang,

It's there already: https://github.com/awslabs/gluon-ts/tree/master/src/gluonts/model/gpvar

An example of how to run it is here.

We have not tested the model in master extensively yet. If you want to reproduce the paper results, please use this branch (or use master and let us know your results :-)).

ghost commented 4 years ago

Isn't that the entire idea of DeepAR?

Hi @FadhelA, multivariate forecasting is currently not supported for DeepAR. We aim to release soon the code for "High-dimensional multivariate forecasting with low-rank Gaussian Copula Processes", which will enable multivariate forecasting (dependent or independent).

https://docs.aws.amazon.com/sagemaker/latest/dg/deepar.html

In many applications, however, you have many similar time series across a set of cross-sectional units. For example, you might have time series groupings for demand for different products, server loads, and requests for webpages. For this type of application, you can benefit from training a single model jointly over all of the time series.

mbohlkeschneider commented 4 years ago

Hi @hanan-vian,

there is a subtle difference between a global model like DeepAR and a truly multivariate model. You are right that DeepAR is a global model in the sense that it learns across time series. However, it is not multivariate in the sense that it only learns parameters for univariate probability distributions. A multivariate model, in contrast, is able to learn the parameters of a truly multivariate distribution (like a multivariate Gaussian). Hope that clarifies this!
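
The distinction can be illustrated with a toy NumPy sketch. It is not DeepAR or GPVAR code; the data, seed, and names are assumptions made for illustration. Fitting one univariate Gaussian per coordinate (equivalent to a diagonal covariance) loses the correlation between coordinates, whereas a multivariate Gaussian with a full covariance matrix keeps it.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed

# Correlated bivariate data (correlation 0.8)
true_cov = np.array([[1.0, 0.8],
                     [0.8, 1.0]])
data = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=5000)

# Univariate view: one mean and one std per coordinate, sampled independently
mu = data.mean(axis=0)
sigma = data.std(axis=0)
uni_samples = rng.normal(loc=mu, scale=sigma, size=(5000, 2))

# Multivariate view: mean vector and full covariance matrix
full_cov = np.cov(data, rowvar=False)
multi_samples = rng.multivariate_normal(mean=mu, cov=full_cov, size=5000)

uni_corr = np.corrcoef(uni_samples, rowvar=False)[0, 1]
multi_corr = np.corrcoef(multi_samples, rowvar=False)[0, 1]
print(f"univariate fit corr ~ {uni_corr:.2f}, multivariate fit corr ~ {multi_corr:.2f}")
```

Both fits match the marginal means and variances; only the multivariate one reproduces the dependence between the two coordinates.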

ghost commented 4 years ago

Hi @mbohlkeschneider, do you mean DeepAR is M:1 and true multivariate is M:M? So if I have an M:M scenario, I cannot learn one model with DeepAR but instead need M models, each of which is M:1?

mbohlkeschneider commented 4 years ago

Hi @mbohlkeschneider, do you mean DeepAR is M:1 and true multivariate is M:M? So if I have an M:M scenario, I cannot learn one model with DeepAR but instead need M models, each of which is M:1?

I'm not sure I understand this comment. Could you clarify this?

ghost commented 4 years ago

Input to Output Notation: 1:M one to many, M:1 many to one, 1:1 one to one, M:M many to many.

mbohlkeschneider commented 4 years ago

Input to Output Notation: 1:M one to many, M:1 many to one, 1:1 one to one, M:M many to many.

I see. The important part is how the model does inference: DeepAR can only do 1:1, even though parameters are learned jointly. GPVAR does M:M during inference. Hope that clarifies.
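
A toy sketch of that difference at inference time, under loud assumptions: the two functions below are hypothetical linear point forecasters written in NumPy, not DeepAR or GPVAR. The first shares its weights across all series but forecasts each series from its own history only (1:1). The second mixes all series, so every forecast depends on every input series (M:M).

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed
M, T = 3, 50
series = rng.normal(size=(M, T))

# Shared weights, learned "jointly", but applied to each series separately
shared_w = rng.normal(size=T)

def global_univariate_forecast(s):
    # Row i of the result uses only row i of s
    return s @ shared_w

# Weight matrix that mixes the latest value of all series
joint_w = rng.normal(size=(M, M))

def joint_forecast(s):
    # Every output depends on the last value of every series
    return joint_w @ s[:, -1]

# Perturb only series 1 and see which forecasts move
perturbed = series.copy()
perturbed[1] += 1.0

g0, g1 = global_univariate_forecast(series), global_univariate_forecast(perturbed)
j0, j1 = joint_forecast(series), joint_forecast(perturbed)

print("global, series 0 changed:", not np.isclose(g0[0], g1[0]))
print("joint,  series 0 changed:", not np.isclose(j0[0], j1[0]))
```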

pratikgehlott commented 3 years ago

Can DeepAR be used with M:1?

yitao-yu commented 3 years ago

Hi! There are things I don't understand about your comments in this issue, @mbohlkeschneider. I know it has been quite some time, and I would appreciate it if you could clarify them.

However, it is not multivariate in the sense that it only learns parameters for univariate probability distributions.

I think I get this. What I understand about DeepAR is that it's an RNN (LSTM, GRU, whatever) giving point estimates not only of the target variable but also of its variance. So it's univariate on outputs, I agree.

A multivariate model, in contrast, is able to learn the parameters of a truly multivariate distribution (like a multivariate Gaussian).

I always thought of this feature as a special characteristic of a Bayesian model (actually able to output a posterior distribution from a prior, which can be over any of the variables). I don't quite understand what you say about DeepAR not being multivariate... I mean, it can take multiple time series as input just like a regular RNN, right? And GluonTS supports that by including "feat_dynamic_real".

I have this understanding because when I read the DeepAR paper, there are z and x, both are vector-notated.

cccjjjfff commented 1 year ago

multivariate

Could you please provide this example now? The link has expired.