awslabs / gluonts

Probabilistic time series modeling in Python
https://ts.gluon.ai
Apache License 2.0

Issue regarding target dimension #1055

Closed. parimuns closed this issue 3 years ago.

parimuns commented 3 years ago

Hello, I am trying to implement LSTNet for a multivariate time series with cardinality 8. I am attaching the code and the error below. Please let me know what I am doing wrong. I also looked at issue #713, but I think that one is about one-dimensional time series; if I have misunderstood, please correct me.

[data.zip](https://github.com/awslabs/gluon-ts/files/5288528/data.zip)

import mxnet as mx
import gluonts
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import json
import os
from itertools import islice
from pathlib import Path

from gluonts.dataset.field_names import FieldName

[f"FieldName.{k} = '{v}'" for k, v in FieldName.__dict__.items() if not k.startswith('_')]

load_train = pd.read_csv('data.csv',parse_dates=[['Date', 'Time']],header=0,index_col=0)
load_train.head()

train=load_train.transpose()
train.head()

from gluonts.dataset.common import ListDataset

train2=train.to_numpy()
type(train2)
train2.shape

feat_static_cat=train2[[0,2,4,6,8,10,12,14],0]
feat_static_cat.shape

feat_dynamic_real=train2[[1,3,5,7,9,11,13,15],:]
feat_dynamic_real.shape

target=train2[[0,2,4,6,8,10,12,14],:]
target.shape

freq='1H'
prediction_length=2
start= [pd.Timestamp("2017-01-01", freq='1H') 
                                for _ in range(8)]

train_ds = ListDataset([{FieldName.TARGET: target, 
                         FieldName.START: start,
                         FieldName.FEAT_DYNAMIC_REAL: [fdr],
                         FieldName.FEAT_STATIC_CAT: [fsc]} 
                        for (target, start,fdr,fsc) in zip(target[:, :-prediction_length], 
                                                            start, 
                                                             feat_dynamic_real[:, :-prediction_length], 
                                                             feat_static_cat)],
                      freq=freq)

from gluonts.model.lstnet import LSTNetEstimator
from gluonts.trainer import Trainer

estimator=LSTNetEstimator(freq='H', prediction_length=24, context_length=24, num_series=8, skip_size=2, ar_window=2, channels=4, lead_time=0, kernel_size = 2, trainer = gluonts.trainer._base.Trainer(batch_size=32, clip_gradient=10.0, ctx=None, epochs=1, hybridize=True, init="xavier", learning_rate=0.001, learning_rate_decay_factor=0.5, minimum_learning_rate=5e-05, num_batches_per_epoch=50, patience=10, weight_decay=1e-08), dropout_rate = 0.2, output_activation = None, rnn_cell_type = 'gru', rnn_num_cells = 100, rnn_num_layers = 3, skip_rnn_cell_type = 'gru', skip_rnn_num_layers = 1, skip_rnn_num_cells = 10, scaling = True)

predictor = estimator.train(train_ds)

Running this gives the following error:

GluonTSDataError                          Traceback (most recent call last)
<ipython-input-15-44c508c2f597> in <module>()
----> 1 predictor = estimator.train(train_ds)

15 frames
/usr/local/lib/python3.6/dist-packages/gluonts/core/exception.py in assert_gluonts(exception_class, condition, message, *args, **kwargs)
    140     """
    141     if not condition:
--> 142         raise exception_class(message.format(*args, **kwargs))
    143 
    144 

GluonTSDataError: Input for field "target" does not have the requireddimension (field: target, ndim observed: 1, expected ndim: 2)

lostella commented 3 years ago

@parimuns what does target.shape evaluate to? (The snippet is not runnable)

parimuns commented 3 years ago

In this example, target.shape is (8, 125). I have tested the snippet again; please use the data.csv file and it will run.

harusametime commented 3 years ago

@parimuns The solution in https://github.com/awslabs/gluon-ts/issues/713 works here too, but max_target_dim must be set to 8 for cardinality 8. When I add the following to your code, training starts successfully.

from gluonts.dataset.multivariate_grouper import MultivariateGrouper
grouper_train = MultivariateGrouper(max_target_dim=8)
train_ds = grouper_train(train_ds)

[Screenshot 2020-09-29 2 14 17]
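The `ndim observed: 1, expected ndim: 2` part of the error can be read as a shape requirement: the `ListDataset` in the question holds eight entries with 1-D targets, while the multivariate estimator wants a single entry whose target is a 2-D array of shape (number of series, time steps). The NumPy-only sketch below illustrates that reshaping idea. It is not the `MultivariateGrouper` implementation; the random data and variable names are assumptions that merely mirror the (8, 125) shape and `prediction_length = 2` from the thread.

```python
import numpy as np

num_series, num_steps, prediction_length = 8, 125, 2
rng = np.random.default_rng(0)
data = rng.normal(size=(num_series, num_steps))  # placeholder for the real load data

# What the question's ListDataset contains: one entry per series, each with a 1-D target.
univariate_entries = [{"target": row[:-prediction_length]} for row in data]
print(univariate_entries[0]["target"].ndim)  # 1

# What a multivariate model consumes: a single entry with a 2-D target.
multivariate_entry = {
    "target": np.stack([entry["target"] for entry in univariate_entries])
}
print(multivariate_entry["target"].shape)  # (8, 123)
```

Grouping therefore also changes the dataset length: eight univariate entries become one multivariate entry.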

parimuns commented 3 years ago

@harusametime Thanks a lot. Yes, training starts now. But how do I then evaluate the forecasts with the code below?

import mxnet as mx
import gluonts
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import json
import os
from itertools import islice
from pathlib import Path

from gluonts.dataset.field_names import FieldName

[f"FieldName.{k} = '{v}'" for k, v in FieldName.__dict__.items() if not k.startswith('_')]

from google.colab import drive
drive.mount('/content/drive')

load_train = pd.read_csv(r'/content/drive/My Drive/Colab Notebooks/ISO-NE with temp.csv',parse_dates=[['Date', 'Time']],header=0,index_col=0)
load_train.head()

train=load_train.transpose()
train.head()

from gluonts.dataset.common import ListDataset

train2=train.to_numpy()
type(train2)
train2.shape

feat_static_cat=train2[[0,2,4,6,8,10,12,14],0]
feat_static_cat.shape

feat_dynamic_real=train2[[1,3,5,7,9,11,13,15],:]
feat_dynamic_real.shape

target=train2[[0,2,4,6,8,10,12,14],:]
target.shape

freq='1H'
prediction_length=2
start= [pd.Timestamp("2017-01-01", freq='1H') 
                                for _ in range(8)]

train_ds = ListDataset([{FieldName.TARGET: target, 
                         FieldName.START: start,
                         FieldName.FEAT_DYNAMIC_REAL: [fdr],
                         FieldName.FEAT_STATIC_CAT: [fsc]} 
                        for (target, start,fdr,fsc) in zip(target[:, :-prediction_length], 
                                                            start, 
                                                             feat_dynamic_real[:, :-prediction_length], 
                                                             feat_static_cat)],
                      freq=freq)

from gluonts.model.lstnet import LSTNetEstimator
from gluonts.trainer import Trainer

estimator=LSTNetEstimator(freq='H', prediction_length=24, context_length=24, num_series=8, skip_size=2, ar_window=2, channels=4, lead_time=1, kernel_size = 9, trainer = gluonts.trainer._base.Trainer(batch_size=32, clip_gradient=10.0, ctx=None, epochs=1, hybridize=True, init="xavier", learning_rate=0.001, learning_rate_decay_factor=0.5, minimum_learning_rate=5e-05, num_batches_per_epoch=50, patience=10, weight_decay=1e-08), dropout_rate = 0.2, output_activation = 'sigmoid', rnn_cell_type = 'lstm', rnn_num_cells = 100, rnn_num_layers = 3, skip_rnn_cell_type = 'lstm', skip_rnn_num_layers = 1, skip_rnn_num_cells = 10, scaling = True)

from gluonts.dataset.multivariate_grouper import MultivariateGrouper

grouper_train = MultivariateGrouper(max_target_dim=8)

train_ds = grouper_train(train_ds)

predictor = estimator.train(train_ds)

# from pathlib import Path
# predictor.serialize(Path(r'/content/drive/My Drive/Colab Notebooks/deepstate_iso_temp_48h'))

# from gluonts.model.predictor import Predictor
# predictor.deserialize = Predictor.deserialize(Path(r'/content/drive/My Drive/Colab Notebooks/deepstate_iso_temp_48h'))

from gluonts.evaluation.backtest import make_evaluation_predictions

test_ds = ListDataset([{FieldName.TARGET: target, 
                        FieldName.START: start,
                        FieldName.FEAT_DYNAMIC_REAL: [fdr],
                        FieldName.FEAT_STATIC_CAT: [fsc]} 
                       for (target, start,fdr,fsc) in zip(target, 
                                                            start, 
                                                            feat_dynamic_real, 
                                                            feat_static_cat)],
                     freq=freq)

forecast_it, ts_it = make_evaluation_predictions(
    dataset=test_ds,  # test dataset
    predictor=predictor,  # predictor
    num_samples=100,  # number of sample paths we want for evaluation
)

forecasts = list(forecast_it)
tss = list(ts_it)
# And then I want to plot the forecasts
def plot_prob_forecasts(ts_entry, forecast_entry):
    plot_length = 150
    prediction_intervals = (90.0, 95.0)
    legend = ["observations", "median prediction"] + [f"{k}% prediction interval" for k in prediction_intervals][::-1]

    fig, ax = plt.subplots(1, 1, figsize=(10, 7))
    ts_entry[-plot_length:].plot(ax=ax)  # plot the time series
    forecast_entry.plot(prediction_intervals=prediction_intervals, color='r')
    plt.grid(which="both")
    plt.legend(legend, loc="upper left")
    plt.show()

plot_prob_forecasts(tss[0], forecasts[0])

from gluonts.evaluation import Evaluator

evaluator = Evaluator(quantiles=[0.1,0.2,0.5,0.6,0.7,0.8,0.9,0.95])
agg_metrics, item_metrics = evaluator(iter(tss), iter(forecasts), num_series=len(test_ds))

print(json.dumps(agg_metrics, indent=4))

# x = item_metrics.head(18)
df = item_metrics

skybullet1987 commented 3 years ago

I have the same issue. I was only able to create a multivariate model with DeepAR. I have the same problems with LSTNet and GP: training works fine, but I can't actually use the trained model. I would really love to see a working example or to understand what is wrong.

harusametime commented 3 years ago

@parimuns The test dataset needs to go through MultivariateGrouper as well. Put the following snippet before the call to make_evaluation_predictions.

grouper_test = MultivariateGrouper(max_target_dim=8)
test_ds = grouper_test(test_ds)

[Screenshot 2020-09-30 4 57 32]

parimuns commented 3 years ago

@harusametime Thanks a lot. I am sorry to bother you again.

When I plot the probabilistic forecasts using tss[0] and forecasts[0], I do not get the quantile forecasts, as shown in the figure below.

[Figure: result]

And when I use the evaluator

agg_metrics, item_metrics = evaluator(iter(tss), iter(forecasts), num_series=len(test_ds))

it gives me this error:

`ValueError: operands could not be broadcast together with shapes (8,24) (24,8)`

harusametime commented 3 years ago

I am not aware of a built-in shortcut for getting quantile forecasts in GluonTS. A straightforward way is to compute them with NumPy: each forecast holds 100 sample paths, as specified by num_samples=100 in make_evaluation_predictions, so the quantiles can be taken over those samples.

First, check the shape of the forecast samples.

forecasts[0].samples.shape

This returns (100, 24, 8), which corresponds to (#samples, prediction_length, #variables).

Take the 100 samples of the first variable:

values = forecasts[0].samples[:,:,0]

Then compute the 0.8-quantile forecast with numpy.quantile:

import numpy as np
np.quantile(values, q=0.8, axis = 0)

To obtain quantile forecasts for all variables at once, run the following:

np.quantile(forecasts[0].samples, q=0.8, axis = 0)
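The same call also accepts several quantile levels at once, which is handy for drawing prediction intervals per variable. The sketch below uses a random array as a stand-in for `forecasts[0].samples` (an assumption, so that it runs without a trained predictor); only the shape (100, 24, 8) is taken from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for forecasts[0].samples: (num_samples, prediction_length, num_variables)
samples = rng.normal(size=(100, 24, 8))

levels = [0.1, 0.5, 0.9]
# Reducing over axis 0 (the sample axis) leaves one value per time step and variable,
# with a new leading axis for the requested quantile levels.
quantile_forecasts = np.quantile(samples, q=levels, axis=0)
print(quantile_forecasts.shape)  # (3, 24, 8)

lower, median, upper = quantile_forecasts
# 80% prediction interval for the first variable, one (lower, upper) pair per time step
interval_first_var = np.stack([lower[:, 0], upper[:, 0]], axis=1)
print(interval_first_var.shape)  # (24, 2)
```

With the real forecast object, `samples` would be replaced by `forecasts[0].samples`.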

parimuns commented 3 years ago

@harusametime Thanks a lot for your help.

dai-ichiro commented 3 years ago

For plotting:

import matplotlib.pyplot as plt
from gluonts.evaluation.backtest import make_evaluation_predictions

forecast_it, ts_it = make_evaluation_predictions(
    dataset=test_ds,  
    predictor=predictor,  
    num_samples=100
    )

for x, y  in zip(ts_it, forecast_it):
    for i in range(8):
        plt.subplot(8,1,i+1)
        x[i].plot()
        y.copy_dim(i).plot(color='g', prediction_intervals=(50.0, 90.0))

plt.show()

For evaluation:

import json
import matplotlib.pyplot as plt
from gluonts.evaluation.backtest import make_evaluation_predictions

forecast_it, ts_it = make_evaluation_predictions(
    dataset=test_ds,  
    predictor=predictor,  
    num_samples=100
    )

from gluonts.evaluation import MultivariateEvaluator

evaluator = MultivariateEvaluator(quantiles=[0.1, 0.5, 0.9])
agg_metrics, item_metrics = evaluator(ts_it, forecast_it, num_series=len(test_ds))

print(json.dumps(agg_metrics, indent=4))
print(item_metrics)

item_metrics.plot(x='MSIS', y='MASE', kind='scatter', c=item_metrics.index, cmap='Accent')
plt.grid(which="both")
plt.show()

parimuns commented 3 years ago

@dai-ichiro Thanks a lot. I tried this code and the plotting part works fine, but the evaluation part gives me the following error:


evaluator = MultivariateEvaluator(quantiles=[0.1, 0.5, 0.9])
agg_metrics, item_metrics = evaluator(ts_it, forecast_it, num_series=len(test_ds))

print(json.dumps(agg_metrics, indent=4))
print(item_metrics)

item_metrics.plot(x='MSIS', y='MASE', kind='scatter', c=item_metrics.index, cmap='Accent')
plt.grid(which="both")
plt.show()

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-35-2b6685eec6ad> in <module>()
      1 evaluator = MultivariateEvaluator(quantiles=[0.1, 0.5, 0.9])
----> 2 agg_metrics, item_metrics = evaluator(ts_it, forecast_it, num_series=len(test_ds))
      3 
      4 print(json.dumps(agg_metrics, indent=4))
      5 print(item_metrics)

1 frames
/usr/local/lib/python3.6/dist-packages/gluonts/evaluation/_base.py in peek(iterator)
    648     @staticmethod
    649     def peek(iterator: Iterator[Any]) -> Tuple[Any, Iterator[Any]]:
--> 650         peeked_object = iterator.__next__()
    651         iterator = chain([peeked_object], iterator)
    652         return peeked_object, iterator

StopIteration:

dai-ichiro commented 3 years ago

Run this code again before evaluation.

forecast_it, ts_it = make_evaluation_predictions(
    dataset=test_ds,  
    predictor=predictor,  
    num_samples=100
    )
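The `StopIteration` in the traceback is what an already-consumed iterator raises: a first pass (for example the plotting loop) uses up `ts_it` and `forecast_it`, so the evaluator finds nothing left to peek at. The sketch below reproduces this with a plain Python generator; `make_iterator` is a hypothetical stand-in for the iterators returned by `make_evaluation_predictions`, not GluonTS code.

```python
def make_iterator():
    # Hypothetical stand-in for one of the iterators from make_evaluation_predictions
    return (i * i for i in range(3))

it = make_iterator()
for value in it:  # first pass, e.g. the plotting loop
    pass

try:
    next(it)  # a second pass finds nothing left
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)  # True

# Option 1: recreate the iterator before the second pass
fresh = list(make_iterator())

# Option 2: materialize once, then reuse the list as often as needed
items = list(make_iterator())
first_pass = [v for v in items]
second_pass = [v for v in items]
print(first_pass == second_pass)  # True
```

Option 2 matches the pattern used earlier in the thread, where the results are stored with `forecasts = list(forecast_it)` and `tss = list(ts_it)` and then passed on as `iter(tss)` and `iter(forecasts)`.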

parimuns commented 3 years ago

@dai-ichiro It's working. Thanks a lot.

YrliCl commented 2 years ago

> @dai-ichiro It's working. Thanks a lot.

How did you finally solve the 'Quantile forecasts' problem?