awslabs / gluonts

Probabilistic time series modeling in Python
https://ts.gluon.ai
Apache License 2.0

DeepVARHierarchicalEstimator returning AssertionError: Argument lhs must have NDArray type #1967

Closed · rhameysayed closed this issue 2 years ago

rhameysayed commented 2 years ago

Description

I am trying to fit a hierarchical VAR model. The dataset (dollar_sales_hier) contains 70 bottom-level time series with 324 weekly observations each; these aggregate into 10 categories plus one grand total, for 81 time series in all. Each row of the dataset is a time series and the columns are indexed by date. I have made sure the dimensions of my matrices are consistent, specifically the summation matrix and the dataset. I double-checked my input dataset and summation matrix against the tourism example here: https://github.com/rshyamsundar/gluonts-hierarchical-ICML-2021/tree/master/experiments/data/tourism . Also, following the paper (https://www.amazon.science/publications/end-to-end-learning-of-coherent-probabilistic-forecasts-for-hierarchical-time-series), the S matrix (s_mat) stacks the aggregation rows S_sum together with a 70x70 identity block.
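
To make the intended shapes concrete, here is a minimal sketch of the hierarchy (an illustration with random values, not the actual code I run; as I understand it, the row order of S has to match the row order of the target array): with 70 bottom-level series, 10 category aggregates, and one grand total, S is 81 x 70 and maps the bottom-level vector b to the coherent vector y = S @ b.

import numpy as np

# Illustration only: 70 bottom-level series interleaved as A, B, ..., J, A, B, ...
num_bottom = 70
S_total = np.ones((1, num_bottom))        # grand-total row
S_groups = np.zeros((10, num_bottom))     # one row per category
for g in range(10):
    S_groups[g, g::10] = 1                # columns g, g+10, ..., g+60 belong to group g
S = np.vstack([S_total, S_groups, np.eye(num_bottom)])

b = np.random.rand(num_bottom)            # bottom-level values at one time step
y = S @ b                                 # coherent vector: [total, 10 categories, 70 bottom series]
print(S.shape, y.shape)                   # (81, 70) (81,)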

To Reproduce

import pandas as pd
import numpy as np
from gluonts.dataset.common import ListDataset
from gluonts.model.deepvar_hierarchical import DeepVARHierarchicalEstimator
from gluonts.mx.trainer import Trainer

groups = list('ABCDEFGHIJ')
group_list = list('ABCDEFGHIJ' * 7)
time_line = pd.date_range("2016-01-10", freq='w', periods=324)
dollar_sales_hier_rand = pd.DataFrame(np.random.randint(0, 300, size=(70, 324)))
dollar_sales_hier_rand.columns = time_line
dollar_sales_hier_rand["Group"] = group_list
dollar_sales_hier_rand.set_index("Group", inplace=True)
dollar_sales_group_rand = dollar_sales_hier_rand.groupby("Group").agg('sum')
dollar_sales_total_rand = dollar_sales_hier_rand.agg('sum')
dollar_sales_hier_rand = pd.concat([dollar_sales_total_rand, dollar_sales_group_rand, dollar_sales_hier_rand])

total_s = np.ones((1, 70))
id_mat = np.eye(70)
group_sum_mat = pd.DataFrame(index=groups, columns=group_list)
group_sum_mat = group_sum_mat.fillna(0)
for g in groups:
    group_sum_mat.loc[group_sum_mat.index.isin([g]), group_sum_mat.columns.isin([g])] = 1

sum_matrix_rand = np.concatenate((total_s, id_mat, group_sum_mat))

train_ds = ListDataset(
    [{"start": pd.to_datetime(dollar_sales_hier_rand.columns[0]), "target": dollar_sales_hier_rand}],
    freq='w',
    one_dim_target=False,
)

hier_estimator = DeepVARHierarchicalEstimator(freq='w',
                                              prediction_length=13,
                                              target_dim=dollar_sales_hier_rand.shape[1],
                                              S=sum_matrix_rand,
                                              trainer=Trainer(epochs=20),
                                              context_length=5,
                                              num_layers=5)

hier_predictor = hier_estimator.train(training_data=train_ds)

Error message or code output

When I train the estimator above, I receive the following error. I have been able to run DeepAREstimator and DeepVAREstimator without issue. I thought this issue might have to do with my data inputs, but I have hit a dead end.
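
As a quick sanity check (extra diagnostics, not part of the snippet above), these are the shapes I can inspect before training:

# Diagnostics only: with one_dim_target=False the target passed via ListDataset
# should have shape (number of series, number of time steps), and S should be 81 x 70 here.
print(dollar_sales_hier_rand.shape)
print(sum_matrix_rand.shape)
print(dollar_sales_hier_rand.shape[1])  # the value currently passed as target_dim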

/usr/local/lib/python3.7/dist-packages/mxnet/gluon/block.py in _build_cache(self, *args)
   1066 
   1067     def _build_cache(self, *args):
-> 1068         data, out = self._get_graph(*args)
   1069         data_names = {data.name: i for i, data in enumerate(data)}
   1070         input_names = out.list_inputs()

/usr/local/lib/python3.7/dist-packages/mxnet/gluon/block.py in _get_graph(self, *args)
   1058             params = {i: j.var() for i, j in self._reg_params.items()}
   1059             with self.name_scope():
-> 1060                 out = self.hybrid_forward(symbol, *grouped_inputs, **params)  # pylint: disable=no-value-for-parameter
   1061             out, self._out_format = _flatten(out, "output")
   1062 

/usr/local/lib/python3.7/dist-packages/gluonts/model/deepvar/_network.py in hybrid_forward(self, F, target_dimension_indicator, past_time_feat, past_target_cdf, past_observed_values, past_is_pad, future_time_feat, future_target_cdf, future_observed_values)
    835             future_time_feat,
    836             future_target_cdf,
--> 837             future_observed_values,
    838         )
    839 

/usr/local/lib/python3.7/dist-packages/gluonts/model/deepvar/_network.py in train_hybrid_forward(self, F, target_dimension_indicator, past_time_feat, past_target_cdf, past_observed_values, past_is_pad, future_time_feat, future_target_cdf, future_observed_values)
    522         )
    523 
--> 524         loss = self.loss(F, target=target, distr=distr)
    525         assert_shape(loss, (-1, seq_len, 1))
    526 

/usr/local/lib/python3.7/dist-packages/gluonts/model/deepvar_hierarchical/_network.py in loss(self, F, target, distr)
    256         # Samples shape: (num_samples, batch_size, seq_len, target_dim)
    257         if self.sample_LH or (self.CRPS_weight > 0.0):
--> 258             samples = self.get_samples_for_loss(distr=distr)
    259 
    260         if self.sample_LH:

/usr/local/lib/python3.7/dist-packages/gluonts/model/deepvar_hierarchical/_network.py in get_samples_for_loss(self, distr)
    218                 reconciliation_mat=self.M,
    219                 samples=samples,
--> 220                 seq_axis=self.seq_axis,
    221             )
    222             assert_shape(coherent_samples, samples.shape)

/usr/local/lib/python3.7/dist-packages/gluonts/model/deepvar_hierarchical/_network.py in reconcile_samples(reconciliation_mat, samples, seq_axis)
     63     """
     64     if not seq_axis:
---> 65         return mx.nd.dot(samples, reconciliation_mat, transpose_b=True)
     66     else:
     67         num_dims = len(samples.shape)

/usr/local/lib/python3.7/dist-packages/mxnet/ndarray/register.py in dot(lhs, rhs, transpose_a, transpose_b, forward_stype, out, name, **kwargs)

AssertionError: Argument lhs must have NDArray type, but got <Symbol deepvarhierarchicaltrainingnetwork8_broadcast_mul0>


lostella commented 2 years ago

@rhameysayed in order to reproduce the issue, it would be nice if the snippet you provided were self-contained and runnable. I adjusted it a bit to include some missing imports, but there are some identifiers which are undefined: dollar_sales_hier and s_mat. Could you add the definition for those?

@rshyamsundar have you seen this kind of issue?

rhameysayed commented 2 years ago

@lostella I updated the code snippet with a reproducible example that gives me the same AssertionError. dollar_sales_hier is a dataframe of sales numbers in $M, and s_mat is the aggregation matrix S discussed in Section 2.1 of the paper (https://www.amazon.science/publications/end-to-end-learning-of-coherent-probabilistic-forecasts-for-hierarchical-time-series).

lostella commented 2 years ago

Thanks @rhameysayed, now I can reproduce the error. One note: passing hybridize=False when constructing the Trainer object yields the following error:

Traceback (most recent call last):
  File "issues/1967/reproduce.py", line 37, in <module>
    hier_predictor = hier_estimator.train(training_data=train_ds)
  File "/Users/stellalo/gluon-ts/src/gluonts/mx/model/estimator.py", line 235, in train
    cache_data=cache_data,
  File "/Users/stellalo/gluon-ts/src/gluonts/mx/model/estimator.py", line 207, in train_model
    validation_iter=validation_data_loader,
  File "/Users/stellalo/gluon-ts/src/gluonts/mx/trainer/_base.py", line 446, in __call__
    num_batches_to_use=self.num_batches_per_epoch,
  File "/Users/stellalo/gluon-ts/src/gluonts/mx/trainer/_base.py", line 338, in loop
    _ = net(*batch.values())
  File "/Users/stellalo/.virtualenvs/gluonts/lib/python3.7/site-packages/mxnet/gluon/block.py", line 682, in __call__
    out = self.forward(*args)
  File "/Users/stellalo/.virtualenvs/gluonts/lib/python3.7/site-packages/mxnet/gluon/block.py", line 1345, in forward
    return self.hybrid_forward(ndarray, x, *args, **params)
  File "/Users/stellalo/gluon-ts/src/gluonts/model/deepvar/_network.py", line 833, in hybrid_forward
    future_observed_values,
  File "/Users/stellalo/gluon-ts/src/gluonts/model/deepvar/_network.py", line 497, in train_hybrid_forward
    target_dimension_indicator=target_dimension_indicator,
  File "/Users/stellalo/gluon-ts/src/gluonts/model/deepvar/_network.py", line 363, in unroll_encoder
    begin_state=None,
  File "/Users/stellalo/gluon-ts/src/gluonts/model/deepvar/_network.py", line 215, in unroll
    (-1, unroll_length, self.target_dim, len(self.lags_seq)),
  File "/Users/stellalo/gluon-ts/src/gluonts/mx/util.py", line 85, in assert_shape
    ), f"shape mismatch got {x.shape} expected {expected_shape}"
AssertionError: shape mismatch got (32, 18, 404, 1) expected (-1, 18, 325, 1)

so there appears to be at least a shape issue.
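
The original AssertionError, on the other hand, comes from reconcile_samples calling the imperative mx.nd.dot while, under hybridization, hybrid_forward receives mx.sym Symbols. A minimal sketch of the same type mismatch (plain MXNet, independent of GluonTS):

import mxnet as mx

# Sketch: calling the imperative nd.dot with a Symbol argument triggers the same
# kind of assertion as in the traceback above.
sym_samples = mx.sym.Variable("samples")   # what hybrid_forward receives once hybridized
nd_matrix = mx.nd.ones((3, 3))             # an NDArray constant, e.g. a reconciliation matrix

try:
    mx.nd.dot(sym_samples, nd_matrix, transpose_b=True)
except AssertionError as err:
    print(err)  # Argument lhs must have NDArray type, but got <Symbol samples>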

esbraun commented 2 years ago

I'm running into the same error with my own dataset as well - thank you for looking into this issue!

rshyamsundar commented 2 years ago

@rhameysayed your S matrix has the right shape (81 x 70), but the actual target has the wrong shape.

dollar_sales_hier_rand has shape 404 x 325, whereas it should be a two-dimensional array of shape 81 x 324.

The way you are constructing this dataframe is wrong: your total series dollar_sales_total_rand has the dates as its row index, while the other two dataframes have their columns indexed by date. Transpose the total series after converting it to a dataframe:

dollar_sales_hier_rand = pd.concat([dollar_sales_total_rand.to_frame().T,dollar_sales_group_rand,dollar_sales_hier_rand])
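
A quick way to see the difference, for example:

# dollar_sales_total_rand is a Series indexed by the 324 dates, so concatenating it
# directly adds 324 rows and one extra column, which is where the 404 x 325 shape
# comes from; transposing the one-column frame yields a single "total" row instead.
print(dollar_sales_total_rand.shape)               # (324,)
print(dollar_sales_total_rand.to_frame().T.shape)  # (1, 324)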

Also, you are passing the wrong value for target_dim when creating the estimator: target_dim is the number of rows of the target, not the number of columns:

target_dim=dollar_sales_hier_rand.shape[0],

As suggested by @lostella, pass hybridize=False in the trainer argument; the hierarchical model does not work in symbolic mode:

trainer=Trainer(epochs=20, hybridize=False),

I am able to run the training job with these changes in your code!

rshyamsundar commented 2 years ago

I am closing this since it is a data preparation issue. I am attaching the working code here for reference.

import pandas as pd
import numpy as np
from gluonts.dataset.common import ListDataset
from gluonts.model.deepvar_hierarchical import DeepVARHierarchicalEstimator
from gluonts.mx.trainer import Trainer

# Synthetic data: 70 bottom-level weekly series, grouped into 10 categories (A-J).
groups = list('ABCDEFGHIJ')
group_list = list('ABCDEFGHIJ' * 7)
time_line = pd.date_range("2016-01-10", freq='w', periods=324)
dollar_sales_hier_rand = pd.DataFrame(np.random.randint(0, 300, size=(70, 324)))
dollar_sales_hier_rand.columns = time_line
dollar_sales_hier_rand["Group"] = group_list
dollar_sales_hier_rand.set_index("Group", inplace=True)

# Aggregate to the category level and the grand total, then stack all levels into
# one 81 x 324 target (note the .to_frame().T on the total series).
dollar_sales_group_rand = dollar_sales_hier_rand.groupby("Group").agg('sum')
dollar_sales_total_rand = dollar_sales_hier_rand.agg('sum')
dollar_sales_hier_rand = pd.concat(
    [dollar_sales_total_rand.to_frame().T, dollar_sales_group_rand, dollar_sales_hier_rand]
)

# Build the 81 x 70 summation matrix S.
total_s = np.ones((1, 70))
id_mat = np.eye(70)
group_sum_mat = pd.DataFrame(index=groups, columns=group_list)
group_sum_mat = group_sum_mat.fillna(0)
for g in groups:
    group_sum_mat.loc[group_sum_mat.index.isin([g]), group_sum_mat.columns.isin([g])] = 1

sum_matrix_rand = np.concatenate((total_s, id_mat, group_sum_mat))

# Multivariate dataset: the target is the full 81 x 324 array.
train_ds = ListDataset(
    [{"start": pd.to_datetime(dollar_sales_hier_rand.columns[0]), "target": dollar_sales_hier_rand}],
    freq='w',
    one_dim_target=False,
)

# target_dim is the number of rows of the target; hybridize=False is required.
hier_estimator = DeepVARHierarchicalEstimator(freq='w',
                                              prediction_length=13,
                                              target_dim=dollar_sales_hier_rand.shape[0],
                                              S=sum_matrix_rand,
                                              trainer=Trainer(epochs=20, hybridize=False),
                                              context_length=5,
                                              num_layers=5)

hier_predictor = hier_estimator.train(training_data=train_ds)
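
Once training completes, forecasts can be drawn from the returned predictor; a minimal sketch using the standard make_evaluation_predictions helper (not part of the code above):

from gluonts.evaluation import make_evaluation_predictions

# Draw sample paths from the trained hierarchical predictor.
forecast_it, ts_it = make_evaluation_predictions(
    dataset=train_ds, predictor=hier_predictor, num_samples=100
)
forecasts = list(forecast_it)
print(forecasts[0].samples.shape)  # (num_samples, prediction_length, target_dim)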

rshyamsundar commented 2 years ago

For future reference, here is the link to the tutorial that explains how to use the hierarchical model: https://ts.gluon.ai/dev/tutorials/forecasting/hierarchical_model_tutorial.html