sktime / pytorch-forecasting

Time series forecasting with PyTorch
https://pytorch-forecasting.readthedocs.io/
MIT License

[DOC] Clarify that the `LSTMModel` in buildings.ipynb is univariate or make it multivariate #1582

Open svnv-svsv-jm opened 3 months ago

svnv-svsv-jm commented 3 months ago

Just create this dataset:

import numpy as np
import pandas as pd

multi_target_test_data = pd.DataFrame(
    dict(
        target1=np.random.rand(30),
        target2=np.random.rand(30),
        group=np.repeat(np.arange(3), 10),
        time_idx=np.tile(np.arange(10), 3),
    )
)

from pytorch_forecasting import TimeSeriesDataSet
from pytorch_forecasting.data.encoders import EncoderNormalizer, MultiNormalizer, TorchNormalizer

# create the dataset from the pandas dataframe
dataset = TimeSeriesDataSet(
    multi_target_test_data,
    group_ids=["group"],
    target=["target1", "target2"],  # USING two targets
    time_idx="time_idx",
    min_encoder_length=5,
    max_encoder_length=5,
    min_prediction_length=2,
    max_prediction_length=2,
    time_varying_unknown_reals=["target1", "target2"],
    target_normalizer=MultiNormalizer(
        [EncoderNormalizer(), TorchNormalizer()]
    ),  # one normalizer per target, in the same order as `target`
)

And input it to the current LSTMModel in the tutorials:

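from pytorch_forecasting.metrics import MAE, MultiLoss

# LSTMModel is the class defined in the tutorial notebook
model = LSTMModel.from_dataset(
    dataset,
    n_layers=2,
    hidden_size=10,
    loss=MultiLoss([MAE() for _ in range(2)]),  # one MAE per target
)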

x, y = next(iter(dataset.to_dataloader()))

print(
    "prediction shape in training:", model(x)["prediction"].size()
)  # batch_size x decoder time steps x 1 (1 for one target dimension)
model.eval()  # set model into eval mode to use autoregressive prediction
print("prediction shape in inference:", model(x)["prediction"].size())  # should be the same as in training

And you'll get:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-13-037e20e20342> in <cell line: 3>()
      2 
      3 print(
----> 4     "prediction shape in training:", model(x)["prediction"].size()
      5 )  # batch_size x decoder time steps x 1 (1 for one target dimension)
      6 model.eval()  # set model into eval mode to use autoregressive prediction

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)
   1530             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1531         else:
-> 1532             return self._call_impl(*args, **kwargs)
   1533 
   1534     def _call_impl(self, *args, **kwargs):

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1539                 or _global_backward_pre_hooks or _global_backward_hooks
   1540                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1541             return forward_call(*args, **kwargs)
   1542 
   1543         try:

<ipython-input-11-af0ffbd16c05> in forward(self, x)
    105 
    106     def forward(self, x: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
--> 107         hidden_state = self.encode(x)  # encode to hidden state
    108         output = self.decode(x, hidden_state)  # decode leveraging hidden state
    109 

<ipython-input-11-af0ffbd16c05> in encode(self, x)
     51         effective_encoder_lengths = x["encoder_lengths"] - 1
     52         # run through LSTM network
---> 53         _, hidden_state = self.lstm(
     54             input_vector, lengths=effective_encoder_lengths, enforce_sorted=False  # passing the lengths directly
     55         )  # second ouput is not needed (hidden state)

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)
   1530             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1531         else:
-> 1532             return self._call_impl(*args, **kwargs)
   1533 
   1534     def _call_impl(self, *args, **kwargs):

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1539                 or _global_backward_pre_hooks or _global_backward_hooks
   1540                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1541             return forward_call(*args, **kwargs)
   1542 
   1543         try:

/usr/local/lib/python3.10/dist-packages/pytorch_forecasting/models/nn/rnn.py in forward(self, x, hx, lengths, enforce_sorted)
    105             else:
    106                 pack_lengths = lengths.where(lengths > 0, torch.ones_like(lengths))
--> 107                 packed_out, hidden_state = super().forward(
    108                     rnn.pack_padded_sequence(
    109                         x, pack_lengths.cpu(), enforce_sorted=enforce_sorted, batch_first=self.batch_first

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/rnn.py in forward(self, input, hx)
    912                               self.dropout, self.training, self.bidirectional, self.batch_first)
    913         else:
--> 914             result = _VF.lstm(input, batch_sizes, hx, self._flat_weights, self.bias,
    915                               self.num_layers, self.dropout, self.training, self.bidirectional)
    916         output = result[0]

RuntimeError: mat1 and mat2 shapes cannot be multiplied (48x2 and 1x40)
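
The shapes in this message can be traced back to the setup above. What follows is a minimal sketch of that arithmetic in plain Python, not library code. It assumes that all samples end up in a single batch (the dataset is smaller than the default batch size) and that the notebook's LSTM was built with an input size of 1. The helper names are made up for illustration.

```python
from typing import Tuple

Shape = Tuple[int, int]


def lstm_input_weight_shape(input_size: int, hidden_size: int) -> Shape:
    # PyTorch stacks the input-to-hidden weights of the four LSTM gates,
    # so the matrix has shape (4 * hidden_size, input_size).
    return (4 * hidden_size, input_size)


def transposed(shape: Shape) -> Shape:
    return (shape[1], shape[0])


def can_multiply(mat1: Shape, mat2: Shape) -> bool:
    # A matrix product needs the inner dimensions to agree.
    return mat1[1] == mat2[0]


# Values taken from the dataset and model definitions above.
n_groups, series_length = 3, 10
encoder_length, prediction_length = 5, 2
n_targets, hidden_size = 2, 10

# Each group yields one sample per full encoder + decoder window.
windows_per_group = series_length - (encoder_length + prediction_length) + 1
n_samples = n_groups * windows_per_group

# The notebook's encode() passes encoder_lengths - 1 to the LSTM.
effective_encoder_length = encoder_length - 1

# A packed sequence flattens all time steps of all samples into rows.
packed_input = (n_samples * effective_encoder_length, n_targets)

# Assumption: the notebook's LSTM expects a single input feature.
univariate_weight = transposed(lstm_input_weight_shape(1, hidden_size))
multivariate_weight = transposed(lstm_input_weight_shape(n_targets, hidden_size))

print(packed_input, univariate_weight, can_multiply(packed_input, univariate_weight))
print(packed_input, multivariate_weight, can_multiply(packed_input, multivariate_weight))
```

Under these assumptions, 48 is 12 samples times 4 effective encoder steps, 2 is the number of targets, and 1x40 is the transposed input weight of an LSTM with one input feature and hidden size 10. The product only becomes valid once the LSTM's input size matches the number of targets.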

It's just weird that there is still no fix for this, and no LSTM model out of the box. I even made a fix; there is a PR.

Why does no one care about fixing this?

It is totally obscure how pytorch_forecasting handles single vs. multiple targets. I've also noticed that if you pass target=["target"] to TimeSeriesDataSet, it behaves very differently than if you pass target="target".
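
One possible way to sidestep the difference is to normalize the argument before building the dataset. The helper below is purely hypothetical (it is not part of pytorch_forecasting) and only sketches the idea of unwrapping a one-element list so that a single target is always passed as a plain string.

```python
from typing import List, Sequence, Union


def normalize_target(target: Union[str, Sequence[str]]) -> Union[str, List[str]]:
    """Hypothetical helper: return a plain string for a single target.

    A one-element list or tuple is unwrapped; anything longer is
    returned as a list so that it is still treated as multi-target.
    """
    if isinstance(target, str):
        return target
    names = list(target)
    if len(names) == 1:
        return names[0]
    return names


print(normalize_target("target"))
print(normalize_target(["target"]))
print(normalize_target(["target1", "target2"]))
```

The result could then be passed as the `target` argument of TimeSeriesDataSet, so that a one-element list and a plain string take the same code path.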

Please just review that PR and even merge it, or fix it...

benHeid commented 2 days ago

Thank you for pointing this out.

The example LSTMModel in the notebook is not capable of multivariate forecasting. Currently, it seems to support only univariate time series forecasting.

So I agree that the notebook should make this clear. However, I don't think it is a bug.
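
For illustration, here is a rough sketch in plain PyTorch of what multivariate support would need at a minimum. This is not the notebook's LSTMModel and not a proposed patch; the class name and its structure are assumptions. The point is only that the LSTM input size and the output layer both have to match the number of targets.

```python
import torch
from torch import nn


class TinyMultiTargetLSTM(nn.Module):
    """Hypothetical stand-in, not the LSTMModel from the tutorial notebook."""

    def __init__(self, n_targets: int, hidden_size: int = 10, n_layers: int = 2):
        super().__init__()
        # input_size must equal the number of target columns per time step
        self.lstm = nn.LSTM(
            input_size=n_targets,
            hidden_size=hidden_size,
            num_layers=n_layers,
            batch_first=True,
        )
        # one output value per target
        self.head = nn.Linear(hidden_size, n_targets)

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: batch x time steps x n_targets
        output, _ = self.lstm(history)
        # one-step-ahead prediction from the output at the last time step
        return self.head(output[:, -1])


model = TinyMultiTargetLSTM(n_targets=2)
history = torch.rand(12, 4, 2)  # 12 samples, 4 encoder steps, 2 targets
prediction = model(history)
print(prediction.shape)
```

With `n_targets=1` the same class reduces to the univariate case, which is roughly the situation the notebook model is in today.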

Can you kindly link the Pull Request to which you are referring?

fkiraly commented 2 days ago

Would tags for the objects be useful here? E.g., capability:multivariate

I have opened an issue to explain: https://github.com/sktime/pytorch-forecasting/issues/1679

fkiraly commented 2 days ago

Other question: if the model is interfaced in sktime, it would be able to do multivariate forecasting through broadcasting, although that may be more wasteful than the native capability.

benHeid commented 2 days ago

> Would tags for the objects be useful here? E.g., capability:multivariate
>
> I have opened an issue to explain: https://github.com/sktime/pytorch-forecasting/issues/1679

In this case not sure since this model is implemented in the notebook. Probably to show how to do this with PyTorch-forecasting.

benHeid commented 2 days ago

> Other question: if the model is interfaced in sktime, it would be able to do multivariate forecasting through broadcasting, although that may be more wasteful than the native capability.

Yes, it would enable multivariate forecasting, but the underlying model would still be univariate. So native multivariate support here would lead to a different forecast than broadcasting.
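
To make the distinction concrete, here is a toy sketch of broadcasting in plain Python. It is not sktime's implementation; `naive_mean_forecast` and `broadcast_forecast` are made-up names, and the forecaster is deliberately trivial. Each target is forecast by its own univariate model, so no forecast can depend on the other series.

```python
from typing import Callable, Dict, List

Series = List[float]
Forecaster = Callable[[Series, int], Series]


def naive_mean_forecast(history: Series, horizon: int) -> Series:
    # Toy univariate forecaster: repeat the mean of the series' own history.
    mean = sum(history) / len(history)
    return [mean] * horizon


def broadcast_forecast(
    forecaster: Forecaster, histories: Dict[str, Series], horizon: int
) -> Dict[str, Series]:
    # Broadcasting: apply the univariate forecaster to each target separately.
    return {name: forecaster(series, horizon) for name, series in histories.items()}


histories = {"target1": [1.0, 2.0, 3.0], "target2": [10.0, 20.0, 30.0]}
forecasts = broadcast_forecast(naive_mean_forecast, histories, horizon=2)
print(forecasts)

# Changing target2 leaves the target1 forecast untouched.
changed = dict(histories, target2=[0.0, 0.0, 0.0])
changed_forecasts = broadcast_forecast(naive_mean_forecast, changed, horizon=2)
print(changed_forecasts["target1"] == forecasts["target1"])
```

A natively multivariate model would receive both series as joint input, so its forecast for target1 could change when target2 changes. That is why the two approaches generally give different results.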

fkiraly commented 1 day ago

> In this case not sure since this model is implemented in the notebook. Probably to show how to do this with PyTorch-forecasting.

That is weird. Why would one implement it in the notebook? I think this should move to the main code base.

I have not seen the notebook; is this some kind of demo? Either way, LSTMs are an important class of models, even if not the freshest one, so it should go into the main code base.