sktime / pytorch-forecasting

Time series forecasting with PyTorch
https://pytorch-forecasting.readthedocs.io/
MIT License

[ENH] implement LSTM model, and migrate `LSTMmodel` from notebook to main code base #1582

Open svnv-svsv-jm opened 5 months ago

svnv-svsv-jm commented 5 months ago

Just create this dataset:

import numpy as np
import pandas as pd

multi_target_test_data = pd.DataFrame(
    dict(
        target1=np.random.rand(30),
        target2=np.random.rand(30),
        group=np.repeat(np.arange(3), 10),
        time_idx=np.tile(np.arange(10), 3),
    )
)

from pytorch_forecasting import TimeSeriesDataSet
from pytorch_forecasting.data.encoders import EncoderNormalizer, MultiNormalizer, TorchNormalizer

# create the dataset from the pandas dataframe
dataset = TimeSeriesDataSet(
    multi_target_test_data,
    group_ids=["group"],
    target=["target1", "target2"],  # USING two targets
    time_idx="time_idx",
    min_encoder_length=5,
    max_encoder_length=5,
    min_prediction_length=2,
    max_prediction_length=2,
    time_varying_unknown_reals=["target1", "target2"],
    target_normalizer=MultiNormalizer(
        [EncoderNormalizer(), TorchNormalizer()]
    ),  # one normalizer per target, wrapped in a MultiNormalizer
)

And feed it to the LSTMModel from the current tutorial:

from pytorch_forecasting.metrics import MAE, MultiLoss

# LSTMModel is the example class defined in the tutorial notebook
model = LSTMModel.from_dataset(
    dataset,
    n_layers=2,
    hidden_size=10,
    loss=MultiLoss([MAE() for _ in range(2)]),
)

x, y = next(iter(dataset.to_dataloader()))

print(
    "prediction shape in training:", model(x)["prediction"].size()
)  # batch_size x decoder time steps x 1 (1 for one target dimension)
model.eval()  # set model into eval mode to use autoregressive prediction
print("prediction shape in inference:", model(x)["prediction"].size())  # should be the same as in training

And you'll get:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-13-037e20e20342> in <cell line: 3>()
      2 
      3 print(
----> 4     "prediction shape in training:", model(x)["prediction"].size()
      5 )  # batch_size x decoder time steps x 1 (1 for one target dimension)
      6 model.eval()  # set model into eval mode to use autoregressive prediction

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)
   1530             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1531         else:
-> 1532             return self._call_impl(*args, **kwargs)
   1533 
   1534     def _call_impl(self, *args, **kwargs):

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1539                 or _global_backward_pre_hooks or _global_backward_hooks
   1540                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1541             return forward_call(*args, **kwargs)
   1542 
   1543         try:

<ipython-input-11-af0ffbd16c05> in forward(self, x)
    105 
    106     def forward(self, x: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
--> 107         hidden_state = self.encode(x)  # encode to hidden state
    108         output = self.decode(x, hidden_state)  # decode leveraging hidden state
    109 

<ipython-input-11-af0ffbd16c05> in encode(self, x)
     51         effective_encoder_lengths = x["encoder_lengths"] - 1
     52         # run through LSTM network
---> 53         _, hidden_state = self.lstm(
     54             input_vector, lengths=effective_encoder_lengths, enforce_sorted=False  # passing the lengths directly
     55         )  # second ouput is not needed (hidden state)

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)
   1530             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1531         else:
-> 1532             return self._call_impl(*args, **kwargs)
   1533 
   1534     def _call_impl(self, *args, **kwargs):

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1539                 or _global_backward_pre_hooks or _global_backward_hooks
   1540                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1541             return forward_call(*args, **kwargs)
   1542 
   1543         try:

/usr/local/lib/python3.10/dist-packages/pytorch_forecasting/models/nn/rnn.py in forward(self, x, hx, lengths, enforce_sorted)
    105             else:
    106                 pack_lengths = lengths.where(lengths > 0, torch.ones_like(lengths))
--> 107                 packed_out, hidden_state = super().forward(
    108                     rnn.pack_padded_sequence(
    109                         x, pack_lengths.cpu(), enforce_sorted=enforce_sorted, batch_first=self.batch_first

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/rnn.py in forward(self, input, hx)
    912                               self.dropout, self.training, self.bidirectional, self.batch_first)
    913         else:
--> 914             result = _VF.lstm(input, batch_sizes, hx, self._flat_weights, self.bias,
    915                               self.num_layers, self.dropout, self.training, self.bidirectional)
    916         output = result[0]

RuntimeError: mat1 and mat2 shapes cannot be multiplied (48x2 and 1x40)
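
For context: with hidden_size=10, the LSTM's first-layer input weight has shape (4 * 10, input_size) = (40, 1), i.e. the tutorial model was built for a single input feature, while the packed encoder batch arrives as 48 rows (12 sequences of effective length 4) with 2 features, the two targets. A standalone reproduction outside pytorch-forecasting (sizes chosen here to match the error above):

import torch
from torch import nn
from torch.nn.utils import rnn

# hidden_size=10 -> weight_ih_l0 has shape (4*10, input_size) = (40, 1);
# the packed input flattens to (total timesteps, n_features) = (48, 2),
# so mat1 (48x2) cannot be multiplied with mat2 (1x40), the transposed weight
lstm = nn.LSTM(input_size=1, hidden_size=10, num_layers=2, batch_first=True)
x = torch.randn(12, 4, 2)  # 12 sequences x 4 steps x 2 features -> 48 packed rows
packed = rnn.pack_padded_sequence(x, lengths=[4] * 12, batch_first=True)
lstm(packed)  # RuntimeError: mat1 and mat2 shapes cannot be multiplied (48x2 and 1x40)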

It's just weird that there is still no fix for this, and no LSTM model available out of the box. I even made a fix; there is a PR.

Why does no one care about fixing this?

It is also quite obscure how pytorch_forecasting handles uni- vs. multi-target data: I've noticed that if you pass target=["target"] to TimeSeriesDataSet, it behaves very differently from when you pass target="target". See the sketch below.
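
For example, a minimal sketch of the asymmetry (observed behavior, not a documented contract; this reuses the dataframe from above):

uni = TimeSeriesDataSet(
    multi_target_test_data,
    group_ids=["group"],
    time_idx="time_idx",
    target="target1",  # plain string: single-target mode
    max_encoder_length=5,
    max_prediction_length=2,
)
multi = TimeSeriesDataSet(
    multi_target_test_data,
    group_ids=["group"],
    time_idx="time_idx",
    target=["target1"],  # one-element list: multi-target mode
    max_encoder_length=5,
    max_prediction_length=2,
)

# the dataloader yields x, (y, weight); the type of y differs between the two
_, (y_uni, _) = next(iter(uni.to_dataloader(batch_size=4)))
_, (y_multi, _) = next(iter(multi.to_dataloader(batch_size=4)))
print(type(y_uni))    # a single tensor
print(type(y_multi))  # a list with one tensor per target, even for one target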

Please just review that PR and either merge it or fix the issue some other way...

benHeid commented 2 months ago

Thank you for pointing this out.

The exemplary LSTMModel in the notebook is not capable of multivariate forecasting; currently, it seems to be able to do univariate time series forecasting only.

So I see the need to clarify this in the notebook. However, I suppose it is not a bug.

Can you kindly link the Pull Request to which you are referring?

fkiraly commented 2 months ago

Would tags for the objects be useful here? E.g., capability:multivariate. I have opened an issue to explain: https://github.com/sktime/pytorch-forecasting/issues/1679

fkiraly commented 2 months ago

Other question: if the model is interfaced in sktime, it would be able to do multivariate forecasting through broadcasting, although that may be more wasteful than the native capability.

benHeid commented 2 months ago

Would tags for the objects be useful here? E.g., capability:multivariate

I have opened an issue to explain: https://github.com/sktime/pytorch-forecasting/issues/1679

In this case I'm not sure, since this model is implemented in the notebook, probably to show how to do this with pytorch-forecasting.

benHeid commented 2 months ago

Other question: if the model is interfaced in sktime, it would be able to do multivariate forecasting through broadcasting, although that may be more wasteful than the native capability.

Yes, it would enable multivariate forecasting, but the underlying model is still univariate, so enabling multivariate support here would lead to a different forecast than broadcasting.
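
To illustrate what broadcasting means here (a minimal sketch; NaiveForecaster is just a stand-in for any univariate forecaster, not the model under discussion):

import numpy as np
import pandas as pd
from sktime.forecasting.naive import NaiveForecaster

# sktime broadcasts a univariate forecaster over columns: each target gets an
# independent clone, so cross-series dependencies are ignored, unlike a
# natively multivariate model
y = pd.DataFrame({
    "target1": np.random.rand(30),
    "target2": np.random.rand(30),
})
forecaster = NaiveForecaster(strategy="mean")
forecaster.fit(y)
print(forecaster.predict(fh=[1, 2]))  # one column per target, forecast independently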

fkiraly commented 2 months ago

In this case I'm not sure, since this model is implemented in the notebook, probably to show how to do this with pytorch-forecasting.

That is weird. Why would one implement it in the notebook? I think this should move to the main code base.

I have not seen the notebook - is this some kind of demo? Either way, LSTMs are an important class of models, even if not the freshest one, so they should go into the main code base.

svnv-svsv-jm commented 1 month ago

Thank you for pointing this out.

The exemplary LSTMModel in the notebook is not capable of multivariate forecasting; currently, it seems to be able to do univariate time series forecasting only.

So I see the need to clarify this in the notebook. However, I suppose it is not a bug.

Can you kindly link the Pull Request to which you are referring?

The PR is in the description, isn't it?

I agree that a built-in pytorch_forecasting multivariate LSTM model class is what we need. It just fits the library to have such a basic, popular model built in.
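
For reference, a hedged sketch (not the PR's actual implementation) of what a natively multi-target LSTM could look like: input_size equals the number of targets, with one linear head per target so the output pairs naturally with MultiLoss:

import torch
from torch import nn

class MultiTargetLSTM(nn.Module):
    """Sketch only: LSTM over all targets jointly, one output head per target."""

    def __init__(self, n_targets: int = 2, hidden_size: int = 10, n_layers: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(n_targets, hidden_size, n_layers, batch_first=True)
        self.heads = nn.ModuleList(nn.Linear(hidden_size, 1) for _ in range(n_targets))

    def forward(self, x: torch.Tensor) -> list[torch.Tensor]:
        out, _ = self.lstm(x)  # batch x time x hidden_size
        return [head(out) for head in self.heads]  # one tensor per target

preds = MultiTargetLSTM()(torch.randn(8, 6, 2))
print([p.shape for p in preds])  # two tensors of shape (8, 6, 1)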

fkiraly commented 1 month ago

@svnv-svsv-jm, of course, you are right!

This is the PR, right? https://github.com/sktime/pytorch-forecasting/pull/1449 Nice!

Unfortunately, there are now some conflicts due to the recent upgrades and releases of pytorch-forecasting for Python 3.11 and 3.12 (we had to prioritize general maintenance over the historical PR).

The clashes look mostly code-formatting related - it would be greatly appreciated if you went through these and resolved any conflicts with main; meanwhile, we'll be working on 3.13 support.

It might also be easier to review and merge if you split the PR into two parts - "tune anything" and the LSTM network - but if you feel that's too much of a hassle, I will not consider that a blocker.

svnv-svsv-jm commented 1 month ago

Alright, I rebased it, but I haven't had time to run all the tests, so let's see what the CI/CD says.

I'm not sure I'll have time today to split it... :/