jdb78 / pytorch-forecasting

Time series forecasting with PyTorch
https://pytorch-forecasting.readthedocs.io/
MIT License

Potential BUG: TFT/LSTM network parameters are not related to max_encoder_length/max_prediction_length #536

Open DAT-FYAYC opened 3 years ago

DAT-FYAYC commented 3 years ago

The number of network parameters for TFT and LSTM (possibly other models as well) does not change when max_encoder_length and/or max_prediction_length in the TimeSeriesDataSet class is changed.

To count the network parameters, I used the PyTorch Forecasting implementation and verified the result against a standard PyTorch parameter count.
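For reference, the standard PyTorch count I mean is along these lines (a minimal sketch; the helper name is my own):

```python
import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    # sum the element counts of all trainable parameter tensors
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
```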

For TFT (min_encoder_length = max_encoder_length; min_prediction_length = max_prediction_length):

To my understanding, the architecture and number of blocks (variable selection, LSTM encoder, gates) in the encoder changes linearly with the number of past time steps used as input. Since TFT.max_encoder_length is by default taken from TimeSeriesDataSet.max_encoder_length, increasing or decreasing TimeSeriesDataSet.max_encoder_length (or max_prediction_length) should change the number of network parameters. Since this is not the case for TFT and LSTM, I wonder whether there is a bug in the implementation. A sketch of the comparison I have in mind is below.
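A minimal sketch of that comparison, assuming a made-up single-series dataframe and arbitrary hyperparameters (not my actual setup):

```python
import numpy as np
import pandas as pd
from pytorch_forecasting import TemporalFusionTransformer, TimeSeriesDataSet

# hypothetical toy data: one series with 200 contiguous time steps
data = pd.DataFrame({
    "time_idx": np.arange(200),
    "value": np.random.randn(200).astype(np.float32),
    "group": "a",
})

def tft_param_count(max_encoder_length: int, max_prediction_length: int) -> int:
    dataset = TimeSeriesDataSet(
        data,
        time_idx="time_idx",
        target="value",
        group_ids=["group"],
        max_encoder_length=max_encoder_length,
        max_prediction_length=max_prediction_length,
        time_varying_unknown_reals=["value"],
    )
    model = TemporalFusionTransformer.from_dataset(dataset, hidden_size=16)
    return sum(p.numel() for p in model.parameters())

# both calls report the same parameter count, regardless of the encoder length
print(tft_param_count(24, 6), tft_param_count(48, 12))
```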

Is there ever a reason to set TFT.max_encoder_length > TimeSeriesDataSet.max_encoder_length? Something like context_length in NBeats or GluonTS DeepAR does not exist for TFT and is not meant to, is it?

@jdb78 Can you please comment on this issue?

Thanks a lot! David

jdb78 commented 3 years ago

The TFT is a recurrent network, i.e. its weights are shared across time steps, so the number of parameters does not change with the number of time steps. The parameter max_encoder_length is only used for interpretation purposes.
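To illustrate the weight-sharing point with plain PyTorch (a sketch, not pytorch-forecasting code): an nn.LSTM defines its weights per layer, not per time step, so its parameter count is fixed while the same module can process sequences of any length.

```python
import torch
import torch.nn as nn

# weights of a recurrent layer depend on input_size/hidden_size, not sequence length
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
n_params = sum(p.numel() for p in lstm.parameters())

# the same weights are applied at every time step, whatever the sequence length
short_out, _ = lstm(torch.randn(4, 24, 8))  # 24 time steps
long_out, _ = lstm(torch.randn(4, 48, 8))   # 48 time steps
print(n_params, short_out.shape, long_out.shape)
```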