Closed Otsutsukii closed 2 weeks ago
Hi! I experimented with this, and it seems the architecture does not allow it without reworking more parts of the model. To improve accuracy, you can always experiment with different activation functions or with the number of attention heads.
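For context, the "number of attention heads" mentioned above is the standard multi-head attention hyperparameter. A minimal illustration (plain PyTorch, not the TFT codebase itself) showing what varying it means; note that `embed_dim` must be divisible by `num_heads`:

```python
import torch
import torch.nn as nn

# Self-attention with a configurable head count. The output shape does not
# depend on num_heads; only how the embedding is split across heads changes.
embed_dim = 16
attn = nn.MultiheadAttention(embed_dim=embed_dim, num_heads=4, batch_first=True)

x = torch.randn(2, 10, embed_dim)  # (batch, sequence, embedding)
out, weights = attn(x, x, x)       # query = key = value -> self-attention
print(out.shape)                   # torch.Size([2, 10, 16])
```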
Description
Hi, could you add a number-of-layers parameter to the TFT's LSTM so we can stack LSTM layers?
Something like num_layers=1 by default?
Use case
I am trying to get better accuracy from the TFT by stacking more LSTM layers in its TemporalCovariateEncoder.
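For what it's worth, PyTorch's `nn.LSTM` already supports stacking via its `num_layers` argument, so the request is essentially about exposing that argument through the model's constructor. A minimal sketch of what stacking looks like at the `nn.LSTM` level (this is plain PyTorch, not the TFT implementation):

```python
import torch
import torch.nn as nn

# Two stacked LSTM layers: the second layer consumes the first layer's
# hidden-state sequence. num_layers=1 would reproduce the current behavior.
lstm = nn.LSTM(input_size=8, hidden_size=16, num_layers=2, batch_first=True)

x = torch.randn(4, 10, 8)     # (batch, sequence, features)
out, (h_n, c_n) = lstm(x)
print(out.shape)              # torch.Size([4, 10, 16]) -- top layer's outputs
print(h_n.shape)              # torch.Size([2, 4, 16]) -- one state per layer
```

One caveat with plumbing this through: code that unpacks the final hidden state must account for `h_n` gaining a leading dimension of size `num_layers`, which may be why the reply above says more parts of the model would need reworking.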