maxjcohen / transformer

Implementation of the Transformer model (originally from "Attention Is All You Need") applied to time series.
https://timeseriestransformer.readthedocs.io/en/latest/
GNU General Public License v3.0

Accuracy vs LSTM #22

Closed mahdiabdollahpour closed 3 years ago

mahdiabdollahpour commented 3 years ago

Have you compared the results of Transformer vs LSTM in time series prediction?

maxjcohen commented 3 years ago

Hi, see here for a rough comparison (Table 1). In practice, RNN-based models are faster on longer sequences, since self-attention cost grows quadratically with sequence length while recurrence grows only linearly.
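
For intuition, here is a minimal timing sketch (assuming PyTorch; the dimensions are hypothetical, not the ones used in the paper) contrasting how self-attention and an LSTM scale with sequence length:

```python
import time
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
d_model, batch = 64, 8
attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=4)
lstm = nn.LSTM(input_size=d_model, hidden_size=d_model)

with torch.no_grad():
    for seq_len in (256, 1024, 4096):
        x = torch.randn(seq_len, batch, d_model)  # (seq, batch, features)

        t0 = time.perf_counter()
        attn(x, x, x)  # self-attention: O(seq_len^2) pairwise scores
        t_attn = time.perf_counter() - t0

        t0 = time.perf_counter()
        lstm(x)        # recurrence: O(seq_len) sequential steps
        t_lstm = time.perf_counter() - t0

        print(f"L={seq_len}: attention {t_attn:.3f}s, LSTM {t_lstm:.3f}s")
```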

mahdiabdollahpour commented 3 years ago

> Hi, see here for a rough comparison (Table 1). In practice, RNN-based models are faster on longer sequences, since self-attention cost grows quadratically with sequence length while recurrence grows only linearly.

Thanks for the link to the paper. Did you do hyperparameter tuning for the LSTM and BiGRU benchmarks too (similar to the grid search in Table 7)?

maxjcohen commented 3 years ago

No, we could not tune the hyperparameters for any model other than the Transformer, but they seem very promising. You should be able to get similar or better results from the BiGRU and ConvGRU in particular if you spend a little time tuning them.
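
If anyone wants to try, here is a minimal grid-search sketch (assuming PyTorch; the search space is hypothetical rather than the grid from Table 7, and `train_and_validate` stands in for whatever training loop you already have):

```python
import itertools
import torch.nn as nn

# Hypothetical search space; the actual grid from the paper may differ.
grid = {
    "hidden_size": [32, 64, 128],
    "num_layers": [1, 2],
    "dropout": [0.0, 0.2],
}

best_loss, best_params = float("inf"), None
for values in itertools.product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    model = nn.LSTM(input_size=8, batch_first=True, **params)  # input_size is a placeholder
    loss = train_and_validate(model)  # hypothetical helper returning validation loss
    if loss < best_loss:
        best_loss, best_params = loss, params

print(best_params, best_loss)
```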

I did not have time to benchmark against a fully convolutional network (similar to WaveNet approaches), but I believe it would also yield very good results, at a limited computation cost.
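
As a starting point, a minimal sketch of a WaveNet-style stack of dilated causal convolutions (assuming PyTorch; channel counts and depth are arbitrary):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConvBlock(nn.Module):
    """Dilated causal 1-D convolution: left-pad the input so the
    output at time t only sees inputs up to time t (WaveNet-style)."""
    def __init__(self, channels, kernel_size=2, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):               # x: (batch, channels, time)
        x = F.pad(x, (self.pad, 0))     # pad on the left only
        return torch.relu(self.conv(x))

# Doubling the dilation each layer grows the receptive field
# exponentially (here 1 + 1 + 2 + 4 + 8 = 16 steps) at low cost.
net = nn.Sequential(*[CausalConvBlock(32, dilation=2 ** i) for i in range(4)])
out = net(torch.randn(8, 32, 1024))    # -> (8, 32, 1024)
```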