Seanny123 / da-rnn

Dual-Stage Attention-Based Recurrent Neural Net for Time Series Prediction

Does this model generalize well on your (other) dataset? #20

Closed tanqi-github closed 4 years ago

tanqi-github commented 4 years ago

Thank you very much for the implementation. I wonder whether anyone has applied this method to other datasets, and how it performed.

When I apply this method to my own datasets (a traffic dataset and a disease dataset), I run into two problems: 1. the loss is very large, i.e., the model cannot learn the pattern and does much worse than a vanilla LSTM, which is strange; 2. in some cases the validation loss drops quickly but then increases explosively (within 1-2 epochs).

I have tried min-max scaling and gradient-norm clipping to address the problem, but neither helps. Because the encoder attends over the whole input sequence, the window length T cannot be too large, which limits the information fed into the model. I still find this model hard to tune on other datasets. Does anyone have similar experience?
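For reference, here is a minimal sketch of the two mitigations mentioned above (min-max scaling of the inputs and clipping the global gradient norm) applied to a toy LSTM regressor standing in for the DA-RNN. This is not the repo's actual training loop; the model, data shapes, and `max_norm=1.0` are all illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def minmax_scale(x, eps=1e-8):
    # Scale each feature column to [0, 1]; return min/max so
    # predictions can be un-scaled later if needed.
    lo = x.min(dim=0, keepdim=True).values
    hi = x.max(dim=0, keepdim=True).values
    return (x - lo) / (hi - lo + eps), lo, hi

# Synthetic data: (batch, T, features). T is kept small because the
# encoder attends over the whole window.
x = torch.randn(64, 10, 3)
y = x[:, -1, 0:1]  # toy target: last-step value of the first feature

x_flat, lo, hi = minmax_scale(x.reshape(-1, 3))
x = x_flat.reshape(64, 10, 3)

# Toy stand-in for the DA-RNN encoder/decoder.
model = nn.LSTM(3, 16, batch_first=True)
head = nn.Linear(16, 1)
params = list(model.parameters()) + list(head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.MSELoss()

out, _ = model(x)
loss = loss_fn(head(out[:, -1]), y)
opt.zero_grad()
loss.backward()
# Clip the global gradient norm to damp the sudden loss explosions
# described above; returns the pre-clipping norm for monitoring.
total_norm = nn.utils.clip_grad_norm_(params, max_norm=1.0)
opt.step()
```

Watching `total_norm` over epochs is often informative: if it spikes right before the validation loss explodes, a smaller `max_norm` or a lower learning rate may help more than further input scaling.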

dan-r95 commented 4 years ago

Did you have success with your problem?