Open tcapelle opened 4 years ago
hi,
Can you give me some detail about "Have not been able to get reasonable results with the Attention layers"?
I'll try what you said and look into the effect of validation on the dataset. Thank you for your feedback.
best, zhou
Thomas Capelle notifications@github.com wrote on Fri, Jun 19, 2020, 9:29 PM:
I have been playing with your implementation of RNN2RNN, mixing it up with this one: https://github.com/fastai/course-nlp/blob/master/7b-seq2seq-attention-translation.ipynb. I have some remarks that have worked for me:
- Invert the Dense layer dropout/linear order: Lin(drop(x)).
- Apply the Dense layer to the hidden state of the encoder before sending it to the decoder.
- I have not been able to get reasonable results with the Attention layers; maybe I am doing something wrong.
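The first two remarks can be sketched in PyTorch. This is a minimal illustration of the suggested changes, not code from the repo; module and dimension names are assumed:

```python
import torch
import torch.nn as nn

class DenseHead(nn.Module):
    """Dense block with dropout applied *before* the linear layer,
    i.e. Lin(drop(x)), as suggested in the remark above."""
    def __init__(self, in_dim, out_dim, p=0.2):
        super().__init__()
        self.drop = nn.Dropout(p)
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        return self.lin(self.drop(x))

class HiddenBridge(nn.Module):
    """Project the encoder's final hidden state through a dense layer
    before handing it to the decoder (dimensions are hypothetical)."""
    def __init__(self, enc_dim, dec_dim):
        super().__init__()
        self.proj = nn.Linear(enc_dim, dec_dim)

    def forward(self, h):
        # h: (num_layers, batch, enc_dim) -> (num_layers, batch, dec_dim)
        return torch.tanh(self.proj(h))
```

The bridge also lets the encoder and decoder use different hidden sizes, which the plain hand-off does not.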
Actually, my model gets worse. But I don't really know what params to give the attention layers, or how many heads is reasonable.
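One constraint worth noting when picking the head count: with `nn.MultiheadAttention`, `embed_dim` must be divisible by `num_heads`. A minimal sketch with assumed sizes (a small head count is a common starting point for small time-series models):

```python
import torch
import torch.nn as nn

# embed_dim must be divisible by num_heads (64 / 4 = 16 per head).
# These sizes are hypothetical, chosen only for illustration.
embed_dim, num_heads = 64, 4
attn = nn.MultiheadAttention(embed_dim, num_heads)

# Default layout is (seq_len, batch, embed_dim).
dec_queries = torch.randn(10, 2, embed_dim)  # decoder time steps
enc_outputs = torch.randn(24, 2, embed_dim)  # encoder outputs
context, weights = attn(dec_queries, enc_outputs, enc_outputs)
# context has the same shape as the queries: (10, 2, 64)
```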
Can you send the train & test dataset to me? I will check it.
Not really; I am using sequences of images encoded by a CNN to generate a multi-variable time series.
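The setup described (per-frame CNN embeddings fed into an RNN that emits a multi-variable series) could look roughly like this. The architecture and all sizes are assumed from the description, not taken from the repo:

```python
import torch
import torch.nn as nn

class FramesToSeries(nn.Module):
    """Encode each image with a small CNN, then run a GRU over the
    per-frame embeddings to predict n_vars values per time step.
    All dimensions here are hypothetical."""
    def __init__(self, emb_dim=64, hidden=128, n_vars=3):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, emb_dim),
        )
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_vars)

    def forward(self, frames):
        # frames: (batch, time, 3, H, W)
        b, t = frames.shape[:2]
        emb = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        out, _ = self.rnn(emb)
        return self.head(out)  # (batch, time, n_vars)
```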