gabrielloye / Attention_Seq2seq-Translation


question about discrepancy with Pytorch NLP seq2seq tutorial #4

Open xiaolongwu0713 opened 3 years ago

xiaolongwu0713 commented 3 years ago

I have been studying attention recently, and I have some doubts about how the attention weights are calculated in the PyTorch NLP seq2seq tutorial: https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html.

In the tutorial, the attention scores (weights) are computed from the decoder's input and the decoder's hidden state. However, as far as I can tell, neither Luong nor Bahdanau does it that way; both compute the weights from the decoder hidden state and the ENCODER outputs. Why does the PyTorch tutorial do it that way? Is your implementation or PyTorch's the right one?
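For reference, here is a minimal sketch of the two scoring schemes being compared. The sizes, tensor names, and layer names (`attn`, `W_s`, `W_h`, `v`) are made up for illustration; the first part mirrors the tutorial's approach of scoring from the decoder embedding and hidden state, and the second part is standard Bahdanau-style additive scoring against the encoder outputs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

hidden_size, max_length = 256, 10  # hypothetical sizes

# Dummy tensors standing in for real decoder/encoder states.
embedded = torch.randn(1, 1, hidden_size)        # current decoder input embedding
hidden = torch.randn(1, 1, hidden_size)          # previous decoder hidden state
encoder_outputs = torch.randn(max_length, hidden_size)

# --- Tutorial-style scoring ---
# Weights come from the decoder's input embedding and its hidden state only;
# encoder_outputs are used *after* the softmax, to form the context via bmm.
attn = nn.Linear(hidden_size * 2, max_length)
weights_tutorial = F.softmax(
    attn(torch.cat((embedded[0], hidden[0]), dim=1)), dim=1)
context_tutorial = torch.bmm(weights_tutorial.unsqueeze(0),
                             encoder_outputs.unsqueeze(0))

# --- Bahdanau-style (additive) scoring ---
# Weights come from the decoder hidden state and each ENCODER output:
# score(s, h_i) = v^T tanh(W_s s + W_h h_i)
W_s = nn.Linear(hidden_size, hidden_size, bias=False)
W_h = nn.Linear(hidden_size, hidden_size, bias=False)
v = nn.Linear(hidden_size, 1, bias=False)

scores = v(torch.tanh(W_s(hidden[0]) + W_h(encoder_outputs)))  # (max_length, 1)
weights_bahdanau = F.softmax(scores.squeeze(1), dim=0)
context_bahdanau = weights_bahdanau @ encoder_outputs          # (hidden_size,)
```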