gabrielloye / Attention_Seq2seq-Translation


Discrepancy with the original Bahdanau paper on the self.weight parameter #5

Open xiaolongwu0713 opened 3 years ago

xiaolongwu0713 commented 3 years ago

Hi, I noticed that you calculate the alignment score by applying bmm between an nn.Parameter and tanh(encoder_output, decoder_hidden_state). However, as far as I can tell, the original paper does not need this extra nn.Parameter in the bmm. It says: [image]

So is there any reason for the multiplication?
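
For reference, the computation in question can be sketched in plain Python. This is only an illustration under stated assumptions, not the repository's code: the names `W`, `U`, `v`, `additive_features`, and `alignment_score` are hypothetical, `v` stands in for the `self.weight` parameter, and plain list arithmetic stands in for `bmm`. It assumes the additive form usually written as v^T tanh(W s + U h). The sketch shows that the tanh output is still a vector for each encoder position, and that the product with `v` is what reduces it to one scalar score per position.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def additive_features(W, U, decoder_hidden, encoder_output):
    """tanh(W s + U h): a vector with one entry per hidden unit."""
    ws = matvec(W, decoder_hidden)
    uh = matvec(U, encoder_output)
    return [math.tanh(a + b) for a, b in zip(ws, uh)]


def alignment_score(v, W, U, decoder_hidden, encoder_output):
    """v^T tanh(W s + U h): the dot product with v yields a single scalar."""
    features = additive_features(W, U, decoder_hidden, encoder_output)
    return sum(vi * fi for vi, fi in zip(v, features))


def softmax(scores):
    """Normalize scores into attention weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_vector(size):
    return [random.uniform(-1.0, 1.0) for _ in range(size)]


random.seed(0)
hidden_size = 4  # illustrative sizes, not taken from the repository
src_len = 3

W = [random_vector(hidden_size) for _ in range(hidden_size)]
U = [random_vector(hidden_size) for _ in range(hidden_size)]
v = random_vector(hidden_size)  # hypothetical stand-in for self.weight

decoder_hidden = random_vector(hidden_size)
encoder_outputs = [random_vector(hidden_size) for _ in range(src_len)]

# Without v: one vector of length hidden_size per encoder position.
features = additive_features(W, U, decoder_hidden, encoder_outputs[0])

# With v: one scalar per encoder position, ready for softmax.
scores = [alignment_score(v, W, U, decoder_hidden, h) for h in encoder_outputs]
weights = softmax(scores)

print(len(features), len(scores))
```

In this sketch, softmax over source positions needs exactly one number per position, which is the role the extra learnable vector plays here.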