keon / seq2seq

Minimal Seq2Seq model with Attention for Neural Machine Translation in PyTorch
MIT License

in model.py line 76: context = attn_weights.bmm(encoder_outputs.transpose(0, 1)) # (B,1,N) #16

Closed: Huijun-Cui closed this issue 3 years ago

Huijun-Cui commented 5 years ago

Is this correct? I looked up the explanation of bmm in the official documentation; it says that batch1 and batch2 must be 3-D tensors, each containing the same number of matrices. But as defined earlier, attn_weights has a 2-D shape, so I think there may be a mistake here.
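
For reference, a minimal sketch of the shape contract that torch.bmm enforces (the sizes below are illustrative, not taken from the repo):

```python
import torch

# torch.bmm requires two 3-D tensors: (B, n, m) @ (B, m, p) -> (B, n, p)
B, n, m, p = 4, 1, 10, 256
a = torch.randn(B, n, m)
b = torch.randn(B, m, p)
print(torch.bmm(a, b).shape)  # torch.Size([4, 1, 256])

# Passing a 2-D tensor as either argument raises a RuntimeError,
# which is the concern raised above.
```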

Hahallo commented 3 years ago

The line `return F.softmax(attn_energies, dim=1).unsqueeze(1)` is what makes attn_weights 3-D: the unsqueeze(1) inserts a middle dimension, turning the (B, T) softmax output into (B, 1, T) before it reaches the bmm call.
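
To make the shape flow concrete, here is a minimal sketch of the attention step (B, T, and H are illustrative sizes; the variable names follow model.py, but the tensors here are random stand-ins):

```python
import torch
import torch.nn.functional as F

B, T, H = 4, 10, 256  # illustrative: batch size, source length, hidden size

attn_energies = torch.randn(B, T)  # one raw attention score per source timestep
attn_weights = F.softmax(attn_energies, dim=1).unsqueeze(1)  # (B, T) -> (B, 1, T), now 3-D

encoder_outputs = torch.randn(T, B, H)  # sequence-first, as produced by the encoder RNN
# (B, 1, T) @ (B, T, H) -> (B, 1, H), so the bmm in model.py line 76 is valid
context = attn_weights.bmm(encoder_outputs.transpose(0, 1))
print(context.shape)  # torch.Size([4, 1, 256])
```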