I can't calculate the score of attention in Seq2Seq Translation.

spro / practical-pytorch

Go to https://github.com/pytorch/tutorials - this repo is deprecated and no longer maintained

MIT License

I can't calculate the score of attention in Seq2Seq Translation. #135

Closed beebrain closed 4 years ago

beebrain commented 5 years ago

I can't calculate the attention score; the code produces an error like this.

The implementation calculates the attention score step by step, but why does it use the dot operation instead of the mm or bmm operation? The shape of hidden is [1, 256] and the shape of energy is [1, 256], so we need to transpose hidden and multiply it with energy. If I am wrong, please suggest how to fix this.
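The error text itself is not preserved in this thread. A minimal sketch of one likely cause, assuming the failing call is a dot product between two 2-D tensors of the shapes quoted above (the random tensors here are stand-ins, not the tutorial's data):

```python
import torch

# Stand-in tensors with the shapes quoted in the post (assumption: random values).
hidden = torch.randn(1, 256)
energy = torch.randn(1, 256)

try:
    # torch.dot only accepts 1-D tensors, so 2-D inputs raise a RuntimeError.
    hidden.dot(energy)
    dot_failed = False
except RuntimeError as err:
    dot_failed = True
    print("dot on 2-D tensors failed:", err)
```

If this is the error in question, the inputs need to be flattened to 1-D, or the product expressed as a matrix multiplication.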

xiaolan98 commented 5 years ago

The forty-fourth line should be energy = hidden.view(-1).dot(energy.view(-1))

beebrain commented 5 years ago

@xiaolan98 Thank you, I fixed it with the mm operation. My code is "energy = hidden.mm(energy.transpose(0, 1))". Is it OK? I think that is the sum of the element-wise products.
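A quick way to check this is to compare the two suggestions from this thread on the same inputs. The sketch below assumes hidden and energy both have shape [1, 256] as stated earlier and uses random stand-in values; the names score_dot, score_mm, and score_sum are only for illustration.

```python
import torch

torch.manual_seed(0)  # fixed seed so the comparison is repeatable

# Stand-in tensors with the shapes quoted in the thread (assumption: random values).
hidden = torch.randn(1, 256)
energy = torch.randn(1, 256)

# Suggestion 1: flatten both tensors to 1-D, then take the dot product.
score_dot = hidden.view(-1).dot(energy.view(-1))   # 0-D tensor (scalar)

# Suggestion 2: [1, 256] times [256, 1] matrix product.
score_mm = hidden.mm(energy.transpose(0, 1))       # shape [1, 1]

# Explicit sum of element-wise products, for reference.
score_sum = (hidden * energy).sum()

print(score_dot.shape, score_mm.shape)
print(torch.allclose(score_dot, score_mm.squeeze(), atol=1e-4))
print(torch.allclose(score_dot, score_sum, atol=1e-4))
```

All three give the same value up to floating-point rounding, so the mm version does compute the sum of products. The one difference is the result shape: mm returns a [1, 1] tensor while dot returns a scalar, so code that consumes the score may need a squeeze() or view(-1) depending on what it expects.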