beebrain closed this issue 4 years ago
The forty-fourth line should be energy = hidden.view(-1).dot(energy.view(-1))
@xiaolan98 Thank you, I fixed it with the mm operation. My code is "energy = hidden.mm(energy.transpose(0, 1))". Is that OK? I think it also computes the sum of the element-wise products.
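For anyone comparing the two fixes, here is a minimal sketch, assuming both tensors have shape [1, 256] as described in this thread (the random values and the seed are only for illustration). It suggests that both expressions give the same number, and that the only difference is the shape of the result: dot returns a 0-dim tensor, while mm returns a [1, 1] tensor.

```python
import torch

torch.manual_seed(0)  # illustrative seed, not from the original code

# Shapes taken from the thread: both tensors are [1, 256].
hidden = torch.randn(1, 256)
energy = torch.randn(1, 256)

# Fix 1: flatten both to 1-D and take the dot product (0-dim result).
score_dot = hidden.view(-1).dot(energy.view(-1))

# Fix 2: matrix-multiply [1, 256] by [256, 1] (result has shape [1, 1]).
score_mm = hidden.mm(energy.transpose(0, 1))

print(score_dot.shape, score_mm.shape)
print(torch.allclose(score_dot, score_mm.squeeze(), atol=1e-4))
```

If the surrounding code expects a scalar, the mm result may need a .squeeze() or .view(-1) before it is used.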
I can't calculate the attention score; the code fails with an error like this.
The implementation tries to calculate the attention score step by step, but why do they not use the dot operation instead of the mm or bmm operation? The shape of hidden is [1, 256] and the shape of energy is [1, 256]. We need to transpose hidden and multiply it with energy. If I am wrong, please suggest how to fix it.
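A quick shape check may help with the question of which tensor to transpose. This is only a sketch, assuming the [1, 256] shapes quoted above and using all-ones tensors so the result is easy to verify by hand. Transposing hidden gives a [256, 256] outer product, while transposing energy gives the single [1, 1] score; bmm works too, but only after adding a batch dimension.

```python
import torch

# All-ones tensors with the shapes quoted above, so the score is 256.
hidden = torch.ones(1, 256)
energy = torch.ones(1, 256)

# Transposing hidden: [256, 1] x [1, 256] -> [256, 256] outer product.
outer = hidden.transpose(0, 1).mm(energy)

# Transposing energy: [1, 256] x [256, 1] -> [1, 1], the single score.
inner = hidden.mm(energy.transpose(0, 1))

# bmm needs 3-D inputs: [1, 1, 256] x [1, 256, 1] -> [1, 1, 1].
batched = torch.bmm(hidden.unsqueeze(0),
                    energy.transpose(0, 1).unsqueeze(0))

print(outer.shape, inner.shape, batched.shape)
print(inner.item(), batched.item())
```

So for a single attention score, the transpose would go on energy (or both tensors can be flattened and passed to dot).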