Closed bxcxa closed 3 years ago
This module is based on the NAML paper.
```python
class AdditiveAttention(torch.nn.Module):
    """
    A general additive attention module.
    Originally for NAML.
    """
```
It accepts a list of candidate vectors and outputs a weighted combination of them, i.e. $s = \alpha_1 s_1 + \alpha_2 s_2 + \cdots + \alpha_n s_n$. In other words, the input shape is $(l, d)$ and the output shape is $(d)$.
As for the alignment scores computed by the formula you posted, I believe they are the same thing as $\alpha_1, \alpha_2, \dots, \alpha_n$: just the weights for the linear combination.
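To make this concrete, here is a minimal sketch of how such an additive attention module could be implemented. The hyperparameter names (`query_vector_dim`, `candidate_vector_dim`) and the exact scoring function (`tanh` over a linear projection, dotted with a learned query vector) are assumptions for illustration, not necessarily the repo's exact implementation:

```python
import torch

class AdditiveAttention(torch.nn.Module):
    """
    Sketch of additive attention: scores each candidate vector,
    softmaxes the scores into weights alpha_i, and returns the
    weighted sum s = alpha_1 s_1 + ... + alpha_n s_n.
    """
    def __init__(self, query_vector_dim, candidate_vector_dim):
        super().__init__()
        # Project each candidate into the query space
        self.linear = torch.nn.Linear(candidate_vector_dim, query_vector_dim)
        # Learned query vector used to score candidates
        self.query = torch.nn.Parameter(torch.randn(query_vector_dim))

    def forward(self, candidates):
        # candidates: (batch, l, d) -> scores: (batch, l)
        scores = torch.tanh(self.linear(candidates)) @ self.query
        # Normalize scores into attention weights alpha_i
        alpha = torch.softmax(scores, dim=-1)
        # Weighted combination of candidates: (batch, d)
        return torch.bmm(alpha.unsqueeze(1), candidates).squeeze(1)
```

So the module takes $l$ vectors of dimension $d$ and returns a single $d$-dimensional vector, matching the input/output shapes described above.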
Wow, I assumed you were a foreigner; turns out we went to the same school 😂
Because I found elsewhere that additive attention is used to score the alignment between two words, but here it just accepts one vector and feeds it through a feed-forward network. Could you explain this in more detail?