Junjieli0704 closed this issue 6 years ago
Thanks for your interest in this repo!
In my current implementation all source sentences in a batch have the same length, so there is no need to apply masking on source sentences :) Nevertheless, I implemented the `dot_prod_attention` function in a rather general fashion to allow for source masking, so you can easily modify the codebase to support batches with source sentences of different lengths.
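To illustrate, here is a minimal sketch of what such a masked dot-product attention function might look like in PyTorch. This is not the repo's exact code, just an assumed implementation of the same idea: the optional `mask` marks padding positions, which are set to `-inf` before the softmax so they receive (near-)zero attention weight.

```python
import torch
import torch.nn.functional as F

def dot_prod_attention(h_t, src_encoding, src_encoding_att_linear, mask=None):
    """Dot-product attention with optional source masking (illustrative sketch).

    h_t: (batch, hidden) decoder state
    src_encoding: (batch, src_len, hidden) encoder outputs
    src_encoding_att_linear: (batch, src_len, hidden) linearly projected encoder outputs
    mask: (batch, src_len) bool tensor, True at padding positions
    """
    # Unnormalized attention scores: batched dot product, shape (batch, src_len)
    att_weight = torch.bmm(src_encoding_att_linear, h_t.unsqueeze(2)).squeeze(2)
    if mask is not None:
        # Padded positions get -inf, so softmax assigns them zero weight
        att_weight = att_weight.masked_fill(mask, -float('inf'))
    att_weight = F.softmax(att_weight, dim=-1)
    # Context vector: weighted sum of encoder outputs, shape (batch, hidden)
    ctx_vec = torch.bmm(att_weight.unsqueeze(1), src_encoding).squeeze(1)
    return ctx_vec, att_weight
```

With equal-length sentences in every batch, calling this with `mask=None` is safe; with variable lengths, you would build the mask from the source lengths and pass it in.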
Sorry for the confusion!
Thank you very much!
Hi, this is a nice repository for learning NMT and seq2seq models! I have a question about masking when computing attention values.
The function in nmt.py that computes attention values is: `def dot_prod_attention(self, h_t, src_encoding, src_encoding_att_linear, mask=None)`
When you call `dot_prod_attention` to compute attention values, you always use the default mask value (`None`). Shouldn't the mask be set when computing attention values?
Thank you !