Kyubyong / transformer

A TensorFlow Implementation of the Transformer: Attention Is All You Need
Apache License 2.0
4.28k stars, 1.3k forks

Why were key masking and query masking removed? #155

Open user683 opened 4 years ago

user683 commented 4 years ago

Does removing them have no influence on the results?

GuoshenLi commented 3 years ago

The author did not remove the key masking, only the query masking. Removing it has no influence on the result, since the padded positions are masked out of the final output in the loss calculation.
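
To illustrate the point, here is a minimal pure-Python sketch, not the repository's code. It assumes a single self-attention layer on a toy three-token sequence whose last token is padding, and it uses a squared-error loss as a stand-in for the real cross-entropy loss. The names `attention`, `masked_loss`, and `is_pad` are hypothetical. Key masking is always applied; query masking is optional.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v, key_is_pad, query_is_pad=None):
    """Scaled dot-product attention with key masking and optional query masking."""
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        # Key masking: padded keys get a very negative score, so their weight is ~0.
        scores = [
            -1e9 if key_is_pad[j]
            else sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d)
            for j, kj in enumerate(k)
        ]
        w = softmax(scores)
        row = [sum(w[j] * v[j][c] for j in range(len(v))) for c in range(len(v[0]))]
        # Query masking (optional): zero the output rows of padded queries.
        if query_is_pad is not None and query_is_pad[i]:
            row = [0.0] * len(row)
        out.append(row)
    return out

def masked_loss(pred, target, is_pad):
    """Squared error averaged over non-padded positions only (stand-in loss)."""
    total, count = 0.0, 0
    for p, t, pad in zip(pred, target, is_pad):
        if pad:
            continue  # padded positions never contribute to the loss
        total += sum((a - b) ** 2 for a, b in zip(p, t))
        count += 1
    return total / count

# Toy data (assumed): three tokens, the last one is padding.
x = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
target = [[0.5, 0.5], [0.5, 0.5], [0.0, 0.0]]
is_pad = [False, False, True]

with_qm = attention(x, x, x, key_is_pad=is_pad, query_is_pad=is_pad)
without_qm = attention(x, x, x, key_is_pad=is_pad)

loss_with = masked_loss(with_qm, target, is_pad)
loss_without = masked_loss(without_qm, target, is_pad)
print(loss_with == loss_without)
```

In this sketch the two variants differ only in the output row of the padded query. That row is skipped by the loss, and with key masking in place it would also be ignored as a key by any later attention layer, so the non-padded outputs and the loss are unchanged.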