fchollet / deep-learning-with-python-notebooks

Jupyter notebooks for the code samples of the book "Deep Learning with Python"

Mask for `TransformerDecoder` in the end-to-end Transformer (chapter11_part04_sequence-to-sequence-learning.ipynb) #209

Open balvisio opened 2 years ago

balvisio commented 2 years ago

In the chapter11_part04_sequence-to-sequence-learning.ipynb notebook, the `TransformerDecoder` receives the mask produced by the `PositionalEmbedding` layer applied to the target (decoder) sequence:

```python
x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)
x = TransformerDecoder(embed_dim, dense_dim, num_heads)(x, encoder_outputs)
```
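
As a quick sanity check (this is not code from the notebook; `MaskProbe` is a throwaway layer I wrote with the same `(inputs, encoder_outputs)` call signature), tf.keras does indeed hand the layer the padding mask computed from its first positional argument, i.e. the target sequence:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Probe layer: mimics the decoder's (inputs, encoder_outputs) signature and
# just prints whatever mask Keras propagates to it.
class MaskProbe(layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.supports_masking = True

    def call(self, inputs, encoder_outputs, mask=None):
        if mask is not None:
            tf.print("mask received by the decoder-like layer:", mask)
        return inputs

source = keras.Input(shape=(None,), dtype="int64")
target = keras.Input(shape=(None,), dtype="int64")
source_emb = layers.Embedding(100, 8, mask_zero=True)(source)
target_emb = layers.Embedding(100, 8, mask_zero=True)(target)
outputs = MaskProbe()(target_emb, source_emb)
model = keras.Model([source, target], outputs)

# Source and target are padded in different places, so the printed mask tells
# us which sequence it came from: it matches the target pattern [[1 1 1 0]].
model.predict([np.array([[5, 6, 0, 0]]), np.array([[7, 8, 9, 0]])])
```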

Shouldn't the mask used when attending over the encoder outputs be the one computed from the source sequence?

For example, I have seen that this TF tutorial uses the mask from the source sequence for that attention step.
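
In other words, I would have expected the cross-attention step to mask out the padded positions of the source, along these lines (purely illustrative; the tensors, shapes, and names below are made up by me, not taken from the notebook or the tutorial):

```python
import tensorflow as tf
from tensorflow.keras import layers

# In cross-attention the keys/values are the encoder outputs, so I would
# expect the attention_mask to be built from the *source* padding mask.
batch, src_len, tgt_len, embed_dim, num_heads = 1, 4, 3, 8, 2

encoder_token_ids = tf.constant([[5, 6, 0, 0]])         # source with 2 padded positions
source_mask = tf.math.not_equal(encoder_token_ids, 0)   # (batch, src_len), True on real tokens

encoder_outputs = tf.random.normal((batch, src_len, embed_dim))  # stand-in for the encoder output
decoder_state = tf.random.normal((batch, tgt_len, embed_dim))    # stand-in for the decoder self-attention output

# Broadcast the source mask over the query (target) axis: shape (batch, tgt_len, src_len),
# so every target position ignores the padded source positions.
cross_attention_mask = tf.repeat(source_mask[:, tf.newaxis, :], repeats=tgt_len, axis=1)

cross_attention = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
out = cross_attention(
    query=decoder_state,
    value=encoder_outputs,
    key=encoder_outputs,
    attention_mask=cross_attention_mask,
)
print(out.shape)  # (1, 3, 8)
```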

Any clarification would be greatly appreciated.