In the chapter11_part04_sequence-to-sequence-learning.ipynb notebook, the TransformerDecoder receives the mask from the PositionalEmbedding layer of the target sequence:
x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)
x = TransformerDecoder(embed_dim, dense_dim, num_heads)(x, encoder_outputs)
Shouldn’t the mask be the one created from encoding the source sequence?
For example, in this TF tutorial the mask from the source sequence is used instead.
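To make the question concrete, here is a minimal NumPy sketch of which padding mask applies where. The toy data and the `causal_mask`/`padding_mask` helpers are purely illustrative, not the notebook's actual API:

```python
import numpy as np

def causal_mask(n):
    # Lower-triangular matrix: position i may attend to positions <= i.
    return np.tril(np.ones((n, n), dtype=bool))

def padding_mask(token_ids):
    # True where a real token is present, False where the id is padding (0).
    return token_ids != 0

# Hypothetical toy batch: source and target sequences padded with 0.
source = np.array([3, 7, 0, 0])   # source sequence ids
target = np.array([5, 2, 9, 0])   # target sequence ids

# Decoder self-attention: the causal mask combined with the *target*
# padding mask -- this is the mask the notebook's wiring provides.
self_attn_mask = causal_mask(4) & padding_mask(target)[None, :]

# Cross-attention: queries come from the target, keys/values from the
# encoder output, so the relevant padding mask would be the *source*
# one -- which is what the TF tutorial appears to use there.
cross_attn_mask = padding_mask(target)[:, None] & padding_mask(source)[None, :]
```

The two masks differ exactly at the padded source positions, which is the crux of the question.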
Any clarification would be greatly appreciated.