In PyTorch's official `nn.TransformerEncoder`, there is a parameter called `src_key_padding_mask`, which represents the mask for the source keys in each batch (optional). Does the `x_transformers` library offer a similar optional masking method, specifically designed to mask only the keys? I have defined the network structure above; then I want to use it as:

Which mask should I use?
@LutherLin yup of course. In this repository, it is simply `mask`. For the key padding mask, the accepted shape is `(batch, seq)`, where `True` denotes attend and `False` denotes do not attend. In your example:
padding_mask # (batch, seq) - (50, 32)
xseq # (batch, seq, feature dimension) - (50, 32, 384)
All my repositories adopt batch-first.