OFA-Sys / OFA

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Apache License 2.0

dimensions of encoder_attention_mask in inputs of OFADecoder #384

Closed SilyRab closed 1 year ago

SilyRab commented 1 year ago

This annotation may be wrong: `encoder_attention_mask (torch.Tensor of shape (bsz, seq_len)): the padding mask of the source side.` Does it actually refer to a tensor of shape `(bsz, 1, tgt_len, src_len)`?

logicwong commented 1 year ago

@SilyRab Do you mean `encoder_padding_mask`? If so, its shape is `(bsz, seq_len)`, as shown here
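The two shapes in this thread are consistent: attention implementations commonly keep the padding mask as `(bsz, src_len)` and only broadcast it to `(bsz, 1, tgt_len, src_len)` when applying it to the attention scores. The sketch below (a hypothetical illustration in numpy, not the actual OFA code, where `expand_padding_mask` is an assumed helper name) shows that broadcast:

```python
import numpy as np

def expand_padding_mask(padding_mask, tgt_len):
    """Broadcast a (bsz, src_len) padding mask to (bsz, 1, tgt_len, src_len).

    The singleton dim broadcasts over attention heads; each target
    position sees the same source-side padding pattern.
    """
    bsz, src_len = padding_mask.shape
    # Insert head and target-length axes, then broadcast over tgt_len.
    return np.broadcast_to(
        padding_mask[:, None, None, :], (bsz, 1, tgt_len, src_len)
    )

# Example: batch of 2, source length 3; True marks padding tokens.
mask = np.array([[False, False, True],
                 [False, True,  True]])
expanded = expand_padding_mask(mask, tgt_len=4)
print(expanded.shape)  # (2, 1, 4, 3)
```

So a docstring saying the mask has shape `(bsz, seq_len)` and attention code consuming a `(bsz, 1, tgt_len, src_len)` tensor can both be describing the same mask at different stages.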