facebookresearch / SymbolicMathematics

Deep Learning for Symbolic Mathematics

About the attn_mask #19

Open mingliangzhang2018 opened 3 years ago

mingliangzhang2018 commented 3 years ago

Excuse me, could you tell me why the attn_mask for Masked Multi-Head Attention uses only the sequence (causal) mask but ignores the padding mask? To my knowledge, a padding mask is necessary to eliminate the effect of padding.
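For context, the question concerns combining the two masks. A minimal sketch of what a combined mask could look like (this is a hypothetical illustration in PyTorch, not the repository's own `get_masks` code; the function name and shapes are assumptions):

```python
import torch

def combined_attn_mask(lengths, slen):
    """Combine a causal (sequence) mask with a padding mask.

    lengths: LongTensor (bs,) of true sequence lengths.
    slen:    padded (maximum) sequence length.
    Returns a BoolTensor (bs, slen, slen) where entry [b, q, k] is True
    iff query position q may attend to key position k in batch element b.
    """
    alen = torch.arange(slen)
    # causal mask: query q may only see keys k <= q        -> (slen, slen)
    causal = alen[None, :] <= alen[:, None]
    # padding mask: key k is a real (non-pad) token         -> (bs, slen)
    pad = alen[None, :] < lengths[:, None]
    # combined: both conditions must hold                   -> (bs, slen, slen)
    return causal[None, :, :] & pad[:, None, :]

# two sequences of true lengths 3 and 2, right-padded to length 4
mask = combined_attn_mask(torch.tensor([3, 2]), 4)
```

Note that with right-padded batches, for any real query position (q < length) the causal constraint k <= q already excludes the trailing padding, so the padding mask only changes the rows of padded query positions, whose outputs are typically excluded from the loss anyway.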