Hi, I found an interesting phenomenon while using your MSA Transformer model.
In the LearnedPositionalEmbedding class, you add padding_idx to the computed positions (it acts like an offset). Why is padding_idx added here, and is this addition necessary? Thank you!
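For context, here is a minimal sketch of the position computation I am referring to (simplified and paraphrased from the class; the tensor values are just my illustrative example):

```python
import torch

# Simplified from LearnedPositionalEmbedding.forward (my paraphrase):
# positions are derived from a padding mask, then offset by padding_idx.
padding_idx = 1
tokens = torch.tensor([[5, 6, 7, padding_idx, padding_idx]])  # [bsz x seqlen]

mask = tokens.ne(padding_idx).int()  # 1 for real tokens, 0 for padding
positions = (torch.cumsum(mask, dim=1) * mask) + padding_idx
print(positions)  # tensor([[2, 3, 4, 1, 1]])
# Real tokens get positions starting at padding_idx + 1,
# while padding tokens are all mapped to padding_idx itself.
```

As far as I can tell, the offset makes real positions start at padding_idx + 1, so they never collide with the padding index, but I would like to confirm the intent.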