ofirpress / attention_with_linear_biases

Code for the ALiBi method for transformer language models (ICLR 2022)
MIT License

Implementation of ALiBi #19

Closed DreamShibei closed 7 months ago

DreamShibei commented 7 months ago

Hi! Thanks for your excellent work! The paper describes subtracting a relative-position penalty from the attention matrix, but in the code, self.alibi is added to the attention matrix. What is the reason for this?

DreamShibei commented 7 months ago

I found the answer here: https://nn.labml.ai/transformers/alibi/index.html
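
For anyone else landing here, a minimal sketch of why the two forms agree (not the repo's exact code; `seq_len`, the slope `m`, and the tensors below are illustrative). A bias of `m * j` that depends only on the key position differs from the paper's `-m * (i - j)` penalty by `m * i`, a constant per query row, and softmax over the keys cancels per-row constants:

```python
import torch

torch.manual_seed(0)
seq_len, m = 5, 0.5                      # one head, illustrative slope m
scores = torch.randn(seq_len, seq_len)   # raw q.k attention scores
causal = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)

i = torch.arange(seq_len).unsqueeze(1)   # query positions
j = torch.arange(seq_len).unsqueeze(0)   # key positions

# Paper's view: subtract a penalty proportional to the distance i - j.
paper = scores - m * (i - j).float()

# Code's view: add a bias that grows with the key position j; it differs
# from the paper's form only by m * i, a constant per row.
code = scores + m * j.float()

paper_attn = torch.softmax(paper.masked_fill(causal, float("-inf")), dim=-1)
code_attn = torch.softmax(code.masked_fill(causal, float("-inf")), dim=-1)
print(torch.allclose(paper_attn, code_attn))  # True
```

So adding self.alibi and subtracting the paper's penalty produce identical attention weights after the causal softmax.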