ofirpress / attention_with_linear_biases

Code for the ALiBi method for transformer language models (ICLR 2022)
MIT License
497 stars · 38 forks

Integration with `transformers` #6

Closed sayakpaul closed 2 years ago

sayakpaul commented 2 years ago

Amazing work! I'm sure it will open doors for researchers to think about ways to extrapolate better at inference time.

I am interested to know whether you are aware of any integrations of ALiBi with Hugging Face's `transformers` library.

ofirpress commented 2 years ago

Thanks!

I'm not aware of any correct integrations with the HF `transformers` library yet. However, once the BigScience large language model completes training, it will be made available through Hugging Face, and since that model uses ALiBi, its release will require integrating ALiBi into the `transformers` library.

You can read more about it here: https://bigscience.notion.site/BigScience-176B-Model-ad073ca07cdf479398d5f95d88e218c4
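
For readers wondering what such an integration involves, here is a minimal PyTorch sketch of ALiBi's core computation: per-head slopes following the geometric sequence from the paper, and a linear distance penalty added to the attention logits before the softmax. The function names are illustrative, not from this repo or from `transformers`, and the slope formula assumes the head count is a power of 2 (the paper describes an interpolation scheme for other counts).

```python
import torch

def alibi_slopes(n_heads: int) -> torch.Tensor:
    # Slopes from the ALiBi paper: 2^(-8i/n) for i = 1..n.
    # For n_heads = 8 this gives 1/2, 1/4, ..., 1/256.
    # (Assumes n_heads is a power of 2.)
    start = 2.0 ** (-8.0 / n_heads)
    return torch.tensor([start ** (i + 1) for i in range(n_heads)])

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    # Bias added to attention logits before softmax:
    # bias[h, i, j] = -slope[h] * (i - j) for key position j <= query position i.
    pos = torch.arange(seq_len)
    distance = pos[None, :] - pos[:, None]   # (seq_len, seq_len), equals j - i
    # Broadcast slopes over the distance matrix -> (n_heads, seq_len, seq_len)
    return alibi_slopes(n_heads)[:, None, None] * distance

# Usage: given attention logits of shape (n_heads, seq_len, seq_len),
# i.e. q @ k.transpose(-2, -1) / sqrt(head_dim):
#   logits = logits + alibi_bias(n_heads, seq_len)
# then apply the causal mask and softmax as usual; no positional
# embeddings are added to the token inputs.
```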