huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Flex attention + refactor #34809

Open ArthurZucker opened 3 days ago

ArthurZucker commented 3 days ago

Opening this to add support for all models following #34282

Let's bring support for flex attention to more models! 🤗

It would be great to add support for more architectures, such as

... and many more

For anyone who wants to contribute: just open a PR, link it to this issue, and ping me for a review!! 🤗

dame-cell commented 3 days ago

@ArthurZucker I'll try to open a PR for some architectures, maybe Llama and Gemma

farrosalferro commented 3 days ago

I will do CLIP and Llama

OmarManzoor commented 2 days ago

@ArthurZucker I would like to try out Mistral

mayankagarwals commented 2 days ago

Picking this up for gpt2 and moshi!

dame-cell commented 11 hours ago

> Picking this up for gpt2 and moshi

@mayankagarwals hey there, since Moshi actually copies some code from Gemma, is it okay if I handle it?