Open ArthurZucker opened 3 days ago
@ArthurZucker I'll try to open a PR for some architectures, maybe Llama and Gemma.
I will do CLIP and Llama.
@ArthurZucker I would like to try out Mistral
Picking this up for gpt2 and moshi!
@mayankagarwals Hey there, since moshi actually copies some code from Gemma, is it OK if I handle it?
Opening this to add support for all models following #34282
Let's bring support for flex attention to more models! 🤗
It would be great to add support for more architectures, such as
... and many more
For anyone who wants to contribute: just open a PR, link it to this issue, and ping me for a review! 🤗