Flash Attention Support

kadirnar / whisper-plus

WhisperPlus: Faster, Smarter, and More Capable 🚀

Apache License 2.0

1.73k stars, 137 forks

Closed (amrrs closed this 7 months ago)

amrrs commented 7 months ago

Hey! Great work.

I think the latest code change to SpeechToTextPipeline expects all GPUs to be Flash Attention 2 compatible.

I'm not sure if there's any way to override the kwargs.

I ran it on a P100 on Kaggle and got an error about Flash Attention.

kadirnar commented 7 months ago

Can you share the error message?

kadirnar commented 7 months ago

I added flash-attention2 as a parameter so you can turn it off. See the README.
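
For anyone hitting the same error on older cards: below is a minimal sketch of a guard that picks an attention backend from the GPU's CUDA compute capability. It is not whisper-plus code. The helper names `pick_attn_implementation` and `detect_capability` are hypothetical, and the sketch assumes FlashAttention-2 needs an Ampere or newer GPU (compute capability 8.0 and up), which is why a P100 (Pascal, 6.0) is rejected. It also assumes the `flash-attn` package is installed whenever `"flash_attention_2"` is chosen.

```python
# Sketch: choose an attention backend from the CUDA compute capability.
# Assumption: FlashAttention-2 requires compute capability >= 8.0 (Ampere+).
# A P100 reports (6, 0), so it falls back to PyTorch's SDPA attention.


def pick_attn_implementation(capability):
    """Return an attention backend name for a (major, minor) capability.

    `capability` is None when no CUDA device is available.
    """
    if capability is not None and capability[0] >= 8:
        return "flash_attention_2"
    return "sdpa"


def detect_capability():
    """Best-effort lookup via PyTorch; None if torch or CUDA is missing."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # Returns a (major, minor) tuple for device 0.
    return torch.cuda.get_device_capability(0)


# Usage: backend = pick_attn_implementation(detect_capability())
```

In recent versions of transformers the returned string could be passed as `attn_implementation=` to `from_pretrained`. How whisper-plus exposes the switch is described in its README and is not shown here.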