kadirnar / whisper-plus

WhisperPlus: Faster, Smarter, and More Capable 🚀
Apache License 2.0
1.67k stars, 133 forks

Flash Attention Support #83

Closed amrrs closed 4 months ago

amrrs commented 4 months ago

Hey! Great work.

I think the latest code change to the SpeechToTextPipeline expects all GPUs to be Flash Attention 2 compatible.

I'm not sure if there's any way to override the kwargs.

https://github.com/kadirnar/whisper-plus/blob/487bfa05572a04eb39af260eb3197533ddcdcb0d/whisperplus/pipelines/whisper.py#L72C13-L72C58

I used it on a P100 from Kaggle and got an error about Flash Attention.

kadirnar commented 4 months ago

Can you share the error message?

kadirnar commented 4 months ago

I added flash-attention2 as a parameter so you can turn it off. See the README for details.
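For anyone hitting the same error on older GPUs, here is a minimal sketch of choosing an attention backend from the GPU's compute capability. It is not the whisper-plus implementation: the helper name `pick_attn_implementation` is hypothetical, and it assumes the underlying model is loaded through Hugging Face transformers, which accepts `attn_implementation` values such as `"flash_attention_2"`, `"sdpa"`, and `"eager"`. FlashAttention 2 targets Ampere or newer NVIDIA GPUs (compute capability 8.0 and up), while a P100 is Pascal (6.0).

```python
from typing import Optional, Tuple

# FlashAttention 2 targets Ampere-or-newer NVIDIA GPUs (compute capability >= 8.0).
MIN_FLASH_ATTN2_CAPABILITY = (8, 0)


def pick_attn_implementation(capability: Optional[Tuple[int, int]]) -> str:
    """Hypothetical helper: return a value for transformers' `attn_implementation`.

    `capability` is the (major, minor) CUDA compute capability, or None on CPU.
    """
    if capability is not None and tuple(capability) >= MIN_FLASH_ATTN2_CAPABILITY:
        return "flash_attention_2"
    # PyTorch scaled-dot-product attention; usable on older GPUs and on CPU.
    return "sdpa"


# A P100 reports compute capability (6, 0); an A100 reports (8, 0).
print(pick_attn_implementation((6, 0)))  # sdpa
print(pick_attn_implementation((8, 0)))  # flash_attention_2
```

With PyTorch installed, the tuple can come from `torch.cuda.get_device_capability()` when `torch.cuda.is_available()` is true. The sketch does not check whether the `flash_attn` package is installed, and `"sdpa"` may need a reasonably recent transformers release; `"eager"` is the more conservative fallback. The actual whisper-plus parameter is described in the README.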