thomasmol / cog-whisper-diarization

Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote
https://replicate.com/thomasmol/whisper-diarization
165 stars, 51 forks

[FEATURE REQUEST] Flash Attention 2 #5

Closed: pointia closed this issue 7 months ago

pointia commented 7 months ago

Hi,

Have you ever thought about implementing Flash Attention 2 as well, as this repo does?

https://github.com/Vaibhavs10/insanely-fast-whisper

thomasmol commented 7 months ago

Yes, I have. However, the repo you're linking to uses the Hugging Face API for Whisper. Unfortunately, the Hugging Face API does not (yet) provide some features, such as an input for initial_prompt and returning the detected language. Once it does, I will probably switch, since it is much faster.

NickNaskida commented 1 month ago

Actually, I believe it is possible, judging from these links:

https://huggingface.co/openai/whisper-large-v3/discussions/128#666af9e88b12eb564be39cdb
https://huggingface.co/distil-whisper/distil-large-v3/discussions/6#666aed28491c9ff1e783d7de
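For anyone landing here, a rough sketch of what passing an initial prompt through the Hugging Face pipeline might look like, with Flash Attention 2 enabled. This is not code from this repo. The function names `build_generate_kwargs` and `transcribe_with_prompt` are made up for illustration, and it assumes a recent `transformers` release, a CUDA GPU, and the `flash-attn` package installed. The prompt is tokenized with the tokenizer's `get_prompt_ids` and forwarded to generation as `prompt_ids`.

```python
from typing import Any, Dict, Optional


def build_generate_kwargs(prompt_ids: Optional[Any] = None,
                          language: Optional[str] = None) -> Dict[str, Any]:
    """Collect only the optional arguments that were actually supplied.

    Hypothetical helper: the result is meant to be passed as
    `generate_kwargs` to the ASR pipeline.
    """
    kwargs: Dict[str, Any] = {}
    if prompt_ids is not None:
        kwargs["prompt_ids"] = prompt_ids
    if language is not None:
        kwargs["language"] = language
    return kwargs


def transcribe_with_prompt(audio_path: str,
                           initial_prompt: str,
                           model_id: str = "openai/whisper-large-v3") -> str:
    """Transcribe one file, conditioning Whisper on `initial_prompt`.

    Assumes a CUDA device and an installed flash-attn package.
    """
    # Imported lazily so the helper above is usable without torch/transformers.
    import torch
    from transformers import pipeline

    pipe = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        torch_dtype=torch.float16,
        device="cuda:0",
        # Ask Transformers to use Flash Attention 2 for the model.
        model_kwargs={"attn_implementation": "flash_attention_2"},
    )

    # Turn the prompt text into token ids that generate() accepts.
    prompt_ids = pipe.tokenizer.get_prompt_ids(
        initial_prompt, return_tensors="pt"
    ).to(pipe.device)

    result = pipe(audio_path, generate_kwargs=build_generate_kwargs(prompt_ids))
    return result["text"]
```

Whether the prompt behaves exactly like `initial_prompt` in the original Whisper package (for example across long-form chunks) is something I have not verified, so treat this as a starting point.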

NickNaskida commented 1 month ago

Btw, same for language detection:

https://huggingface.co/openai/whisper-large-v3/discussions/15#6554e55dffc7b280f57faf78
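A minimal sketch of one way this could work, assuming the generated sequence starts with `<|startoftranscript|>` followed by a language token such as `<|en|>` when no language is forced. The names `language_from_decoded` and `detect_language_sketch` are hypothetical, and the exact output layout can differ between `transformers` versions, so the parser returns `None` when no language token is found.

```python
import re
from typing import Optional

# Whisper language tokens look like <|en|> or <|yue|> and directly follow
# the start-of-transcript token in the decoded output.
_LANG_TOKEN = re.compile(r"<\|startoftranscript\|>\s*<\|([a-z]{2,3})\|>")


def language_from_decoded(decoded: str) -> Optional[str]:
    """Extract the language code from text decoded with special tokens kept."""
    match = _LANG_TOKEN.search(decoded)
    return match.group(1) if match else None


def detect_language_sketch(audio_array,
                           sampling_rate: int = 16000,
                           model_id: str = "openai/whisper-large-v3") -> Optional[str]:
    """Generate a few tokens and read the language token the model picked."""
    # Imported lazily so the parser above works without transformers installed.
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(model_id)
    model = WhisperForConditionalGeneration.from_pretrained(model_id)

    inputs = processor(audio_array, sampling_rate=sampling_rate, return_tensors="pt")
    # Only the leading special tokens are needed, so keep generation short.
    generated = model.generate(inputs.input_features, max_new_tokens=4)
    decoded = processor.batch_decode(generated, skip_special_tokens=False)[0]
    return language_from_decoded(decoded)
```

The parsing step can be checked on its own: `language_from_decoded("<|startoftranscript|><|en|><|transcribe|>")` returns `"en"`.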