huggingface / distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
MIT License
3.32k stars 238 forks

Best way to implement streaming application? #89

Open 9throok opened 4 months ago

9throok commented 4 months ago

Hi @sanchit-gandhi, I worked on and around Whisper and its applications for quite some time. Although it's been a while since I last touched this topic, there has been so much development since distil-whisper and V3 were announced.

I come here after visiting resources, forums, discussions and threads like these:

If, let's say, I were to build something like this, what would be the best way to do it, considering only short audio chunks?

It would be a great help if you could share your insights on it.