saharmor / whisper-playground

Build real time speech2text web apps using OpenAI's Whisper https://openai.com/blog/whisper/
MIT License
777 stars 140 forks

Fix hallucinations + bugfixes #40

Closed ethanzrd closed 1 year ago

ethanzrd commented 1 year ago

Added voice activity detection (VAD) so that only chunks containing speech are added to the queue.
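As a rough illustration of the gating idea, here is a minimal sketch. It uses a simple RMS energy threshold as a stand-in for the VAD model, which is an assumption made only to keep the example self-contained. The names `rms`, `contains_speech`, `enqueue_speech`, and the `threshold` value are hypothetical and are not taken from this PR.

```python
import math
from collections import deque


def rms(chunk):
    """Root-mean-square energy of a chunk of float samples in [-1.0, 1.0]."""
    if not chunk:
        return 0.0
    return math.sqrt(sum(s * s for s in chunk) / len(chunk))


def contains_speech(chunk, threshold=0.01):
    # Stand-in for a real VAD model: treat any chunk whose energy
    # reaches the (hypothetical) threshold as speech.
    return rms(chunk) >= threshold


def enqueue_speech(chunks, queue, threshold=0.01):
    """Append only speech-containing chunks to the queue; return how many were kept."""
    kept = 0
    for chunk in chunks:
        if contains_speech(chunk, threshold):
            queue.append(chunk)
            kept += 1
    return kept
```

A real VAD model would replace `contains_speech`; the point of the sketch is only that silent chunks are dropped before they reach the queue.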

Note: with a 5-second transcription timeout, transcription is triggered after every 5 seconds of detected speech, not after every 5 seconds of elapsed time, since chunks without speech do not enter the queue.
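The timing behavior in the note can be sketched as follows. This is an assumption-laden illustration, not the code from this PR: `SpeechTimeoutTrigger`, `push`, and `timeout_s` are hypothetical names, and the sketch assumes each chunk arrives with a known duration and a speech/no-speech flag from the VAD.

```python
class SpeechTimeoutTrigger:
    """Fires once enough speech (not wall-clock time) has accumulated."""

    def __init__(self, timeout_s=5.0):
        self.timeout_s = timeout_s
        self.speech_s = 0.0  # seconds of speech accumulated since the last trigger

    def push(self, chunk_duration_s, is_speech):
        """Record one chunk; return True if a transcription should be triggered."""
        if not is_speech:
            # Silent chunks are dropped and do not advance the timer.
            return False
        self.speech_s += chunk_duration_s
        if self.speech_s >= self.timeout_s:
            self.speech_s = 0.0
            return True
        return False
```

For example, with 1-second chunks that alternate between speech and silence, the trigger first fires on the ninth chunk, after 5 seconds of speech, rather than on the fifth:

```python
trigger = SpeechTimeoutTrigger(timeout_s=5.0)
fires = [trigger.push(1.0, is_speech=(i % 2 == 0)) for i in range(10)]
```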

Fixed end-of-stream behavior for the sequential mode.