ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

[Discussion - Improvements] - Real-time (or near-real-time) transcription in the browser with React #1976

Open avie41 opened 7 months ago

avie41 commented 7 months ago

Hello @ggerganov

I managed to reuse the code from the stream example and integrate it into a React application using Vite.js.

Keeping the basic implementation, adapted to TypeScript, I get a latency of about 1.5 to 2 seconds on average.

However, the example appears to use a fairly basic audio chunking strategy that could be improved.

Additional context:

At the moment, my application uses vosk-browser, which plugs into an audio streamer. I would like to switch to Whisper for its superior transcription quality, and to optimize my implementation as much as possible to get closer to real time with whisper.cpp.
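
To make the chunking question concrete, here is a minimal sketch of one possible alternative: a sliding window with overlap, so each chunk carries some audio context from the previous one. This is not the code from the stream example. `OverlappingChunker`, `windowSize`, and `overlap` are hypothetical names introduced for illustration, and passing each window to Whisper is left to the caller.

```typescript
// Hypothetical helper, not part of whisper.cpp: buffers incoming PCM samples
// and emits fixed-length windows that overlap the previous window.
class OverlappingChunker {
  private buffer: number[] = [];
  private readonly windowSize: number; // samples per emitted window
  private readonly overlap: number;    // samples shared with the previous window

  constructor(windowSize: number, overlap: number) {
    if (windowSize <= 0 || overlap < 0 || overlap >= windowSize) {
      throw new Error("require windowSize > 0 and 0 <= overlap < windowSize");
    }
    this.windowSize = windowSize;
    this.overlap = overlap;
  }

  // Append new samples and return every full window that is now available.
  push(samples: Float32Array): Float32Array[] {
    for (let i = 0; i < samples.length; i++) {
      this.buffer.push(samples[i]);
    }
    const step = this.windowSize - this.overlap;
    const windows: Float32Array[] = [];
    while (this.buffer.length >= this.windowSize) {
      windows.push(Float32Array.from(this.buffer.slice(0, this.windowSize)));
      // Drop only `step` samples so the last `overlap` samples are reused.
      this.buffer.splice(0, step);
    }
    return windows;
  }
}
```

As an example, assuming 16 kHz mono input, `new OverlappingChunker(48000, 8000)` would emit 3-second windows that share 0.5 seconds with the previous window. A smaller window lowers latency but gives the model less context, so the right values would need to be measured.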

qxprakash commented 7 months ago

Hi @avie41, can you share the code of your implementation? I want to get streaming working in the whisper.cpp server.