ricky0123 / vad

Voice activity detector (VAD) for the browser with a simple API
https://www.vad.ricky0123.com

Get audio data in real-time #122

Open t41372 opened 4 months ago

t41372 commented 4 months ago

Is there a way to get the audio data while speech is still active, before it ends? I want to start receiving audio when speech starts, stream it to the back end in real time, and stop streaming when speech ends. It seems the onFrameProcessed callback only exposes a probability property. Thanks.
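One possible way to approach this is sketched below. It is not the library's API: `SpeechFrameGate`, `StreamSink`, `GateOptions`, and all threshold values are hypothetical names and numbers chosen for illustration. It assumes you can obtain each raw audio frame together with its speech probability; since onFrameProcessed appears to expose only the probability, the source of the frames (for example, your own Web Audio pipeline) is an assumption here. The gate keeps a short pre-roll so the start of an utterance is not clipped, forwards frames while speech is active, and closes the stream after a run of low-probability frames.

```typescript
// Hypothetical helper, NOT part of @ricky0123/vad: forwards raw audio frames
// to a sink only while speech is judged active from per-frame probabilities.
type Frame = Float32Array;

interface StreamSink {
  start(): void;             // speech began
  chunk(frame: Frame): void; // one frame of audio during speech
  end(): void;               // speech ended
}

interface GateOptions {
  startThreshold: number; // probability at or above which speech starts (assumed value)
  endThreshold: number;   // probability below which a frame counts as silence
  preRollFrames: number;  // frames kept from just before the speech start
  hangoverFrames: number; // consecutive silent frames required to end speech
}

class SpeechFrameGate {
  private sink: StreamSink;
  private opts: GateOptions;
  private preRoll: Frame[] = [];
  private speaking = false;
  private silentRun = 0;

  constructor(sink: StreamSink, opts: GateOptions) {
    this.sink = sink;
    this.opts = opts;
  }

  // Call once per frame with the frame's speech probability.
  push(frame: Frame, speechProbability: number): void {
    if (!this.speaking) {
      if (speechProbability >= this.opts.startThreshold) {
        // Speech starts: flush the buffered pre-roll, then the current frame.
        this.speaking = true;
        this.silentRun = 0;
        this.sink.start();
        for (const buffered of this.preRoll) this.sink.chunk(buffered);
        this.preRoll = [];
        this.sink.chunk(frame);
      } else {
        // Not speaking yet: remember only the most recent frames.
        this.preRoll.push(frame);
        while (this.preRoll.length > this.opts.preRollFrames) this.preRoll.shift();
      }
      return;
    }

    // Speaking: forward every frame, including short pauses.
    this.sink.chunk(frame);
    if (speechProbability < this.opts.endThreshold) {
      this.silentRun += 1;
      if (this.silentRun >= this.opts.hangoverFrames) {
        this.speaking = false;
        this.silentRun = 0;
        this.sink.end();
      }
    } else {
      this.silentRun = 0;
    }
  }
}
```

In a browser, the sink could wrap a WebSocket, sending `frame.buffer` from `chunk` and small marker messages from `start` and `end`. The hangover count trades latency against the risk of splitting an utterance at a brief pause.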

rahulbansal16 commented 3 months ago

Which model will you use on the back end to transcribe it? Does the word error rate increase when transcribing that way?

JettScythe commented 1 month ago

Bumping, as this is exactly what I need. I already have Whisper instances available for transcription/translation on the back end, but reducing response latency means getting chunks transcribed as they arrive. I suspect a reasonable "chunk" is one sentence.