KoljaB / RealtimeSTT

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
MIT License
2.09k stars 190 forks

the on_realtime_transcription_update text issue #69

Closed: jacobtang closed this issue 3 months ago

jacobtang commented 5 months ago

Hi @KoljaB, I send the text from on_realtime_transcription_update (I also tried on_realtime_transcription_stabilized) to a web client, but every update shows the previously transcribed words again. Is there another method that avoids showing the repeated earlier words? Thanks! [image] I'm trying to implement a live transcription feature similar to Zoom Meeting's in our product.

KoljaB commented 5 months ago

No, that is due to the way Whisper works and how it is therefore implemented. You'd need to program this on top of that.
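One way to "program this on top" is sketched below. It assumes each realtime callback delivers the full text of the current utterance as a string, and it sends the client only what changed since the previous update. `RealtimeDelta`, `common_prefix_len`, and `apply_delta` are hypothetical names for illustration and are not part of RealtimeSTT.

```python
def common_prefix_len(old_words, new_words):
    """Count how many leading words the two word lists share."""
    n = 0
    for a, b in zip(old_words, new_words):
        if a != b:
            break
        n += 1
    return n


class RealtimeDelta:
    """Hypothetical helper: turns full-text updates into small deltas."""

    def __init__(self):
        self._sent = []  # words the client is currently showing

    def update(self, text):
        """Return which words to keep and which to append on the client."""
        words = text.split()
        keep = common_prefix_len(self._sent, words)
        delta = {"keep": keep, "append": words[keep:]}
        self._sent = words
        return delta

    def reset(self):
        """Call when an utterance is finished and a new one starts."""
        self._sent = []


def apply_delta(shown_words, delta):
    """Client-side counterpart: drop the revised tail, then append."""
    return shown_words[:delta["keep"]] + delta["append"]
```

A callback could then forward `tracker.update(text)` to the web client instead of the raw text. Because Whisper may revise earlier words between updates, the delta carries a `keep` count rather than only the appended words, so the client can replace a tail that changed.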

jacobtang commented 5 months ago

OK, thanks a lot! I'm using the data from the process_text callback, and it takes about 3 s to receive the text. Is that normal? How can I reduce the latency? I'm using recorder_config and the recorder.feed_audio(audio_chunk) method in a GPU environment. [image]

KoljaB commented 5 months ago

Absolutely not normal. Even with the largest model, or on CPU, transcription time should be well below 1 s. I have no real idea why this happens. Can you please try another model, like small.en or medium? Also, a separate test using only the faster_whisper library could tell us whether it's the transcription itself or maybe something related to the VAD models.
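A separate faster_whisper timing test along those lines might look like the sketch below. The file name `sample.wav`, the helper names, and the `float16` compute type are assumptions for illustration. The one detail that matters is that `WhisperModel.transcribe` returns a lazy generator of segments, so the timer has to cover the iteration as well as the call.

```python
import time


def time_transcription(transcribe, audio_path):
    """Run transcribe(audio_path), consume all segments, return (text, seconds)."""
    start = time.perf_counter()
    segments, _info = transcribe(audio_path)
    # Decoding happens while the generator is iterated, so join inside the timer.
    text = " ".join(seg.text.strip() for seg in segments)
    elapsed = time.perf_counter() - start
    return text, elapsed


def benchmark_faster_whisper(audio_path="sample.wav", model_name="small.en",
                             device="cuda", compute_type="float16"):
    """Time faster_whisper alone, without RealtimeSTT or its VAD models."""
    # Imported here so the helper above can be used without faster_whisper.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_name, device=device, compute_type=compute_type)
    # The first run includes one-time warm-up cost, so it is discarded.
    time_transcription(model.transcribe, audio_path)
    text, elapsed = time_transcription(model.transcribe, audio_path)
    print(f"{model_name} on {device}: {elapsed:.2f}s -> {text!r}")
    return elapsed
```

Calling `benchmark_faster_whisper("your_clip.wav")` with a short recording, and again with `model_name="medium"`, gives numbers to compare against the roughly 3 s seen through the recorder. If the standalone time is well under a second, the delay more likely comes from somewhere other than the transcription step.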