Open powellnorma opened 1 month ago
@powellnorma Thanks for using the library. I think you make a good point, we can bring this feature in an upcoming release.
I've just got my custom fast-whisper model working on a Docker server and am looking at where I can implement this myself. I haven't changed the volume threshold settings for VAD yet, so I get a lot of junk tokens. With slow whisper I implemented a blacklist for phrases like "Thank you", "Thanks very much", etc. that get thrown out by the model. I think I can see where to look in transcribe() in transcriber.py to select phrases and expose them, but the process seems expensive, so I might need to look further.
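A minimal sketch of the kind of phrase blacklist described above, assuming you have access to the segment text before it is emitted. The function names and phrase list here are illustrative, not part of the WhisperLive API:

```python
# Hypothetical sketch: drop segments that consist only of common
# hallucinated filler phrases. Not part of the WhisperLive API.

JUNK_PHRASES = {
    "thank you",
    "thanks very much",
    "thank you for watching",
}

def is_junk(text: str) -> bool:
    """Return True if the segment is just a blacklisted filler phrase."""
    normalized = text.strip().lower().rstrip(".!")
    return normalized in JUNK_PHRASES

def filter_segments(segments: list[str]) -> list[str]:
    """Keep only segments that do not match the blacklist."""
    return [s for s in segments if not is_junk(s)]
```

Exact-match filtering like this is cheap compared to re-running the model, so it could run on every segment without noticeable cost.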
Looking at the code, I don't see how the library user is supposed to access the transcribed text? It looks like it just gets printed?
https://github.com/collabora/WhisperLive/blob/e1a42c22d2de65303ec34f54805ade0e84a80d09/whisper_live/client.py#L123
I think a workaround would be to read the `output.srt` file - but maybe we could also just return the transcribed text as a string?
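As a sketch of that workaround, the `output.srt` file could be parsed back into a plain string. This assumes the standard SRT layout (index line, timestamp line, then text lines, with blocks separated by blank lines); `srt_to_text` is a hypothetical helper, not something the library provides:

```python
# Hypothetical workaround sketch: recover the transcript as a plain
# string by parsing the output.srt file the client writes.

def srt_to_text(srt_content: str) -> str:
    """Join the text lines of an SRT document into one string."""
    lines = []
    for block in srt_content.strip().split("\n\n"):
        parts = block.splitlines()
        # parts[0] is the index, parts[1] the timestamps; the rest is text
        if len(parts) >= 3:
            lines.extend(parts[2:])
    return " ".join(lines)
```

Returning the text directly from the client would of course be cleaner than round-tripping through the SRT file.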