Just realized that the current client does not record and send chunks.
This is a useful and much-needed feature. I need to think about how to integrate chunk recording into the API, and will then provide a JS client.
Also, it currently uses a PyAudio input stream, which will be changed to a socket-fed queue or something similar. A rough sketch of that idea is below.
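Just to make the socket-fed variant easier to picture, here is a minimal sketch of how the PyAudio input stream could be replaced by a TCP socket that pushes raw audio chunks onto a queue. The host, port, and chunk size are made up for illustration, and the consumer loop only stands in for wherever the recorder currently reads microphone data.

```python
import queue
import socket
import threading

audio_chunks = queue.Queue()

def socket_receiver(host="0.0.0.0", port=9001, chunk_size=1024):
    """Accept one client and push its raw PCM chunks onto the queue."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((host, port))
        srv.listen(1)
        conn, _ = srv.accept()
        with conn:
            while True:
                chunk = conn.recv(chunk_size)
                if not chunk:  # client closed the connection
                    break
                audio_chunks.put(chunk)

threading.Thread(target=socket_receiver, daemon=True).start()

# Wherever the PyAudio callback used to deliver data, pull from the queue instead:
while True:
    chunk = audio_chunks.get()
    # hand `chunk` to the recorder / VAD pipeline here
```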
Available now; please check this example with the new v0.1.8 version.
Thanks, I was able to modify it and I use it frequently. However, I have a question: when using it for multiple users, how should I handle the recorder thread? Will we have multiple threads, or a unique ID to distinguish which user each piece of speech belongs to?
Depends on what you want to achieve. Handling multiple user inputs in parallel will not be easy, especially if you also want real-time transcription. First you'd need to change RealtimeSTT for this: the processing is not designed for multiple incoming audio chunk feeds. You would need to create a worker thread for every feed. While a user talks, the server needs to do voice activity detection and transcription, which needs VRAM and causes load on the GPU. So you'd either need to load balance VAD and transcription somehow, or you'd need large amounts of VRAM and GPU power on the server to handle that.
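To make the "one worker per feed" idea concrete, here is a rough sketch of how per-user worker threads could look. The `UserWorker` class, the `recorder_factory`, and the `process_chunk` call are all hypothetical placeholders, not the actual RealtimeSTT API, and the sketch ignores the GPU/VRAM contention described above.

```python
import queue
import threading

class UserWorker:
    """One worker per audio feed: owns its own queue and (hypothetical) recorder."""

    def __init__(self, user_id, recorder_factory):
        self.user_id = user_id
        self.chunks = queue.Queue()           # audio chunks for this user only
        self.recorder = recorder_factory()    # would wrap VAD + transcription
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def feed(self, chunk):
        """Called by the network layer whenever a chunk arrives for this user."""
        self.chunks.put(chunk)

    def _run(self):
        while True:
            chunk = self.chunks.get()
            if chunk is None:                 # sentinel to stop the worker
                break
            text = self.recorder.process_chunk(chunk)  # placeholder call
            if text:
                print(f"[{self.user_id}] {text}")

# One worker per connected user, keyed by a unique id:
workers = {}

def on_audio(user_id, chunk, recorder_factory):
    if user_id not in workers:
        workers[user_id] = UserWorker(user_id, recorder_factory)
    workers[user_id].feed(chunk)
```

Each worker still needs its own share of VAD and transcription resources, so in practice the number of parallel feeds is bounded by the GPU memory and compute available on the server.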
I was just trying to create a web app and wanted to modify this so it can be used from a web app, e.g. from JS. Is there a sample?