Closed by iak-a-dev 4 days ago
Hi @iak-a-dev, it might be possible if you are creative, although it wouldn't be very straightforward. You could set `redemptionFrames` to the shortest pause length you're interested in and have the `onSpeechEnd` callback start a timer that alerts you when the next pause duration you're interested in has elapsed (the timer can be interrupted by the `onSpeechStart` callback). But if that gets too complex or buggy, you may be better off creating your own solution. You may also want to consider streaming audio to your server via WebSocket or WebRTC.
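A minimal sketch of that timer idea, in case it helps. `createPauseTimers`, `PauseAction`, `Scheduler`, and `extraMs` are hypothetical names made up for illustration, not part of this package. It assumes `onSpeechEnd` fires once the shortest pause (the `redemptionFrames` window) has already elapsed, so each `extraMs` is counted on top of that.

```typescript
// Hypothetical helper, not part of the VAD package.
// Each action fires `extraMs` after onSpeechEnd, unless speech resumes first.
type PauseAction = { extraMs: number; run: () => void };

// The scheduler is injectable so the logic can be exercised without real time.
interface Scheduler {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

const realScheduler: Scheduler = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

function createPauseTimers(
  actions: PauseAction[],
  scheduler: Scheduler = realScheduler,
) {
  let pending: unknown[] = [];

  // Cancel every timer that has been scheduled and not yet cleared.
  const cancelAll = (): void => {
    pending.forEach((handle) => scheduler.clear(handle));
    pending = [];
  };

  return {
    // Call from the onSpeechEnd callback: the shortest pause has been detected,
    // so start one timer per longer pause duration of interest.
    onSpeechEnd: (): void => {
      cancelAll();
      pending = actions.map((a) => scheduler.set(a.run, a.extraMs));
    },
    // Call from the onSpeechStart callback: the speaker resumed, so the
    // longer pauses never happened.
    onSpeechStart: cancelAll,
  };
}
```

You would pass the returned `onSpeechStart` and `onSpeechEnd` handlers wherever you configure the package's callbacks of the same names. Keeping the timers in a separate helper means the pause logic can be tested without a microphone.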
@ricky0123 I had the same idea myself but decided to ask in case there was a better solution. Unfortunately, the standard Whisper does not support streaming audio, so I have to experiment. Thanks for your answer!
I need to understand if this package can be used for voice activity detection in my project, where I want to trigger different actions based on varying pause lengths. Is it possible to achieve this functionality with this package, and if so, how can I set up different actions for different pause durations?
My goal is to use the VAD to identify short pauses and, based on them, cut the audio stream into pieces so that transcription can start before the speech is finished.