alphacep / vosk-android-demo

Offline speech recognition for Android with Vosk library.
Apache License 2.0

Getting the VAD timing #148

Open rafiuddinkhan opened 3 years ago

rafiuddinkhan commented 3 years ago

The Android build of the Vosk API is working great for speech to text.

I would like to get the start time and end time of each speech buffer for chunking, if there is any way to do that.

Currently we get the start time and end time in seconds for each word predicted by Vosk, but if a word is mispredicted, its start time and end time will be wrong as well.

Thanks,

nshmyrev commented 3 years ago

We don't have a separate VAD; you can only get word times.
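
For anyone who wants chunk boundaries from the word times alone, here is a minimal sketch in plain Java. It assumes word timestamps are enabled on the recognizer (`setWords(true)`) and that the `start`/`end` values in seconds have already been parsed out of the `result` array of the JSON output. `WordTimeChunker`, `Word`, `Segment`, and the `maxGapSeconds` threshold are hypothetical names for illustration, not part of Vosk. Since the boundaries come from recognized words, they are still wrong whenever the recognition is wrong.

```java
import java.util.ArrayList;
import java.util.List;

public final class WordTimeChunker {

    /** One recognized word with its start and end time in seconds. */
    public static final class Word {
        public final String text;
        public final double start;
        public final double end;

        public Word(String text, double start, double end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** A speech segment spanning one or more consecutive words. */
    public static final class Segment {
        public final double start;
        public final double end;

        public Segment(double start, double end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Groups time-ordered words into segments. A new segment starts
     * whenever the pause between the current segment's end and the
     * next word's start is longer than maxGapSeconds.
     */
    public static List<Segment> chunk(List<Word> words, double maxGapSeconds) {
        List<Segment> segments = new ArrayList<>();
        if (words.isEmpty()) {
            return segments;
        }
        double segStart = words.get(0).start;
        double segEnd = words.get(0).end;
        for (int i = 1; i < words.size(); i++) {
            Word w = words.get(i);
            if (w.start - segEnd > maxGapSeconds) {
                // Pause is long enough: close the current segment.
                segments.add(new Segment(segStart, segEnd));
                segStart = w.start;
                segEnd = w.end;
            } else {
                // Word continues the current segment.
                segEnd = Math.max(segEnd, w.end);
            }
        }
        segments.add(new Segment(segStart, segEnd));
        return segments;
    }
}
```

The gap threshold is a tuning choice: a small value splits on short pauses, a larger one keeps whole sentences together.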

rafiuddinkhan commented 3 years ago

@nshmyrev VAD with noise adaptation would really improve accuracy, as the current STT model predicts some unknown words when there is noise in the background.

This VAD, which runs on mobile, has potential: https://github.com/SIP-Lab/CNN-VAD

Is there any way to add punctuation or post-process the Vosk output to get properly formatted sentences?
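
To illustrate what "noise adaptation" could mean in the simplest case, here is a rough sketch of an energy gate with a running noise-floor estimate. This is not CNN-VAD and not anything Vosk provides. It is a much cruder technique, and `EnergyGate`, `ratio`, `adaptRate`, and the minimum floor of 1.0 are all assumptions chosen for illustration.

```java
public final class EnergyGate {
    private static final double MIN_FLOOR = 1.0; // keeps the floor from collapsing to zero

    private final double ratio;     // frame is speech when RMS > ratio * noise floor
    private final double adaptRate; // how quickly the floor follows non-speech frames (0..1)
    private double noiseFloor;

    public EnergyGate(double initialNoiseFloor, double ratio, double adaptRate) {
        this.noiseFloor = Math.max(MIN_FLOOR, initialNoiseFloor);
        this.ratio = ratio;
        this.adaptRate = adaptRate;
    }

    /** Root-mean-square amplitude of one frame of 16-bit PCM samples. */
    public static double rms(short[] frame) {
        if (frame.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (short s : frame) {
            sum += (double) s * s;
        }
        return Math.sqrt(sum / frame.length);
    }

    /**
     * Classifies one frame. Frames judged to be non-speech update the
     * noise floor with an exponential moving average, so the gate
     * slowly adapts to a changing background level.
     */
    public boolean isSpeech(short[] frame) {
        double level = rms(frame);
        boolean speech = level > ratio * noiseFloor;
        if (!speech) {
            noiseFloor = Math.max(MIN_FLOOR, noiseFloor + adaptRate * (level - noiseFloor));
        }
        return speech;
    }

    public double noiseFloor() {
        return noiseFloor;
    }
}
```

Frame timing follows from the frame index: with fixed-size frames, frame `i` starts at `i * frameLength / sampleRate` seconds, so the first and last speech frames of a run give approximate start and end times that do not depend on what the recognizer predicted.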