Is it possible to flush and get the final result quicker?

aj3423 commented 1 year ago

When testing with a .wav file, I send its content to the Docker image "alphacep/kaldi-en" to get the result. The problem is that even though the .wav is only 0.7 seconds long, it takes 2 seconds to get the final result.

With another file, test16k.wav, which is 8 seconds long, it takes 2.46 seconds. (Files attached.)

Files of different durations have similar recognition times, so I guess the 2.x seconds are spent not on recognition but on waiting for further voice input.

I tried sending {"eof" : 1} right after the .wav data, but it changes nothing. My code looks like this:

// Send the full .wav binary data to the Docker websocket.
check(ws.WriteMessage(websocket.BinaryMessage, wav_binary))
// Signal the end of the audio stream.
check(ws.WriteMessage(websocket.TextMessage, []byte(`{"eof" : 1}`)))

_, msg, err := ws.ReadMessage() // <---- this takes 2+ seconds
check(err)

I want to write an application that starts voice recognition when the user presses the space key and stops when the user releases it. So when the key is released, I want to send a signal to Vosk telling it to stop waiting for further data and return the current result.

Is it possible? Thanks.

Environment: Linux, Docker, GoLang
CPU: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
Attachment: two_wav_file.zip
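
For the push-to-talk flow described above, one possible shape is to send the {"eof" : 1} message on key release and then keep reading replies until one looks final. The sketch below covers only the reply check. It assumes, without confirmation from this thread, that interim replies carry a "partial" key and final replies carry a "text" key; finalText and eofMessage are hypothetical names, and the strings in main stand in for messages returned by ws.ReadMessage().

```go
package main

import (
	"encoding/json"
	"fmt"
)

// eofMessage is the end-of-stream signal quoted in the thread.
const eofMessage = `{"eof" : 1}`

// finalText reports whether msg looks like a final result and returns its text.
// Assumption: final replies contain a "text" key and interim replies contain
// "partial" instead. Anything that is not valid JSON is treated as not final.
func finalText(msg []byte) (string, bool) {
	var reply map[string]json.RawMessage
	if err := json.Unmarshal(msg, &reply); err != nil {
		return "", false
	}
	raw, ok := reply["text"]
	if !ok {
		return "", false
	}
	var text string
	if err := json.Unmarshal(raw, &text); err != nil {
		return "", false
	}
	return text, true
}

func main() {
	// Stand-ins for replies read after sending eofMessage on key release.
	replies := []string{`{"partial" : "hello"}`, `{"text" : "hello world"}`}
	for _, r := range replies {
		if text, ok := finalText([]byte(r)); ok {
			fmt.Println("final:", text)
			break
		}
	}
}
```

In a real client the loop body would call ws.ReadMessage() instead of ranging over a slice, so a single read that returns an interim reply is not mistaken for the final one.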

nshmyrev commented 1 year ago

Do you open a new connection for this short 0.7-second file? There might be some initialization delays on start; we are working on them.

aj3423 commented 1 year ago

Sorry, my bad. I allocated a large buffer and sent too many trailing bytes, which caused the long recognition time.
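
The cause described above (sending a whole oversized buffer instead of only the bytes that hold audio) can be avoided by slicing the buffer to the number of bytes actually filled before sending. This is a minimal sketch, not the poster's code: sendAudio, its chunkSize parameter, and the send callback are hypothetical, and the callback stands in for a call such as ws.WriteMessage(websocket.BinaryMessage, chunk).

```go
package main

import (
	"errors"
	"fmt"
)

// sendAudio forwards only the first n bytes of buf, in chunks of at most
// chunkSize bytes. Bytes past n are unused buffer capacity and are never sent,
// so the recognizer does not have to process trailing padding.
func sendAudio(buf []byte, n, chunkSize int, send func([]byte) error) error {
	if n < 0 || n > len(buf) {
		return errors.New("n is outside the buffer")
	}
	if chunkSize <= 0 {
		return errors.New("chunkSize must be positive")
	}
	for off := 0; off < n; off += chunkSize {
		end := off + chunkSize
		if end > n {
			end = n
		}
		if err := send(buf[off:end]); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	buf := make([]byte, 1<<20)                          // oversized buffer
	n := copy(buf, []byte("pretend this is PCM audio")) // stand-in for a file read
	total := 0
	err := sendAudio(buf, n, 8, func(chunk []byte) error {
		total += len(chunk) // real code would write the chunk to the websocket
		return nil
	})
	fmt.Println("bytes sent:", total, "error:", err)
}
```

Here n plays the role of the byte count returned by the file or microphone read; passing len(buf) instead would reproduce the trailing-bytes problem.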

alphacep / vosk-server

Is it possible to flush and get the final result quicker? #199