alphacep / vosk-server

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Apache License 2.0

Error while running the python version of server and client #132

Open lucgeo opened 3 years ago

lucgeo commented 3 years ago

Hello,

I'm using the Python version of the Vosk server inside the latest Docker container (image id: c98d01ef2758; sha256: 90bf6aaffa5dacea742f8891b73d7c5e61f5ae383fb9b019dfa2c50be43217f9). I'm calling the Python client from a bash script that loops through a list of audio files I want to decode. Everything works well and I receive transcripts for some of my audio files, but at a certain point the server crashes with the following error:

LOG (VoskAPI:CompileLooped():nnet-compile-looped.cc:345) Spent 0.00443506 seconds in looped compilation.
WARNING (VoskAPI:Compose():compose-lattice-pruned.cc:942) Composed lattice has no states: something went wrong.
WARNING (VoskAPI:AlignLattice():word-align-lattice.cc:310) Trying to word-align empty lattice.
ASSERTION_FAILED (VoskAPI:CompactLatticeStateTimes():lattice-functions.cc:114) Assertion failed: (lat.Start() == 0)
Aborted (core dumped)

This is the first time the Vosk server has crashed for me. I used a slightly older version in the past with the same audio files and everything was fine. The other new thing is that I'm now using RNN rescoring. Any ideas, please?

Thanks, Lucian

nshmyrev commented 3 years ago

Maybe you can find and share the problematic file
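
One way to isolate such a file is to run the client on each file separately and note which runs fail. The following is only a minimal sketch under stated assumptions: `find_failing_files` is a hypothetical helper, not part of vosk-server, and it assumes the client command takes the audio path as its last argument and exits with a non-zero status when the server closes the connection.

```python
import subprocess
import sys


def find_failing_files(client_cmd, audio_files, timeout=300):
    """Run the client once per audio file and return the files whose run failed.

    client_cmd is the command prefix as a list (hypothetical example:
    ["python3", "test.py"]); each audio path is appended as the last argument.
    A non-zero exit status or a timeout is treated as a failure (assumption).
    """
    failed = []
    for path in audio_files:
        try:
            result = subprocess.run(
                [*client_cmd, str(path)],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            ok = result.returncode == 0
        except subprocess.TimeoutExpired:
            ok = False
        if not ok:
            # Report on stderr so the list of failures is visible while looping.
            print(f"FAILED: {path}", file=sys.stderr)
            failed.append(str(path))
    return failed
```

Since the server process aborts on the crash, every file after the first failure will probably fail as well unless the container is restarted, so the first entry in the returned list is the one worth sharing.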

lucgeo commented 3 years ago

Hi,

I identified the file that causes the problem and I'm attaching it here. On the client side I receive several partial results until the stream gets closed on the server side. Could it be because the speaker makes no pauses in this 28-second audio file?

Update: There is no problem if I perform only decoding, without rescoring.

Thanks!