alphacep / vosk-server

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Apache License 2.0

With three clients running at the same time, the kaldi-en-gpu container stops. #178

Open zxl777 opened 2 years ago

zxl777 commented 2 years ago

docker run --gpus=all -p 2700:2700 alphacep/kaldi-en-gpu

With only one client running, the test completes successfully after 4 minutes:

cd /root/vosk-server/websocket && time ./test.py test2.wav

With three clients running at the same time, an error is reported after 1 minute:

cd /root/vosk-server/websocket && time ./test.py test2.wav
cd /root/vosk-server/websocket && time ./test.py test2.wav
cd /root/vosk-server/websocket && time ./test.py test2.wav

docker logs

INFO:root:Connection from ('172.17.0.1', 45382)
INFO:root:Config {'sample_rate': 44100}
INFO:root:Connection from ('172.17.0.1', 45406)
INFO:root:Config {'sample_rate': 44100}
INFO:root:Connection from ('172.17.0.1', 45410)
INFO:root:Config {'sample_rate': 44100}
INFO:root:Connection from ('172.17.0.1', 45414)
INFO:root:Config {'sample_rate': 44100}
WARNING ([5.5.1027~1-59386]:OutputArcForce():word-align-lattice.cc:577) Invalid word at end of lattice [partial lattice, forced out?]
WARNING ([5.5.1027~1-59386]:OutputSilenceArc():word-align-lattice.cc:366) Phone changed before final transition-id found [broken lattice or mismatched model or wrong --reorder option?]
WARNING ([5.5.1027~1-59386]:OutputSilenceArc():word-align-lattice.cc:366) Phone changed before final transition-id found [broken lattice or mismatched model or wrong --reorder option?]
zxl777 commented 2 years ago

Looking forward to someone finally fixing this.

cdgraff commented 1 year ago

The problem is that all the audio streams use the same WebSocket endpoint, and that does not work. You need to modify the path so that each audio stream uses its own path. When you send multiple audio streams to the same path, it breaks the timing logic of the audio flow, and that is the reason for the stop.

The example script is only prepared to run one audio file at a time. If you need to process multiple audio files, you need to modify the example.
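
For anyone who wants to try this suggestion, here is a rough sketch of what a modified client might look like. It assumes the same message flow as the bundled test.py (a JSON config message, binary audio chunks, then an eof message) and the third-party `websockets` package. The per-stream path names (`/stream0`, `/stream1`, ...) and the helpers `stream_uris`, `transcribe` and `main` are made up for illustration; whether the server actually treats different paths differently is an assumption taken from the comment above, not something confirmed in this thread.

```python
import asyncio
import json
import wave


def stream_uris(base_uri, count):
    """Return one distinct URI per audio stream (hypothetical path scheme)."""
    return [f"{base_uri.rstrip('/')}/stream{i}" for i in range(count)]


async def transcribe(uri, wav_path):
    """Stream one WAV file over its own WebSocket connection and return the final reply."""
    # Third-party dependency (pip install websockets); imported here so the
    # URI helper above can be used without it.
    import websockets

    async with websockets.connect(uri) as ws:
        with wave.open(wav_path, "rb") as wf:
            # Tell the server the sample rate before sending any audio.
            await ws.send(json.dumps({"config": {"sample_rate": wf.getframerate()}}))
            frames_per_chunk = int(wf.getframerate() * 0.2)  # about 0.2 s of audio
            while True:
                data = wf.readframes(frames_per_chunk)
                if not data:
                    break
                await ws.send(data)
                await ws.recv()  # partial or intermediate result, ignored here
        # Signal end of stream and wait for the final result.
        await ws.send(json.dumps({"eof": 1}))
        return await ws.recv()


async def main(wav_paths, base_uri="ws://localhost:2700"):
    """Run all files concurrently, one connection and one path per file."""
    uris = stream_uris(base_uri, len(wav_paths))
    results = await asyncio.gather(
        *(transcribe(uri, path) for uri, path in zip(uris, wav_paths))
    )
    for path, result in zip(wav_paths, results):
        print(path, result)
```

To run it against three copies of the test file, something like `asyncio.run(main(["test2.wav", "test2.wav", "test2.wav"]))` should work, with the container started as in the first post.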