cdgraff opened 1 year ago
Hi @nshmyrev, do you have any advice? Is something wrong in my setup? I tested on CPU with the same model and the same code, and it works without issue... but on GPU I hit the same crash in every test... thanks in advance!
Do you close the connection before receiving the results, without sending an eof? I need to reproduce this somehow.
The "connection closed" message worries me.
Hi @nshmyrev, I'm working with @cdgraff on this particular implementation. We are using a Node.js PassThrough stream to read the ffmpeg output and feed it via websocket to the Vosk server. Our logs from Node look like this:
starting
sending chunk
... <- a bunch of chunks
sending chunk
sending chunk
sending eof
closing websocket
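For reference, here is the same flow as a minimal Python sketch (our real client is in Node.js; the JSON framing follows the standard vosk-server websocket protocol, and the function names are only illustrative):

```python
# Minimal sketch of the client flow above: config, chunks, eof, close.
# Assumes the standard vosk-server websocket protocol; pcm_chunks is a
# hypothetical iterable of raw 16 kHz mono PCM byte strings.
import asyncio
import websockets

async def transcribe(pcm_chunks, uri="ws://localhost:2700"):
    async with websockets.connect(uri) as ws:
        await ws.send('{"config": {"sample_rate": 16000}}')
        for chunk in pcm_chunks:
            await ws.send(chunk)            # "sending chunk"
            print(await ws.recv())          # partial result per chunk
        await ws.send('{"eof" : 1}')        # "sending eof"
        print(await ws.recv())              # final result
    # "closing websocket": the context manager closes the socket only
    # after the final result arrives, so eof always precedes the close

# asyncio.run(transcribe(chunks)) drives the whole exchange
```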
Let me know if this answers your question. Thanks!
Hi!
I'm having the same issue, but I'm using the Python lib. I'm running asr_server_gpu.py from this repo in Docker (using this image: alphacep/kaldi-vosk-server-gpu:latest).
From my debugging, the problem occurs when we start to close the websocket connection and FinishStream() of the BatchRecognizer gets called.
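For context, our handler is essentially this trimmed sketch of asr_server_gpu.py (method names follow vosk 0.3.45's BatchRecognizer API; config parsing is elided, so treat the details as approximate):

```python
# Trimmed sketch of the websocket handler (after asr_server_gpu.py);
# the segfault fires inside FinishStream() once the client disconnects.
import asyncio
import websockets
from vosk import BatchModel, BatchRecognizer

model = BatchModel()

async def recognize(websocket, path):
    rec = BatchRecognizer(model, 16000.0)   # sample rate hardcoded here;
                                            # the real server reads it
                                            # from the config message
    while True:
        message = await websocket.recv()
        if isinstance(message, str) and 'eof' in message:
            break
        rec.AcceptWaveform(message)
        while rec.GetPendingChunks() > 0:   # wait for the GPU batch
            await asyncio.sleep(0.1)
        await websocket.send(rec.Result())
    rec.FinishStream()   # <- SIGSEGV in BatchRecognizer::PushLattice

async def main():
    async with websockets.serve(recognize, "0.0.0.0", 2700):
        await asyncio.Future()              # run forever
```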
Here is the error:
[Thread 0x73ae8288c640 (LWP 1751) exited]
[Thread 0x73ae5affd640 (LWP 1752) exited]
LOG ([5.5.1089~1-a25f2]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
[New Thread 0x73ae5affd640 (LWP 1753)]
[New Thread 0x73ae8288c640 (LWP 1754)]
[New Thread 0x73ae80888640 (LWP 1755)]
[New Thread 0x73ae8188a640 (LWP 1756)]
[New Thread 0x73b012fde640 (LWP 1757)]
[New Thread 0x73ae8208b640 (LWP 1758)]
[New Thread 0x73ae81089640 (LWP 1759)]
[New Thread 0x73ae5bfff640 (LWP 1760)]
[New Thread 0x73ae5b7fe640 (LWP 1761)]
[New Thread 0x73ae58db1640 (LWP 1762)]
[New Thread 0x73ae3cb69640 (LWP 1763)]
INFO:websockets.server:server listening on 0.0.0.0:2700
INFO:websockets.server:connection open
INFO:root:Connection from ('10.36.2.192', 35730)
INFO:root:Config {'sample_rate': 16000}
INFO:websockets.server:connection closed
Thread 523 "python3" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x73ae80888640 (LWP 1755)]
0x000073b08a937b36 in BatchRecognizer::PushLattice(fst::VectorFst<fst::ArcTpl<fst::CompactLatticeWeightTpl<fst::LatticeWeightTpl<float>, int> >, fst::VectorState<fst::ArcTpl<fst::CompactLatticeWeightTpl<fst::LatticeWeightTpl<float>, int> >, std::allocator<fst::ArcTpl<fst::CompactLatticeWeightTpl<fst::LatticeWeightTpl<float>, int> > > > >&, float) () from /usr/local/lib/python3.10/dist-packages/vosk-0.3.45-py3.10.egg/vosk/libvosk.so
Here is the backtrace taken from gdb:
(gdb) bt
#0 0x000073b08a937b36 in BatchRecognizer::PushLattice(fst::VectorFst<fst::ArcTpl<fst::CompactLatticeWeightTpl<fst::LatticeWeightTpl<float>, int> >, fst::VectorState<fst::ArcTpl<fst::CompactLatticeWeightTpl<fst::LatticeWeightTpl<float>, int> >, std::allocator<fst::ArcTpl<fst::CompactLatticeWeightTpl<fst::LatticeWeightTpl<float>, int> > > > >&, float) () from /usr/local/lib/python3.10/dist-packages/vosk-0.3.45-py3.10.egg/vosk/libvosk.so
#1 0x000073b08a949f81 in kaldi::cuda_decoder::BatchedThreadedNnet3CudaOnlinePipeline::FinalizeDecoding(int) () from /usr/local/lib/python3.10/dist-packages/vosk-0.3.45-py3.10.egg/vosk/libvosk.so
#2 0x000073b08a93e5a5 in kaldi::cuda_decoder::ThreadPoolLightWorker::Work() () from /usr/local/lib/python3.10/dist-packages/vosk-0.3.45-py3.10.egg/vosk/libvosk.so
#3 0x000073b04d6b22b3 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x000073b08f5bdac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#5 0x000073b08f64ea04 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
If we delete the FinishStream() call the server works, but memory usage grows very quickly without ever going down; I think this is because the memory used by the recognizer never gets released.
I tried to implement the same thing starting from the C++ server (but using the batch model and recognizer), and the same error occurs.
Can you help me solve this issue?
Thanks!
There is a race condition in Kaldi here:
I'll try to fix it in the coming days.
Wonderful! Thanks for the fast reply :)
Hi! Can you help me identify what I'm doing wrong? After some transcriptions I get a Segmentation fault (core dumped).
I send 30-second audio chunks to transcribe, one after the other... in some cases we split across multiple workers, as you can see below, using 3 workers.
The path for the server is created dynamically to be unique per chunk.
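To illustrate, here is a rough Python sketch of our dispatch (the worker pool and the per-chunk path are our own setup, not part of vosk-server; all names are illustrative):

```python
# Sketch of the dispatch: each 30-second chunk opens its own connection
# on a unique, dynamically generated path, with at most 3 chunks in
# flight at once. Assumes the standard vosk-server websocket protocol.
import asyncio
import uuid
import websockets

async def send_chunk(pcm, host="ws://localhost:2700"):
    uri = f"{host}/{uuid.uuid4()}"          # unique path per chunk
    async with websockets.connect(uri) as ws:
        await ws.send('{"config": {"sample_rate": 16000}}')
        await ws.send(pcm)                  # one 30-second chunk
        await ws.send('{"eof" : 1}')
        return await ws.recv()              # final result

async def transcribe_all(chunks):
    sem = asyncio.Semaphore(3)              # 3 workers
    async def worker(pcm):
        async with sem:
            return await send_chunk(pcm)
    return await asyncio.gather(*(worker(c) for c in chunks))
```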