alumae / kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
BSD 2-Clause "Simplified" License

No speech input detected #204

Open Johe-cqu opened 4 years ago

Johe-cqu commented 4 years ago

Hi everyone, I am using a custom model. After I run start.sh and send audio, the system crashes and returns "No speech input detected". I found the following errors in worker.log:

2019-08-23 12:30:03 -    INFO:   decoder2: Setting decoder property: model = /opt/models/chinese/aishell_nnet_online/final.mdl
ERROR ([5.4.176~1-be967]:ExpectToken():io-funcs.cc:212) Expected token "<BlockDim>", got instead "<BackpropScale>".

[ Stack-Trace: ]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
kaldi::ExpectToken(std::istream&, bool, char const*)
kaldi::nnet3::NonlinearComponent::Read(std::istream&, bool)
kaldi::nnet3::Component::ReadNew(std::istream&, bool)
kaldi::nnet3::Nnet::Read(std::istream&, bool)
kaldi::nnet3::AmNnetSimple::Read(std::istream&, bool)

g_object_set_property

.
.
.
python() [0x4b988b]
PyEval_EvalFrameEx
PyEval_EvalFrameEx
PyEval_EvalCodeEx
python() [0x50160f]
PyRun_FileExFlags
PyRun_SimpleFileExFlags
Py_Main
__libc_start_main
python() [0x497b8b]

2019-08-23 12:30:03 -    INFO:   decoder2: Created GStreamer elements
2019-08-23 12:30:03 -   DEBUG:   decoder2: Adding <__main__.GstAppSrc object at 0x7f2f17cd4410 (GstAppSrc at 0x28261b0)> to the pipeline
2019-08-23 12:30:03 -   DEBUG:   decoder2: Adding <__main__.GstDecodeBin object at 0x7f2f17cd43c0 (GstDecodeBin at 0x2812090)> to the pipeline
2019-08-23 12:30:03 -   DEBUG:   decoder2: Adding <__main__.GstAudioConvert object at 0x7f2f17cd44b0 (GstAudioConvert at 0x2835100)> to the pipeline
2019-08-23 12:30:03 -   DEBUG:   decoder2: Adding <__main__.GstAudioResample object at 0x7f2f17cd4370 (GstAudioResample at 0x26f2610)> to the pipeline
2019-08-23 12:30:03 -   DEBUG:   decoder2: Adding <__main__.GstTee object at 0x7f2f17cd4460 (GstTee at 0x2845000)> to the pipeline
2019-08-23 12:30:03 -   DEBUG:   decoder2: Adding <__main__.GstQueue object at 0x7f2f17cd4550 (GstQueue at 0x284a210)> to the pipeline
2019-08-23 12:30:03 -   DEBUG:   decoder2: Adding <__main__.GstFileSink object at 0x7f2f17cd45a0 (GstFileSink at 0x284de00)> to the pipeline
2019-08-23 12:30:03 -   DEBUG:   decoder2: Adding <__main__.GstQueue object at 0x7f2f17cd45f0 (GstQueue at 0x284a500)> to the pipeline
2019-08-23 12:30:03 -   DEBUG:   decoder2: Adding <__main__.Gstkaldinnet2onlinedecoder object at 0x7f2f17cd4640 (Gstkaldinnet2onlinedecoder at 0x28700a0)> to the pipeline
2019-08-23 12:30:03 -   DEBUG:   decoder2: Adding <__main__.GstFakeSink object at 0x7f2f17cd4690 (GstFakeSink at 0x274aa00)> to the pipeline
2019-08-23 12:30:03 -    INFO:   decoder2: Linking GStreamer elements
LOG ([5.4.176~1-be967]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG ([5.4.176~1-be967]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
2019-08-23 12:30:04 -    INFO:   decoder2: Setting pipeline to READY
2019-08-23 12:30:04 -    INFO:   decoder2: Set pipeline to READY
2019-08-23 12:30:04 -    INFO:   __main__: Opening websocket connection to master server
2019-08-23 12:30:04 -    INFO:   __main__: Opened websocket connection to server
2019-08-23 12:33:44 -   DEBUG:   __main__: <undefined>: Got message from server of type <class 'ws4py.messaging.TextMessage'>
2019-08-23 12:33:44 -    INFO:   decoder2: c55de094-8ee2-4265-bd58-9a3512574af9: Initializing request
2019-08-23 12:33:44 -    INFO:   decoder2: c55de094-8ee2-4265-bd58-9a3512574af9: Setting caps to audio/x-raw, layout=(string)interleaved, rate=(int)16000, format=(string)S16LE, channels=(int)1
2019-08-23 12:33:44 -    INFO:   decoder2: c55de094-8ee2-4265-bd58-9a3512574af9: Connecting audio decoder
2019-08-23 12:33:44 -    INFO:   __main__: c55de094-8ee2-4265-bd58-9a3512574af9: Started timeout guard
2019-08-23 12:33:44 -    INFO:   __main__: c55de094-8ee2-4265-bd58-9a3512574af9: Initialized request
2019-08-23 12:33:44 -   DEBUG:   __main__: c55de094-8ee2-4265-bd58-9a3512574af9: Checking that decoder hasn't been silent for more than 10 seconds
2019-08-23 12:33:44 -    INFO:   decoder2: c55de094-8ee2-4265-bd58-9a3512574af9: Connected audio decoder
2019-08-23 12:33:44 -   DEBUG:   __main__: c55de094-8ee2-4265-bd58-9a3512574af9: Got message from server of type <class 'ws4py.messaging.BinaryMessage'>
2019-08-23 12:33:44 -   DEBUG:   decoder2: c55de094-8ee2-4265-bd58-9a3512574af9: Pushing buffer of size 5120 to pipeline
2019-08-23 12:33:44 -   DEBUG:   decoder2: c55de094-8ee2-4265-bd58-9a3512574af9: Pushing buffer done

Where does this error come from? Can anybody help me?
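To pin down where the failure happens in a long worker.log, the Kaldi error lines can be pulled out separately from the Python logger lines. The following is a minimal sketch, not part of kaldi-gstreamer-server: the function name `find_kaldi_errors`, the `KaldiError` tuple and both regular expressions are hypothetical helpers, and the line layout they assume is inferred only from the log lines quoted above.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Assumed layout, inferred from the log above:
#   ERROR ([<version>]:<function>:<file>:<line>) <message>
KALDI_ERROR = re.compile(
    r'^ERROR \(\[(?P<version>[^\]]+)\]:(?P<func>.+?):'
    r'(?P<file>[\w.\-]+):(?P<line>\d+)\) (?P<msg>.*)$'
)
# Message produced when a reader hits an unexpected token in a model file.
TOKEN_MISMATCH = re.compile(
    r'Expected token "(?P<expected>[^"]+)", got instead "(?P<got>[^"]+)"'
)


class KaldiError(NamedTuple):
    version: str              # Kaldi version string printed in the log line
    function: str             # function that raised the error
    source: str               # "<file>:<line>" inside the Kaldi sources
    message: str              # full error message
    expected: Optional[str]   # token the reader expected, if reported
    got: Optional[str]        # token actually found, if reported


def find_kaldi_errors(lines: Iterable[str]) -> Iterator[KaldiError]:
    """Yield one KaldiError per Kaldi ERROR line; other lines are skipped."""
    for raw in lines:
        match = KALDI_ERROR.match(raw.rstrip("\n"))
        if match is None:
            continue
        tokens = TOKEN_MISMATCH.search(match.group("msg"))
        yield KaldiError(
            version=match.group("version"),
            function=match.group("func"),
            source=f'{match.group("file")}:{match.group("line")}',
            message=match.group("msg"),
            expected=tokens.group("expected") if tokens else None,
            got=tokens.group("got") if tokens else None,
        )


# Three lines copied from the log above, used as sample input.
sample = [
    '2019-08-23 12:30:03 -    INFO:   decoder2: Setting decoder property: '
    'model = /opt/models/chinese/aishell_nnet_online/final.mdl',
    'ERROR ([5.4.176~1-be967]:ExpectToken():io-funcs.cc:212) '
    'Expected token "<BlockDim>", got instead "<BackpropScale>".',
    '2019-08-23 12:30:03 -    INFO:   decoder2: Created GStreamer elements',
]

errors = list(find_kaldi_errors(sample))
for err in errors:
    print(err.version, err.source, err.expected, err.got)
```

To scan a real file, pass an open file object instead of `sample`, for example `find_kaldi_errors(open("worker.log", encoding="utf-8", errors="replace"))`. In this log the only Kaldi error appears immediately after the `model` property is set, which points at reading final.mdl. A token mismatch at that point may mean the model was written by a different Kaldi version than the one the server was built against. The log does not confirm this, so treat it as an assumption to check.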

GeorgeTsaplin commented 3 years ago

I have the same issue with the VOSK model for Russian (https://alphacephei.com/vosk/models/vosk-model-ru-0.10.zip).