alumae / kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
BSD 2-Clause "Simplified" License

Even with the right GST_PLUGIN_PATH, the kaldinnet2onlinedecoder and onlinegmmdecodefaster decoders do not run correctly #135

Open louistiti opened 6 years ago

louistiti commented 6 years ago

Hello,

First of all, thanks for providing this cool server based on Kaldi.

I have installed Kaldi correctly and built the two decoders, kaldinnet2onlinedecoder and onlinegmmdecodefaster. For each of them, I export the matching GST_PLUGIN_PATH environment variable.

The result of $ gst-inspect-1.0 for the kaldinnet2onlinedecoder decoder is:

kaldinnet2onlinedecoder:  kaldinnet2onlinedecoder: KaldiNNet2OnlineDecoder
coreelements:  streamiddemux: Streamid Demux
coreelements:  valve: Valve element
coreelements:  multiqueue: MultiQueue
coreelements:  typefind: TypeFind
coreelements:  tee: Tee pipe fitting
coreelements:  filesink: File Sink
coreelements:  queue2: Queue 2
coreelements:  queue: Queue
coreelements:  output-selector: Output selector
coreelements:  input-selector: Input selector
coreelements:  identity: Identity
coreelements:  funnel: Funnel pipe fitting
coreelements:  filesrc: File Source
coreelements:  fdsink: Filedescriptor Sink
coreelements:  fdsrc: Filedescriptor Source
coreelements:  fakesink: Fake Sink
coreelements:  fakesrc: Fake Source
coreelements:  downloadbuffer: DownloadBuffer
coreelements:  dataurisrc: data: URI source element
coreelements:  concat: Concat
coreelements:  capsfilter: CapsFilter
coretracers:  leaks (GstTracerFactory)
coretracers:  stats (GstTracerFactory)
coretracers:  rusage (GstTracerFactory)
coretracers:  log (GstTracerFactory)
coretracers:  latency (GstTracerFactory)
staticelements:  bin: Generic bin
staticelements:  pipeline: Pipeline object

Total count: 4 plugins, 29 features

And the result for the onlinegmmdecodefaster decoder is:

onlinegmmdecodefaster:  onlinegmmdecodefaster: OnlineGmmDecodeFaster
coreelements:  capsfilter: CapsFilter
coreelements:  concat: Concat
coreelements:  dataurisrc: data: URI source element
coreelements:  downloadbuffer: DownloadBuffer
coreelements:  fakesrc: Fake Source
coreelements:  fakesink: Fake Sink
coreelements:  fdsrc: Filedescriptor Source
coreelements:  fdsink: Filedescriptor Sink
coreelements:  filesrc: File Source
coreelements:  funnel: Funnel pipe fitting
coreelements:  identity: Identity
coreelements:  input-selector: Input selector
coreelements:  output-selector: Output selector
coreelements:  queue: Queue
coreelements:  queue2: Queue 2
coreelements:  filesink: File Sink
coreelements:  tee: Tee pipe fitting
coreelements:  typefind: TypeFind
coreelements:  multiqueue: MultiQueue
coreelements:  valve: Valve element
coreelements:  streamiddemux: Streamid Demux
coretracers:  latency (GstTracerFactory)
coretracers:  log (GstTracerFactory)
coretracers:  rusage (GstTracerFactory)
coretracers:  stats (GstTracerFactory)
coretracers:  leaks (GstTracerFactory)
staticelements:  bin: Generic bin
staticelements:  pipeline: Pipeline object

Total count: 4 plugins, 29 features

According to these results, it looks like the decoders are set up correctly, unless I missed something here?
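
A minimal sketch for double-checking a listing like the ones above: it parses the text printed by gst-inspect-1.0 and reports which of a given set of element names do not appear in it. The helper names (`listed_features`, `missing_features`) and the list of required names are hypothetical; `appsrc` is included only because the tracebacks later in this post fail on `self.appsrc`. The sample text is abridged from the listing above.

```python
def listed_features(listing):
    """Return the set of feature names found in gst-inspect-1.0 listing text."""
    names = set()
    for line in listing.splitlines():
        fields = line.split(":")
        # Skip blank lines and the "Total count: ..." summary line.
        if len(fields) < 2 or line.startswith("Total count"):
            continue
        # The feature name is the first word after the plugin name,
        # e.g. "coreelements:  queue: Queue" or "coretracers:  leaks (...)".
        tokens = fields[1].split()
        if tokens:
            names.add(tokens[0])
    return names


def missing_features(listing, required):
    """Return the required feature names that the listing does not contain."""
    return sorted(set(required) - listed_features(listing))


# Abridged copy of the listing shown above (not the full output).
SAMPLE = """\
kaldinnet2onlinedecoder:  kaldinnet2onlinedecoder: KaldiNNet2OnlineDecoder
coreelements:  filesink: File Sink
coreelements:  queue: Queue
coreelements:  fakesink: Fake Sink
coretracers:  leaks (GstTracerFactory)
staticelements:  pipeline: Pipeline object

Total count: 4 plugins, 29 features
"""

# Hypothetical set of names to look for; adjust to the pipeline in use.
REQUIRED = ["appsrc", "queue", "kaldinnet2onlinedecoder"]

print(missing_features(SAMPLE, REQUIRED))
```

Any name this prints is absent from the listing. That would suggest, though not prove, that the plugin providing it is not visible to GStreamer in the current environment.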

However, when I run the following command: $ python kaldigstserver/worker.py -u ws://localhost:8888/worker/ws/speech -c sample_english_nnet2.yaml (with the correct GST_PLUGIN_PATH for the kaldinnet2onlinedecoder decoder), I get this output:

   DEBUG 2018-06-02 10:59:46,730 Starting up worker 
2018-06-02 10:59:46 -    INFO:   decoder2: Creating decoder using conf: {'post-processor': "perl -npe 'BEGIN {use IO::Handle; STDOUT->autoflush(1);} s/(.*)/\\1./;'", 'logging': {'version': 1, 'root': {'level': 'DEBUG', 'handlers': ['console']}, 'formatters': {'simpleFormater': {'datefmt': '%Y-%m-%d %H:%M:%S', 'format': '%(asctime)s - %(levelname)7s: %(name)10s: %(message)s'}}, 'disable_existing_loggers': False, 'handlers': {'console': {'formatter': 'simpleFormater', 'class': 'logging.StreamHandler', 'level': 'DEBUG'}}}, 'use-nnet2': True, 'full-post-processor': './sample_full_post_processor.py', 'decoder': {'ivector-extraction-config': 'english/tedlium_nnet_ms_sp_online/conf/ivector_extractor.conf', 'num-nbest': 10, 'lattice-beam': 6.0, 'acoustic-scale': 0.083, 'do-endpointing': True, 'beam': 10.0, 'max-active': 10000, 'fst': 'english/tedlium_nnet_ms_sp_online/HCLG.fst', 'mfcc-config': 'english/tedlium_nnet_ms_sp_online/conf/mfcc.conf', 'use-threaded-decoder': True, 'traceback-period-in-secs': 0.25, 'model': 'english/tedlium_nnet_ms_sp_online/final.mdl', 'word-syms': 'english/tedlium_nnet_ms_sp_online/words.txt', 'endpoint-silence-phones': '1:2:3:4:5:6:7:8:9:10', 'chunk-length-in-secs': 0.25}, 'silence-timeout': 10, 'out-dir': 'tmp', 'use-vad': False}
2018-06-02 10:59:46 -    INFO:   decoder2: Setting decoder property: ivector-extraction-config = english/tedlium_nnet_ms_sp_online/conf/ivector_extractor.conf
2018-06-02 10:59:46 -    INFO:   decoder2: Setting decoder property: num-nbest = 10
2018-06-02 10:59:46 -    INFO:   decoder2: Setting decoder property: lattice-beam = 6.0
2018-06-02 10:59:46 -    INFO:   decoder2: Setting decoder property: acoustic-scale = 0.083
2018-06-02 10:59:46 -    INFO:   decoder2: Setting decoder property: do-endpointing = True
2018-06-02 10:59:46 -    INFO:   decoder2: Setting decoder property: beam = 10.0
2018-06-02 10:59:46 -    INFO:   decoder2: Setting decoder property: max-active = 10000
2018-06-02 10:59:46 -    INFO:   decoder2: Setting decoder property: mfcc-config = english/tedlium_nnet_ms_sp_online/conf/mfcc.conf
2018-06-02 10:59:46 -    INFO:   decoder2: Setting decoder property: traceback-period-in-secs = 0.25
2018-06-02 10:59:46 -    INFO:   decoder2: Setting decoder property: word-syms = english/tedlium_nnet_ms_sp_online/words.txt
2018-06-02 10:59:47 -    INFO:   decoder2: Setting decoder property: endpoint-silence-phones = 1:2:3:4:5:6:7:8:9:10
2018-06-02 10:59:47 -    INFO:   decoder2: Setting decoder property: chunk-length-in-secs = 0.25
2018-06-02 10:59:47 -    INFO:   decoder2: Setting decoder property: fst = english/tedlium_nnet_ms_sp_online/HCLG.fst
2018-06-02 10:59:59 -    INFO:   decoder2: Setting decoder property: model = english/tedlium_nnet_ms_sp_online/final.mdl
Traceback (most recent call last):
  File "kaldigstserver/worker.py", line 419, in <module>
    main()
  File "kaldigstserver/worker.py", line 409, in main
    decoder_pipeline = DecoderPipeline2(conf)
  File "/home/louis/Workspace/kaldi-gstreamer-server/kaldigstserver/decoder2.py", line 25, in __init__
    self.create_pipeline(conf)
  File "/home/louis/Workspace/kaldi-gstreamer-server/kaldigstserver/decoder2.py", line 75, in create_pipeline
    self.appsrc.set_property("is-live", True)
AttributeError: 'NoneType' object has no attribute 'set_property'

And with the onlinegmmdecodefaster decoder, when I run: $ python kaldigstserver/worker.py -u ws://localhost:8888/worker/ws/speech -c sample_worker.yaml (with the correct GST_PLUGIN_PATH for this decoder), I get this output:

   DEBUG 2018-06-02 11:01:49,907 Starting up worker 
2018-06-02 11:01:49 -    INFO:    decoder: Creating decoder using conf: {'timeout-decoder': 10, 'post-processor': "perl -npe 'BEGIN {use IO::Handle; STDOUT->autoflush(1);} s/(.*)/\\1./;'", 'logging': {'version': 1, 'root': {'level': 'DEBUG', 'handlers': ['console']}, 'formatters': {'simpleFormater': {'datefmt': '%Y-%m-%d %H:%M:%S', 'format': '%(asctime)s - %(levelname)7s: %(name)10s: %(message)s'}}, 'disable_existing_loggers': False, 'handlers': {'console': {'formatter': 'simpleFormater', 'class': 'logging.StreamHandler', 'level': 'DEBUG'}}}, 'decoder': {'word-syms': 'test/models/english/voxforge/tri2b_mmi_b0.05/words.txt', 'model': 'test/models/english/voxforge/tri2b_mmi_b0.05/final.mdl', 'lda-mat': 'test/models/english/voxforge/tri2b_mmi_b0.05/final.mat', 'fst': 'test/models/english/voxforge/tri2b_mmi_b0.05/HCLG.fst', 'silence-phones': '1:2:3:4:5'}, 'silence-timeout': 60, 'out-dir': 'tmp', 'use-vad': False}
2018-06-02 11:01:49 -    INFO:    decoder: Setting decoder property: word-syms = test/models/english/voxforge/tri2b_mmi_b0.05/words.txt
2018-06-02 11:01:49 -    INFO:    decoder: Setting decoder property: model = test/models/english/voxforge/tri2b_mmi_b0.05/final.mdl
2018-06-02 11:01:49 -    INFO:    decoder: Setting decoder property: lda-mat = test/models/english/voxforge/tri2b_mmi_b0.05/final.mat
2018-06-02 11:01:49 -    INFO:    decoder: Setting decoder property: fst = test/models/english/voxforge/tri2b_mmi_b0.05/HCLG.fst
2018-06-02 11:01:49 -    INFO:    decoder: Setting decoder property: silence-phones = 1:2:3:4:5
Traceback (most recent call last):
  File "kaldigstserver/worker.py", line 419, in <module>
    main()
  File "kaldigstserver/worker.py", line 411, in main
    decoder_pipeline = DecoderPipeline(conf)
  File "/home/louis/Workspace/kaldi-gstreamer-server/kaldigstserver/decoder.py", line 25, in __init__
    self.create_pipeline(conf)
  File "/home/louis/Workspace/kaldi-gstreamer-server/kaldigstserver/decoder.py", line 57, in create_pipeline
    self.appsrc.set_property("is-live", True)
AttributeError: 'NoneType' object has no attribute 'set_property'
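
Both tracebacks end on `self.appsrc` being `None`, which is consistent with the element factory call returning `None` instead of an element. As a sketch only, and not the code in decoder.py or decoder2.py, a small guard could turn that into a clearer error. `make_element` and `MissingElementError` are hypothetical names. `make` stands for any callable shaped like `Gst.ElementFactory.make`, and `fake_make` is a stand-in so the example runs without GStreamer installed.

```python
class MissingElementError(RuntimeError):
    """Raised when an element factory call returns None."""


def make_element(make, factory_name, name=None):
    """Create an element via `make`, failing loudly if it returns None.

    `make` is any callable taking (factory_name, element_name) and
    returning an element or None, for example Gst.ElementFactory.make.
    """
    element = make(factory_name, name or factory_name)
    if element is None:
        raise MissingElementError(
            "GStreamer element %r could not be created; check that the "
            "plugin providing it is installed and visible" % factory_name
        )
    return element


def fake_make(factory_name, name):
    """Stand-in for a real factory: pretends only 'appsrc' is unavailable."""
    return None if factory_name == "appsrc" else object()


try:
    make_element(fake_make, "appsrc")
except MissingElementError as err:
    print(err)
```

With a guard like this, the failure would name the missing element at creation time rather than surfacing later as an AttributeError on `set_property`.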

I also specified the right paths in the *.conf files.

Could you please help me figure out what I missed?