alumae / kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
BSD 2-Clause "Simplified" License
1.07k stars, 341 forks

Question on server side audio #129

Closed (feddyfedfed closed this issue 4 years ago)

feddyfedfed commented 6 years ago

Is there a way to check the audio that is actually used by the server side during decoding?

My configuration uses the provided sample librispeech YAML file (with gst-kaldi-nnet2-online), together with models from our own training. When I compare the server's output against our evaluations run with online2-wav-nnet2-latgen without rescoring, I get a 2% discrepancy. Beyond that, some of the hypotheses from gst-kaldi-nnet2-online suggest that the server is corrupting the audio files.
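To make the question concrete, here is a minimal sketch of the kind of check I have in mind, assuming the audio seen by the server could somehow be saved as a WAV file. The `read_pcm` and `compare_wavs` helpers and the file names are hypothetical and are not part of kaldi-gstreamer-server; the sketch only uses Python's standard `wave` module.

```python
import wave


def read_pcm(path):
    """Return ((channels, sample_width, frame_rate), raw PCM bytes) for a WAV file."""
    with wave.open(path, "rb") as wf:
        params = (wf.getnchannels(), wf.getsampwidth(), wf.getframerate())
        frames = wf.readframes(wf.getnframes())
    return params, frames


def compare_wavs(client_path, server_path):
    """Compare the WAV sent by the client with a (hypothetical) server-side copy."""
    client_params, client_pcm = read_pcm(client_path)
    server_params, server_pcm = read_pcm(server_path)
    same_format = client_params == server_params
    return {
        # Channel count, sample width and sample rate all match.
        "same_format": same_format,
        # Same number of PCM bytes, i.e. nothing truncated or padded.
        "same_length": len(client_pcm) == len(server_pcm),
        # Byte-for-byte identical sample data in the same format.
        "identical": same_format and client_pcm == server_pcm,
    }
```

For example, `compare_wavs("sent.wav", "server_copy.wav")` (both names are placeholders) would show whether the format or the sample data changed somewhere between the client and the decoder.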

Thank you very much!