alumae / kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
BSD 2-Clause "Simplified" License

Optimizing the decoder #77

Closed jin1004 closed 7 years ago

jin1004 commented 7 years ago

I am currently using the gstreamer server with my nnet3 model. I am only getting the best hypothesis from the decoder currently, but that still takes a few seconds to process. Can you give me some ideas on how to optimize the decoder? Any help will be appreciated.

alumae commented 7 years ago

If you need realtime decoding, I recommend using chain models. It is very easy to get 0.5x realtime decoding speed with them.
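Decoding speed with chain models depends heavily on the decoder properties in the server's YAML config. The sketch below shows the kind of settings that matter; the property names follow the sample configs shipped with the repo, but the exact values and the model/file paths are assumptions you should adapt to your own setup:

```yaml
# Hypothetical fragment of a worker YAML config for an nnet3 chain model.
# Paths and values are illustrative, not taken from any real deployment.
decoder:
    use-threaded-decoder: true
    model: test/models/english/chain/final.mdl      # assumed path
    fst: test/models/english/chain/HCLG.fst         # assumed path
    word-syms: test/models/english/chain/words.txt  # assumed path
    # Chain models are trained with a reduced frame rate; 3 is the usual
    # factor and gives roughly 3x fewer decoder frames to process.
    frame-subsampling-factor: 3
    acoustic-scale: 1.0
    # Smaller beams and max-active trade a little accuracy for speed.
    beam: 10.0
    lattice-beam: 6.0
    max-active: 7000
```

Tightening `beam` and `max-active` is usually the first lever to pull when a chain model still decodes slower than expected.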

jin1004 commented 7 years ago

Thanks for the reply. I am actually using a chain model currently, and even with that it takes a few seconds to process. I was wondering if there are ways to optimize it further.

alumae commented 7 years ago

Maybe you are using client.py without knowing that it throttles the upload of the audio file (see the -r parameter) in order to simulate live decoding?
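For context on that throttling: for raw 16 kHz, 16-bit mono PCM, real time corresponds to 32000 bytes per second, so an upload rate around that value makes a recorded file behave like a live microphone, and the decoder cannot finish before the "audio" has been fully streamed. A small sketch of the arithmetic (the `-r` default in your copy of client.py should be checked, not assumed):

```python
# Compute the byte rate that corresponds to real-time streaming of raw PCM
# audio. client.py's -r flag throttles uploads to a given bytes-per-second
# rate; matching this value simulates a live audio source.

def realtime_byte_rate(sample_rate_hz: int, sample_width_bytes: int,
                       channels: int = 1) -> int:
    """Bytes per second needed to stream raw PCM audio in real time."""
    return sample_rate_hz * sample_width_bytes * channels

# 16 kHz, 16-bit (2-byte) mono audio -> 32000 bytes/sec
print(realtime_byte_rate(16000, 2))  # 32000

# 8 kHz telephone-band audio -> 16000 bytes/sec
print(realtime_byte_rate(8000, 2))   # 16000
```

If total latency is what is being measured, passing a much larger rate to `-r` removes the throttle, so the wall-clock time then reflects decoder speed rather than the simulated stream duration.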

alumae commented 7 years ago

Closing as there is no response to my question.