alumae / kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
BSD 2-Clause "Simplified" License

Poor performance with nnet3 TDNN-F model #259

Open LiamLonergan opened 1 year ago

LiamLonergan commented 1 year ago

Hi all,

I'm developing an API for an Irish-language ASR service, putting Kaldi models I've developed into production. However, I'm having difficulty getting the same performance with the GStreamer server that I get with the standard online2-tcp-nnet3... Kaldi binary. Does GStreamer do any downsampling that might affect the audio before it is passed to the model? Nothing is jumping out as an obvious issue.

Any help or ideas would be greatly appreciated!