alumae / kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
BSD 2-Clause "Simplified" License

poor VAD performance #144

Open Umar17 opened 6 years ago

Umar17 commented 6 years ago

Dear Alumae,

I have changed the VAD threshold to 0.04 in decoder.py for the GStreamer cutter element. However, some garbage is still decoded at the start of an utterance for signal that is below this threshold (RMS amplitude of about 0.002). Have you checked whether this VAD works properly, or am I making a mistake?

Alternatively, can I get the responses with some kind of chunk identification, i.e. which result belongs to which packet?

Best Regards
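
To check offline which parts of a recording actually fall below the threshold described above, something like the following rough sketch may help. It assumes 16-bit little-endian mono PCM input and an arbitrary chunk size of 8000 bytes; the function names `chunk_rms` and `flag_chunks` are made up for illustration. It is not the cutter element's internal computation, only a plain normalized RMS compared against the 0.04 value from the post, with each result tagged by its chunk index.

```python
import math
import struct


def chunk_rms(pcm_bytes):
    """RMS amplitude of 16-bit little-endian mono PCM, normalized to [0, 1]."""
    n = len(pcm_bytes) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack("<%dh" % n, pcm_bytes[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n) / 32768.0


def flag_chunks(pcm_bytes, chunk_size=8000, threshold=0.04):
    """Return (chunk_index, rms, above_threshold) for each chunk.

    chunk_size and threshold are illustrative assumptions; the threshold
    default is the value mentioned in the post above.
    """
    results = []
    for index, start in enumerate(range(0, len(pcm_bytes), chunk_size)):
        rms = chunk_rms(pcm_bytes[start : start + chunk_size])
        results.append((index, rms, rms >= threshold))
    return results
```

Sending the same chunks to the server and comparing the flagged indices with the point where the first hypothesis arrives could show whether the garbage really comes from below-threshold audio.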

yangxueruivs commented 5 years ago

Have you solved this problem? I also see this when the VAD threshold is set relatively low.

Umar17 commented 5 years ago

I have switched from the HMM-based acoustic model to a neural-network-based one, which also has a back-tracing feature and turns the garbage decoding into a reasonable utterance. So I didn't need to focus on the VAD afterwards.