alumae / kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
BSD 2-Clause "Simplified" License

Silence detection #39

Closed yochze closed 8 years ago

yochze commented 8 years ago

Hi,

I am using the TED DNN model. I tried running a file through the decoder and it worked. However, the utterances (or segments) are not being split on silences. Sometimes a segment even breaks in the middle of a spoken sentence. I think it breaks when it reaches a certain word limit.

do-endpointing is set to True, and I tried adjusting endpointing-silence-phones (currently 1:2:3:4:5), but no luck.

alumae commented 8 years ago

Input is broken into segments based on recognized silence regions. There is no word limit. It could be that sometimes the silence is simply misrecognized.
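
To illustrate the idea of segmenting on recognized silence, here is a minimal Python sketch. It is not the Kaldi or kaldi-gstreamer-server implementation; the function names, the per-frame phone ID input, and the `min_trailing_silence` threshold are all assumptions made for the example. It only shows why a short pause, or a speech frame misrecognized as a silence phone, can end a segment in the middle of a sentence.

```python
from typing import List, Set, Tuple


def parse_silence_phones(spec: str) -> Set[int]:
    """Parse a colon-separated phone ID list such as "1:2:3:4:5"."""
    return {int(p) for p in spec.split(":") if p}


def split_on_silence(
    frames: List[int], silence: Set[int], min_trailing_silence: int
) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges (end exclusive), one per segment.

    Hypothetical rule: a segment ends once speech has been seen and at
    least `min_trailing_silence` consecutive silence frames follow it.
    """
    segments = []
    start = None      # index of the first speech frame in the open segment
    silence_run = 0   # consecutive silence frames seen after speech
    for i, phone in enumerate(frames):
        if phone in silence:
            if start is not None:
                silence_run += 1
                if silence_run >= min_trailing_silence:
                    # Close the segment at the last speech frame.
                    segments.append((start, i - silence_run + 1))
                    start = None
                    silence_run = 0
        else:
            if start is None:
                start = i
            silence_run = 0
    if start is not None:
        # Input ended while a segment was still open.
        segments.append((start, len(frames) - silence_run))
    return segments


silence = parse_silence_phones("1:2:3:4:5")
# Phone IDs per frame; IDs 10 and above stand in for speech phones.
frames = [1, 1, 10, 11, 1, 12, 1, 1, 1, 13, 14, 2]
print(split_on_silence(frames, silence, min_trailing_silence=3))
print(split_on_silence(frames, silence, min_trailing_silence=1))
```

With the stricter threshold of 3 frames, the single silence frame at index 4 is tolerated and the first segment runs through frame 5. With a threshold of 1, that same frame splits the segment, which resembles a break in the middle of a sentence.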