I am using the TED DNN model. I tried running a file through the decoder and it worked. However, the utterances (or segments) are not being split at silences; sometimes a break even lands in the middle of a spoken sentence. I suspect it breaks when it reaches a certain word limit.
do-endpointing is set to True, and I have tried playing with endpointing-silence-phones (currently 1:2:3:4:5), but no luck.
The input is broken into segments based on recognized silence regions; there is no word limit. It could be that the silence is sometimes simply misrecognized.
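In case it helps, here is a minimal sketch of how the endpointing options are usually passed on the command line. It assumes the online2 nnet2 decoder (online2-wav-nnet2-latgen-faster); the model paths and the phone IDs shown are placeholders for your own setup.

    # Sketch only: paths and phone IDs are placeholders for your setup.
    # Endpointing splits the input wherever a silence phone is decoded,
    # so endpoint.silence-phones must list the silence phone IDs from
    # your model's data/lang/phones/silence.csl.
    online2-wav-nnet2-latgen-faster \
      --online=true \
      --do-endpointing=true \
      --endpoint.silence-phones=1:2:3:4:5 \
      --config=exp/nnet2_online/nnet_a_online/conf/online_nnet2_decoding.conf \
      exp/nnet2_online/nnet_a_online/final.mdl \
      exp/tri3/graph/HCLG.fst \
      ark:data/test/spk2utt \
      scp:data/test/wav.scp \
      ark:/dev/null

If the segments still end too early, the trailing-silence thresholds of the individual endpointing rules (options of the form --endpoint.ruleN.min-trailing-silence, defined in online2/online-endpoint.h) can be raised so a longer pause is required before a segment is cut.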