boleamol opened this issue 5 years ago
@boleamol I'm having the same problem. I trained an nnet2 acoustic model (using the train_pnorm_fast
training script) with iVectors, and although the model achieved 8% WER during decoding, I get completely nonsensical words when I test it with online decoding. For example, for a one-minute-long wav file, the recognizer outputs only two short words that make no sense at all.
I'm wondering if there might have been a problem in how I adapted the RM scripts to my own corpora. Which Kaldi recipe did you follow to train your acoustic model with iVector extraction?
@larissadias mini_librispeech is the most recent and up-to-date recipe. You need an nnet3 model. For more help, you need to provide more details.
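For reference, online decoding with an nnet3 model is usually tested first by simulating it on a wav file. The sketch below is illustrative only: the model directory `exp/chain/tdnn1a_sp_online`, the graph directory `exp/chain/tree_sp/graph_tgsmall`, and `test.wav` are placeholder paths you must replace with your own. It assumes `steps/online/nnet3/prepare_online_decoding.sh` has already been run to produce `conf/online.conf`, which must match the training-time feature and iVector configuration (a mismatch there is a common cause of nonsense output).

```shell
# Simulated online decoding of a single utterance with an nnet3 model.
# All paths below are example placeholders, not the recipe's actual outputs.
dir=exp/chain/tdnn1a_sp_online           # directory from prepare_online_decoding.sh
graph=exp/chain/tree_sp/graph_tgsmall    # decoding graph directory

online2-wav-nnet3-latgen-faster \
  --online=true \
  --config=$dir/conf/online.conf \
  --max-active=7000 --beam=15.0 --lattice-beam=6.0 \
  --acoustic-scale=1.0 \
  --word-symbol-table=$graph/words.txt \
  $dir/final.mdl $graph/HCLG.fst \
  'ark:echo utt1 utt1|' \
  'scp:echo utt1 test.wav|' \
  ark:/dev/null
```

The recognized words are printed to stderr; if this command produces sensible output but the GUI demo does not, the mismatch is in the demo's feature/iVector configuration rather than the model itself.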
Hi all,
I built my own nnet2 acoustic model and am trying to test it with gui-demo.py, but I'm getting completely (100%) wrong results. The same model achieved 15% WER during evaluation. I don't know why it gives completely wrong results when tested live with a microphone. Any help is appreciated.
Note: My dictionary size is 1.5 lakh (150,000) words.