how to run decoding on chain model built by myself

kelvinqin commented 3 years ago

Guenter, thanks for your work. I successfully compiled and tested your code, and I can now run decoding with your pre-trained model (kaldi-generic-en-tdnn_f-r20190609). Great work!

Meanwhile, I am learning Kaldi and have almost finished building my first model using one of the Mandarin recipes (aishell). Is it possible to run decoding with your code using my own model?

My model is a chain model built by following the standard recipe (kaldi/egs/aishell/s5/local/chain/run_tdnn.sh).

One difficulty I foresee is collecting all the needed data files into the model directory. Please advise if there is a document that describes which files are needed and which Kaldi source directory each one comes from. From reading your code, it seems I just need to collect all the files mentioned in the following code:

    cdef unicode mfcc_config           = u'%s/conf/mfcc_hires.conf'                  % self.modeldir
    cdef unicode word_symbol_table     = u'%s/%s/graph/words.txt'                    % (self.modeldir, self.model)
    cdef unicode model_in_filename     = u'%s/%s/final.mdl'                          % (self.modeldir, self.model)
    cdef unicode splice_conf_filename  = u'%s/ivectors_test_hires/conf/splice.conf'  % self.modeldir
    cdef unicode fst_in_str            = u'%s/%s/graph/HCLG.fst'                     % (self.modeldir, self.model)
    cdef unicode align_lex_filename    = u'%s/%s/graph/phones/align_lexicon.int'     % (self.modeldir, self.model)

    self.ie_conf_f.write((u"--cmvn-config=%s/conf/online_cmvn.conf\n" % self.modeldir).encode('utf8'))
    self.ie_conf_f.write((u"--ivector-period=%d\n" % online_ivector_period).encode('utf8'))
    self.ie_conf_f.write((u"--splice-config=%s\n" % splice_conf_filename).encode('utf8'))
    self.ie_conf_f.write((u"--lda-matrix=%s/extractor/final.mat\n" % self.modeldir).encode('utf8'))
    self.ie_conf_f.write((u"--global-cmvn-stats=%s/extractor/global_cmvn.stats\n" % self.modeldir).encode('utf8'))
    self.ie_conf_f.write((u"--diag-ubm=%s/extractor/final.dubm\n" % self.modeldir).encode('utf8'))
    self.ie_conf_f.write((u"--ivector-extractor=%s/extractor/final.ie\n" % self.modeldir).encode('utf8'))
    self.ie_conf_f.write((u"--num-gselect=%d\n" % num_gselect).encode('utf8'))
    self.ie_conf_f.write((u"--min-post=%f\n" % min_post).encode('utf8'))
    self.ie_conf_f.write((u"--posterior-scale=%f\n" % posterior_scale).encode('utf8'))
    self.ie_conf_f.write((u"--max-remembered-frames=1000\n").encode('utf8'))
    self.ie_conf_f.write((u"--max-count=%d\n" % max_count).encode('utf8'))
    self.ie_conf_f.flush()

Could you kindly elaborate a little on the source directories of those files?

Thanks! Kelvin
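
The paths in the snippet quoted above can be turned into a quick sanity check before attempting to decode with a self-built model. The following is only a minimal sketch: the relative paths are copied from that snippet, while the names `REQUIRED_FILES` and `missing_model_files` are hypothetical helpers and not part of py-kaldi-asr. It does not say where in a Kaldi experiment tree each file comes from; it only reports which of the expected files are absent from a model directory.

```python
import os

# Relative paths copied from the quoted snippet. "{model}" stands for the
# model subdirectory name (self.model in that snippet).
REQUIRED_FILES = [
    "conf/mfcc_hires.conf",
    "conf/online_cmvn.conf",
    "ivectors_test_hires/conf/splice.conf",
    "extractor/final.mat",
    "extractor/global_cmvn.stats",
    "extractor/final.dubm",
    "extractor/final.ie",
    "{model}/final.mdl",
    "{model}/graph/words.txt",
    "{model}/graph/HCLG.fst",
    "{model}/graph/phones/align_lexicon.int",
]


def missing_model_files(modeldir, model):
    """Return the expected file paths that do not exist under modeldir.

    Hypothetical helper for illustration; not part of py-kaldi-asr.
    """
    missing = []
    for rel in REQUIRED_FILES:
        path = os.path.join(modeldir, rel.format(model=model))
        if not os.path.isfile(path):
            missing.append(path)
    return missing
```

For example, calling `missing_model_files("/path/to/my-aishell-model", "model")` (both arguments are placeholders) returns an empty list once every file named in the snippet is in place.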

kelvinqin commented 3 years ago

Guenter, I have figured it out :-) Thanks! Kelvin

svenha commented 3 years ago

@kelvinqin As aishell is a popular recipe, would you mind sharing your solution? :-)

gooofy / py-kaldi-asr

how to run decoding on chain model built by myself #45