JRMeyer / multi-task-kaldi

An example directory for running Multi-Task Learning training on Kaldi neural networks. In Kaldi-speak, this is an egs dir for nnet3 training.
Apache License 2.0

Error when mkgraph #5

Open findmenowhere opened 3 years ago

findmenowhere commented 3 years ago

Hi, I tried to do multitask learning based on your code. Here is the problem:

After training, I need a graph to decode with the model. But when I ran utils/mkgraph, I got this:

tree-info /exp/multilingual/dir-dnn/task-asr/tree
tree-info /exp/multilingual/dir-dnn/task-asr/tree
make-h-transducer --disambig-syms-out=/exp/multilingual/dir-dnn/graph/disambig_tid.int --transition-scale=1.0 /librispeech/lang_nosp_test_tgsmall/tmp/ilabels_2_1 /exp/multilingual/dir-dnn/task-asr/tree /exp/multilingual/dir-dnn/task-asr/final.mdl
ERROR (make-h-transducer[5.5.8~1-fb99]:TupleToTransitionState():transition-model.cc:262) TransitionModel::TupleToTransitionState, tuple not found. (incompatible tree and model?)

[ Stack-Trace: ]
/data/home/v-wajiay/HopeStar/Model/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x82c) [0x7f4bd4d072da]
make-h-transducer(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x21) [0x4035f5]
/data/home/v-wajiay/HopeStar/Model/kaldi/src/lib/libkaldi-hmm.so(kaldi::TransitionModel::TupleToTransitionState(int, int, int, int) const+0xf2) [0x7f4bd5694ee0]
/data/home/v-wajiay/HopeStar/Model/kaldi/src/lib/libkaldi-hmm.so(kaldi::GetHmmAsFsa(std::vector<int, std::allocator >, kaldi::ContextDependencyInterface const&, kaldi::TransitionModel const&, kaldi::HTransducerConfig const&, std::unordered_map<std::pair<int, std::vector<int, std::allocator > >, fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl >, fst::VectorState<fst::ArcTpl<fst::TropicalWeightTpl >, std::allocator<fst::ArcTpl<fst::TropicalWeightTpl > > > >, kaldi::HmmCacheHash, std::equal_to<std::pair<int, std::vector<int, std::allocator > > >, std::allocator<std::pair<std::pair<int, std::vector<int, std::allocator > > const, fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl >, fst::VectorState<fst::ArcTpl<fst::TropicalWeightTpl >, std::allocator<fst::ArcTpl<fst::TropicalWeightTpl > > > >> > >)+0x61f) [0x7f4bd569ea84]
/data/home/v-wajiay/HopeStar/Model/kaldi/src/lib/libkaldi-hmm.so(kaldi::GetHTransducer(std::vector<std::vector<int, std::allocator >, std::allocator<std::vector<int, std::allocator > > > const&, kaldi::ContextDependencyInterface const&, kaldi::TransitionModel const&, kaldi::HTransducerConfig const&, std::vector<int, std::allocator >)+0x5fd) [0x7f4bd569f29d]
make-h-transducer(main+0x4e4) [0x402fda]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f4bd43bf830]
make-h-transducer(_start+0x29) [0x402a29]

I use the final.mdl under the specific task's folder. I also noticed that mkgraph.sh in your project differs from the one in the official Kaldi code. Do you have any advice on how to solve this problem?

JRMeyer commented 3 years ago

Are you using your own language model? It looks like the acoustic model isn't compatible with the language model.

findmenowhere commented 3 years ago

I used this multitask model by changing the egs directly. I have an English corpus from native and non-native speakers. The first task is plain ASR. I directly changed the egs (LabelDim n -> LabelDim 2, plus all the output labels) to classify native vs. non-native speakers. In the decoding part, I only want to check the first task, ASR. The language model I used is LibriSpeech's test_tgsmall LM. I have tried comput_output.sh and the result seems very reasonable.

JRMeyer commented 3 years ago

Hi @wjyiwjyi -- from what you've said, it seems like the issue is that the LM you've compiled isn't compatible with the language model... Unfortunately I'm not able to help you troubleshoot this, but I think you should be able to get some help on kaldi-help or from the official docs.

findmenowhere commented 3 years ago

Thanks, Josh. But I'm still confused: what do you mean by "the LM I compiled" and "the language model"? I thought the code in run_nnet3_multitask only uses the language model to make the graph?