Once I ran prepare_lang.sh with `#nonterm:end` as an additional nonterminal, I got past the second issue. But now the dictation.fst generation is crashing as follows. (Note: the model was trained without nonterminals, in case that's relevant; I'm taking an existing model and trying to convert it to use the active grammar framework.)
```
VLOG[1] (compile-graph-agf[5.5.0~1-1f5a4]:main():compile-graph-agf.cc:237) Composing CLG fst...
ERROR (compile-graph-agf[5.5.0~1-1f5a4]:main():compile-graph-agf.cc:245) Grammar-fst graph creation only supports models with left-biphone context. (--nonterm-phones-offset option was supplied).

[ Stack-Trace: ]
0  libkaldi-base.dylib  0x00000001038275bd kaldi::MessageLogger::LogMessage() const + 813
1  compile-graph-agf    0x0000000102fc9658 kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&) + 24
2  compile-graph-agf    0x0000000102fc851a main + 10698
3  libdyld.dylib        0x00007fff2036cf5d start + 1
4  ???                  0x000000000000000b 0x0 + 11

ERROR (compile-graph-agf[5.5.0~1-1f5a4]:main():compile-graph-agf.cc:310) Exception in compile-graph-agf

[ Stack-Trace: ]
0  libkaldi-base.dylib  0x00000001038275bd kaldi::MessageLogger::LogMessage() const + 813
1  compile-graph-agf    0x0000000102fc9658 kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&) + 24
2  compile-graph-agf    0x0000000102fc95a1 main + 14929
3  libdyld.dylib        0x00007fff2036cf5d start + 1
4  ???                  0x000000000000000b 0x0 + 11

libc++abi: terminating with uncaught exception of type kaldi::KaldiFatalError: kaldi::KaldiFatalError
[1]  85686 abort  --arcsort-grammar --nonterm-phones-offset=187 --simplify-lg=true
(kaldi) ➜ kaldi /Users/gopik/opt/anaconda3/envs/kaldi/lib/python3.9/site-packages/kaldi_active_grammar/exec/macos/compile-graph-agf --arcsort-grammar --nonterm-phones-offset=187 --read-disambig-syms=agf_model/disambig.int --simplify-lg=true --verbose=20 agf_model/tree agf_model/final.mdl agf_model/L_disambig.fst agf_model/G.fst tmp/9a08772eb06c22a0d1f4aefe420b4883.fst
```
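(For reference, a minimal sketch of how a user nonterminal such as `#nonterm:end` is typically added when rebuilding the lang directory; the dict paths and the `nonterminals.txt` mechanism follow standard Kaldi grammar-FST recipes and are assumptions, not taken from this thread.)

```sh
# prepare_lang.sh picks up user-defined nonterminals from
# <dict-src-dir>/nonterminals.txt (assumed standard Kaldi layout).
echo '#nonterm:end' >> data/local/dict/nonterminals.txt

# Rebuild the lang dir; '<unk>' as the OOV entry is an assumption.
utils/prepare_lang.sh data/local/dict '<unk>' data/local/lang_tmp data/lang
```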
Ah, yes, the max disambiguation symbol is not currently adjustable by the library user, but there are only a few places in the code where this needs to be changed.

@gopik Regarding the compilation error, do you know what the architecture of your model is? Unfortunately, we can only handle left-biphone context models currently. This includes the popular tdnn_f "chain" models.
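One way to confirm the model's phonetic context (a sketch using Kaldi's stock `tree-info` tool; `agf_model/tree` is the tree path from the command above):

```sh
# A left-biphone model reports context-width 2 and central-position 1;
# a monophone tree (the e2e flatstart default) reports context-width 1.
tree-info agf_model/tree | grep -E 'context-width|central-position'
```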
I used the wsj/s5/local/chain/e2e/run_tdnn_flatstart.sh recipe to train the model, but the nonterminals were not in the lang directory when the model was trained. I was trying to reuse that model with AGF.
Also, once I copied the prepared lang with nonterminals into the appropriate model_dir (L_disambig.fst), the #14 disambiguation-symbol issue went away. I was able to call compile successfully (with a few warnings), but it crashed with the above error during init_decoder.
I am not familiar with e2e Kaldi training, but I think a chain model should work. You definitely don't need to train with the nonterminals. Training with the same normal lexicon as I use does make things easier, but it should not be necessary.
The fact that the #14 disambiguation symbol issue disappeared makes me suspect something got mixed up, though that doesn't explain that particular error appearing.
Yes, it looks like it disappeared because it's no longer attempting to create those lexicon disambiguation FSTs (I created them using prepare_lang.sh).
I realized that the crash message is a red herring. The actual issue is that decoder initialization failed due to some missing config options: e2e training doesn't need ivector training/adaptation, but AGF expects ivector configs. So I just copied the configs from another model.
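For what it's worth, standard Kaldi recipes generate these online ivector configs with `prepare_online_decoding.sh` rather than copying them by hand; a sketch, with every path an assumption rather than something from this thread:

```sh
# Writes conf/ivector_extractor.conf and related online configs into the
# output dir. Assumes a trained ivector extractor exists (e2e flatstart
# recipes may not have one, in which case copying configs from another
# model is the fallback).
steps/online/nnet3/prepare_online_decoding.sh \
  data/lang exp/nnet3/extractor exp/chain/e2e_tdnn exp/chain/e2e_tdnn_online
```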
After supplying all the ivector-related configs, I'm now failing with the following error:

```
[KALDI severity=-2] AgfNNet3OnlineModelWrapper requires exactly one of top_fst and top_fst_filename
```

Full error trace:
```
kaldi.model (WARNING): model_dir has no version information; errors below may indicate an incompatible model
kaldi.compiler (ERROR): cannot find dictation fst: agf_model/Dictation.fst
ERROR ([5.5.0~1-1f5a4]:CompileGrammar():./compile-graph-agf.hh:254) Grammar-fst graph creation only supports models with left-biphone context. (--nonterm-phones-offset option was supplied).

[ Stack-Trace: ]
0  libkaldi-base.dylib       0x000000011c7e55bd kaldi::MessageLogger::LogMessage() const + 813
1  libkaldi-dragonfly.dylib  0x00000001110de988 bool dragonfly::BaseNNet3OnlineModelWrapper::Decode<kaldi::SingleUtteranceNnet3DecoderTpl<fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl

WARNING ([5.5.0~1-1f5a4]:nnet3_agfcompile_graph():agf-sub-nnet3.cc:425) Trying to survive fatal exception: kaldi::KaldiFatalError
[KALDI severity=-2] AgfNNet3OnlineModelWrapper requires exactly one of top_fst and top_fst_filename
[KALDI severity=-1] Trying to survive fatal exception: kaldi::KaldiFatalError
Traceback (most recent call last):
  File "
```
Looks like it's failing to compile G.fst into the dictation FST, hence top_fst/top_fst_filename is not available.
Ah, yes, I forgot. You need to run `python -m kaldi_active_grammar compile_agf_dictation_graph -v -m {{model_dir}} {{model_dir}}/G.fst` to generate the `Dictation.fst`. The `G.fst` can be from one of my models, or you can use any suitable language model.
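Concretely, for the layout used earlier in this thread (assuming `agf_model` is the model directory), that would be something like:

```sh
python -m kaldi_active_grammar compile_agf_dictation_graph -v -m agf_model agf_model/G.fst
```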
Thanks @daanzu. The e2e recipe doesn't use biphones by default, which is why the grammar-FST code was complaining. Ref: https://github.com/kaldi-asr/kaldi/blob/cafb8b315ae588cad0210655be539c6742e2e829/egs/wsj/s5/steps/nnet3/chain/e2e/prepare_e2e.sh#L19
Sorry, I didn't realize this earlier.
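A sketch of the knob involved, assuming the recipe's interface hasn't changed since the referenced commit: `prepare_e2e.sh` defaults to a monophone tree (`type=mono` at the linked line), while `compile-graph-agf` requires left-biphone context:

```sh
# Retrain the e2e tree with left-biphone context. The argument layout and
# paths here are assumptions based on the wsj/s5 e2e recipe.
steps/nnet3/chain/e2e/prepare_e2e.sh --type biphone \
  data/train data/lang exp/chain/e2e_biphone_tree
```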
Error 1:

```
FATAL: FstCompiler: Symbol "#14" is not mapped to any integer arc ilabel, symbol table = agf_model/phones.txt, source =, line = 4
```
This is some hardcoded disambiguation symbol, but my model's phones.txt only has disambiguation symbols up to #10.
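To see which disambiguation symbols a model actually defines (a sketch; `agf_model/phones.txt` is the symbol table named in the error):

```sh
# Disambiguation symbols are the #N entries in the phone symbol table;
# the last one listed is the maximum.
grep '^#' agf_model/phones.txt | tail
```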
Error 2:

```
File "/Users/gopik/opt/anaconda3/envs/kaldi/lib/python3.9/site-packages/kaldi_active_grammar/compiler.py", line 505, in compile_top_fst
  return self._build_top_fst(nonterms=['#nonterm:rule'+str(i) for i in range(self._max_rule_id + 1)], noise_words=self._noise_words).compile()
File "/Users/gopik/opt/anaconda3/envs/kaldi/lib/python3.9/site-packages/kaldi_active_grammar/compiler.py", line 519, in _build_top_fst
  fst.add_arc(state_return, state_final, None, '#nonterm:end')
File "/Users/gopik/opt/anaconda3/envs/kaldi/lib/python3.9/site-packages/kaldi_active_grammar/wfst.py", line 275, in add_arc
  olabel_id = self.word_to_olabel_map[olabel]
KeyError: '#nonterm:end'
```
I see that #nonterm_begin and #nonterm_end are available (note the underscore), but not the ":" variants. Should I be adding these to the nonterminals file?
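For context, a sketch of how to inspect what actually landed in the symbol tables. As I understand Kaldi's grammar framework, the underscore forms `#nonterm_begin`/`#nonterm_end` are built-ins that `prepare_lang.sh` adds on its own, while colon forms like `#nonterm:end` are user nonterminals that must be listed in `nonterminals.txt` (paths below are assumptions):

```sh
# Built-in markers use underscores; user-defined nonterminals use colons
# and appear in both the phone and word symbol tables.
grep 'nonterm' data/lang/phones.txt
grep 'nonterm' data/lang/words.txt
```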
Since I was unable to convert the model using the script (it's incomplete), I prepared a lang directory with all the nonterminals this toolkit requires (dictation, cloud_dictation, the 1000 rule placeholders, etc.).
Any idea what I might be doing wrong?