daanzu / kaldi-active-grammar

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time
GNU Affero General Public License v3.0

Other languages #21

Open daanzu opened 4 years ago

daanzu commented 4 years ago

In the future, it should be able to support a lot more, but the work is in finding decent models for other languages, and then some minor modifications to enable their use in KaldiAG.

https://github.com/dictation-toolbox/dragonfly/pull/241

daanzu commented 4 years ago

Please let me know if there is a specific language you are looking for, and I can see about using it for testing.

SwimmingTiger commented 3 years ago

I want to convert any of the following Chinese Mandarin models to be compatible with KAG. Thanks for any help or documentation.

I have no experience with Kaldi. Currently the only environment I can run is the one from kaldi-dragonfly-winpython37.zip. Once I have a usable model, I will develop my application with dragonfly and Python.

And I know something about CMUSphinx. I tried Sphinx4 and found that it lacked some features I needed. So I switched to dragonfly/KAG. The English model in kaldi-dragonfly-winpython37.zip perfectly meets my needs, but my program needs to support more languages, especially Chinese.

daanzu commented 3 years ago

@SwimmingTiger It looks like the M11 model should be easiest, because it doesn't use pitch features, which I haven't implemented in this project. I will try to find time to try converting it to work with this, which should also give me a good opportunity to document the process clearly. However, basically the process is mostly just trying to get the various model files to be in the right place, as shown by how they are in the currently published models. There can be difficulty depending on which exact files have been published for the model you are trying to import.
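For anyone attempting this, a minimal sketch of checking "are the files in the right place": it compares the file names in a candidate model directory against a known-working published model. The function names are hypothetical and not part of KaldiAG, and only file names are compared, not contents, so a clean result does not mean the model will load.

```python
from pathlib import Path


def list_model_files(model_dir):
    """Return the set of file paths under model_dir, relative to it."""
    root = Path(model_dir)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare_layouts(reference_dir, candidate_dir):
    """Compare a candidate model directory against a working reference.

    Returns (missing, extra): files found only in the reference, and files
    found only in the candidate. Contents are not inspected.
    """
    reference = list_model_files(reference_dir)
    candidate = list_model_files(candidate_dir)
    return sorted(reference - candidate), sorted(candidate - reference)
```

Calling `compare_layouts(published_model_dir, new_model_dir)` with the two directory paths gives a starting list of files that still need to be located or generated for the new model.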

SwimmingTiger commented 3 years ago

> basically the process is mostly just trying to get the various model files to be in the right place, as shown by how they are in the currently published models.

So I only need to move some files to the right locations, without converting their contents?

daanzu commented 3 years ago

@SwimmingTiger Correct, I think that may work, depending on exactly what files are included in your model package and what their structure is. However, it has been a while since I converted a fresh external model, and I haven't tried another language yet, so I am not sure. Please give it a try, and let me know how it goes.

SwimmingTiger commented 3 years ago

Today I finally tried to port http://kaldi-asr.org/models/m11 to KAG. Then I immediately encountered some unsolvable problems.

My actions (recorded as some Linux shell commands):

# rename the default model
mv kaldi_model kaldi_model.default

# rename http://kaldi-asr.org/models/m11
mv multi_cn_chain_sp_online kaldi_model

# KAG needs this file
cp kaldi_model.default/KAG_VERSION kaldi_model/

Then I got two None errors here when I ran the demo:

https://github.com/daanzu/kaldi-active-grammar/blob/869cd4616bee0b29f6cc7aa5c58e2a838c3103bb/kaldi_active_grammar/model.py#L326-L327

Try to bypass it:

# copy from the default model
cp kaldi_model.default/align_lexicon.int kaldi_model.default/align_lexicon.base.int kaldi_model.default/lexiconp_disambig.txt kaldi_model.default/lexiconp_disambig.base.txt kaldi_model/

Then I got this with the demo: kaldi_active_grammar.KaldiError: missing nonterms in 'phones.txt'.

https://github.com/daanzu/kaldi-active-grammar/blob/869cd4616bee0b29f6cc7aa5c58e2a838c3103bb/kaldi_active_grammar/model.py#L245

Try to bypass it:

cp kaldi_model.default/phones.nonterm.txt kaldi_model
cp kaldi_model/phones.txt kaldi_model/phones.base.txt
cat kaldi_model/phones.nonterm.txt >> kaldi_model/phones.txt
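
A note on the `cat ... >>` step above: Kaldi symbol tables are lines of `symbol integer-id`, so lines appended from another model keep that model's IDs, which can collide with IDs already used in the target table. A minimal sketch of appending symbols with fresh IDs instead, assuming phones.nonterm.txt uses the same two-column format (not confirmed in this thread). The helper names are hypothetical, and renumbering alone is not claimed to be enough to make the model load.

```python
def parse_symbol_table(lines):
    """Parse 'symbol id' lines into a list of (symbol, id) pairs, keeping order."""
    table = []
    for line in lines:
        fields = line.split()
        if len(fields) == 2:
            table.append((fields[0], int(fields[1])))
    return table


def append_with_fresh_ids(base_table, extra_symbols):
    """Append symbols to base_table, numbering them after the highest existing ID.

    Symbols already present in base_table are skipped.
    """
    next_id = max((sym_id for _, sym_id in base_table), default=-1) + 1
    known = {symbol for symbol, _ in base_table}
    merged = list(base_table)
    for symbol in extra_symbols:
        if symbol in known:
            continue
        merged.append((symbol, next_id))
        known.add(symbol)
        next_id += 1
    return merged


def format_symbol_table(table):
    """Render (symbol, id) pairs back into symbol-table text."""
    return "".join(f"{symbol} {sym_id}\n" for symbol, sym_id in table)
```

The intended use is to parse the new model's phones.txt as the base table, take only the symbol names from the nonterm file, and write the formatted result back out.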

Got No such file or directory: 'kaldi_model\\words.base.txt'.

Bypass:

cp kaldi_model/words.txt kaldi_model/words.base.txt

Got kaldi_active_grammar.KaldiError: missing nonterms in 'words.base.txt'.

Bypass:

cp kaldi_model.default/words.nonterm.txt kaldi_model
cat kaldi_model/words.nonterm.txt >> kaldi_model/words.base.txt

Got:

Bypass:

cp kaldi_model.default/nonterminals.txt kaldi_model.default/left_context_phones.txt kaldi_model/

Got FATAL: FstCompiler: Symbol "b_B" is not mapped to any integer arc ilabel, symbol table = kaldi_model\phones.txt, source = <fst__compile_text>, line = 7

Unable to resolve. It seems that some files copied from the default model do not match the model http://kaldi-asr.org/models/m11. And I don't know how to generate those files:

align_lexicon.base.int
align_lexicon.int
left_context_phones.txt
lexiconp_disambig.base.txt
lexiconp_disambig.txt
nonterminals.txt

And I don't know what nonterm is and how to add it to the model http://kaldi-asr.org/models/m11.
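
The `b_B` error above is what one would expect when a lexicon copied from the English model refers to phones that are not in the Chinese model's phones.txt. A minimal sketch for listing every such phone up front, assuming each lexicon line has the form `word pron-prob phone1 phone2 ...` (my understanding of lexiconp_disambig.txt, not confirmed in this thread). The function names are hypothetical.

```python
def load_symbols(lines):
    """Return the set of symbol names from 'symbol id' symbol-table lines."""
    return {line.split()[0] for line in lines if line.strip()}


def unmapped_phones(lexicon_lines, known_phones):
    """Return the sorted phones used in the lexicon but absent from known_phones.

    Assumes each lexicon line is: word, pronunciation probability, then phones.
    """
    missing = set()
    for line in lexicon_lines:
        fields = line.split()
        # fields[0] is the word and fields[1] the probability; the rest are phones.
        missing.update(p for p in fields[2:] if p not in known_phones)
    return sorted(missing)
```

If this returns anything for a given lexicon and phones.txt pair, the two files come from different phone sets and the graph compiler will fail on the first unmapped symbol.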

I need some help. @daanzu

SwimmingTiger commented 3 years ago

I tried to compile https://github.com/kaldi-asr/kaldi, but I couldn't find a suitable tool to generate these files from it.

align_lexicon.base.int
align_lexicon.int
left_context_phones.txt
lexiconp_disambig.base.txt
lexiconp_disambig.txt
nonterminals.txt

daanzu commented 3 years ago

@SwimmingTiger You will definitely need to use the phone set from the new model rather than from the English model. You also need to use the lexicon from the new model. However, both will need to be modified slightly to match what was added to the English model. You should compare how my English model differs from a standard English model.
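
A minimal sketch of one way to do that comparison, using Python's standard `difflib` to list the lines that a modified file adds relative to its standard counterpart (for example, the same symbol table or lexicon taken from each model). The function name is hypothetical, and which pairs of files are worth comparing is an assumption left to the reader.

```python
import difflib


def added_lines(standard_lines, modified_lines):
    """Return the lines that the modified file adds relative to the standard one.

    Both arguments are lists of lines without trailing newlines.
    """
    diff = difflib.unified_diff(standard_lines, modified_lines, n=0, lineterm="")
    # Keep '+' lines but drop the '+++' file header emitted by unified_diff.
    return [line[1:] for line in diff
            if line.startswith("+") and not line.startswith("+++")]
```

Reading both files with `Path(...).read_text(encoding="utf-8").splitlines()` and passing the results in gives the additions that would need an equivalent in the new model.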