Open svenha opened 2 years ago
There were probably earlier errors; you need to read the full log.
There are no visible errors, only some warnings:
--> SUCCESS [validating lang directory data/lang]
+ utils/format_lm.sh data/lang data/en-mix-small.lm.gz data/dict/lexicon.txt data/lang_test
Converting 'data/en-mix-small.lm.gz' to FST
arpa2fst --disambig-symbol=#0 --read-symbol-table=data/lang_test/words.txt - data/lang_test/G.fst
LOG (arpa2fst[5.5.1046~1-76cd5]:Read():arpa-file-parser.cc:94) Reading \data\ section.
LOG (arpa2fst[5.5.1046~1-76cd5]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
WARNING (arpa2fst[5.5.1046~1-76cd5]:Read():arpa-file-parser.cc:219) line 8 [-5.564116 'a -0.003993742] skipped: word ''a' not in symbol table
There are many more, all about words with apostrophes. Can someone reproduce this? Just unpack and run ./compile-graph.sh inside.
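To see which words trigger these warnings before running arpa2fst, a minimal sketch like the following could compare the unigram section of the ARPA file against the symbol table. The function name `missing_unigrams` and the sample data are hypothetical, and the sketch assumes the usual layouts: `logprob word [backoff]` per ARPA line and `word id` per words.txt line.

```python
def missing_unigrams(arpa_lines, symbol_lines):
    """Return unigram words from an ARPA LM that are absent from a symbol table.

    Hypothetical helper for illustration; not part of Kaldi or Vosk.
    """
    # The first field of every non-empty words.txt line is the word itself.
    symbols = {line.split()[0] for line in symbol_lines if line.strip()}

    missing = []
    in_unigrams = False
    for line in arpa_lines:
        line = line.strip()
        if line.startswith("\\"):
            # Section markers such as \data\, \1-grams:, \2-grams:, \end\
            in_unigrams = (line == "\\1-grams:")
            continue
        if not in_unigrams or not line:
            continue
        fields = line.split()
        # Assumed layout: log-probability, word, optional backoff weight.
        if len(fields) >= 2 and fields[1] not in symbols:
            missing.append(fields[1])
    return missing


# Tiny made-up example mirroring the warning above.
sample_arpa = [
    "\\data\\",
    "ngram 1=3",
    "",
    "\\1-grams:",
    "-5.564116\t'a\t-0.003993742",
    "-2.1\thello\t-0.3",
    "-1.0\t</s>",
    "",
    "\\end\\",
]
sample_words = ["<eps> 0", "hello 1", "#0 2", "<s> 3", "</s> 4"]

print(missing_unigrams(sample_arpa, sample_words))
```

For the real files, the lines could be read with `gzip.open("data/en-mix-small.lm.gz", "rt")` and `open("data/lang_test/words.txt")`. If the list contains only words with apostrophes, that points to the lexicon and the language model disagreeing on how those words are written.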
Then it might be an opengrm/LD_LIBRARY_PATH issue; you need to run the commands manually and check what is going on.
OK. There is some confusion about opengrm versions. The path.sh of the update package expects version 1.3.10. Kaldi's installation script installs 1.3.12, and newer versions are around, too. Which version is known to work?
Both should work fine
I have the same problem using Kaldi with opengrm 1.3.12. Kaldi from https://github.com/kaldi-asr/kaldi with opengrm 1.3.7 runs without problems.
@kwiechen Thanks for the feedback. Do I understand correctly that the update package works if you switch both the Kaldi and the opengrm version? What happens if you narrow the problem down by changing only one of the two (opengrm, Kaldi)?
I have reinstalled kaldi from kaldi-master and opengrm 1.3.7 to solve this
Thanks @kwiechen. I followed your advice and it solved my problem. Unfortunately, I have no idea why the Vosk way did not work :-( Just a guess: as I cannot compile opengrm 1.3.10 or newer (probably incompatible with the openfst 1.7.2 that a normal Kaldi install provides), there might be a subtle version incompatibility.
We have this fix for the issue:
https://github.com/alphacep/kaldi/commit/11b67d387b547d1afb616ab8f95fd74c459d20c6
Please check that your scripts are up to date.
I have rebuilt the zip package with the required changes; please redownload it if you have an old version. You need to have utils/relabel_words.py.
I tried the update package for EN (vosk-model-en-us-0.22-compile.zip). Compared to version 0.21, which worked perfectly, this fails even if I do not add any sentences or words.