alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

Graph compilation - Error missing file #1148

Open makdatascientist opened 2 years ago

makdatascientist commented 2 years ago

Hi, I have downloaded the model compile file from link https://alphacephei.com/vosk/models/vosk-model-en-us-0.22-compile.zip and while running the compile-graph.sh to compile graph in kaldi, I got a following error like below

root@MAK:~/kaldi/tools/model/vosk-model-en-us-0.22-compile# compile-graph.sh
+ rm -rf 'data/*.lm.gz' data/lang_local data/dict data/lang data/lang_test data/lang_test_rescore
+ rm -rf exp/lgraph
+ rm -rf exp/graph
+ mkdir -p data/dict
+ cp db/phone/extra_questions.txt db/phone/extra_questions.txt:Zone.Identifier db/phone/nonsilence_phones.txt db/phone/nonsilence_phones.txt:Zone.Identifier db/phone/optional_silence.txt db/phone/optional_silence.txt:Zone.Identifier db/phone/silence_phones.txt db/phone/silence_phones.txt:Zone.Identifier data/dict
+ python3 ./dict.py
+ ngram-count -wbdiscount -order 4 -text db/extra.txt -lm data/extra.lm.gz
/root/kaldi/tools/model/vosk-model-en-us-0.22-compile/compile-graph.sh: line 15: ngram-count: command not found
+ ngram -order 4 -lm db/en-230k-0.5.lm.gz -mix-lm data/extra.lm.gz -lambda 0.95 -write-lm data/en-mix.lm.gz
/root/kaldi/tools/model/vosk-model-en-us-0.22-compile/compile-graph.sh: line 16: ngram: command not found
+ ngram -order 4 -lm data/en-mix.lm.gz -prune 3e-8 -write-lm data/en-mixp.lm.gz
/root/kaldi/tools/model/vosk-model-en-us-0.22-compile/compile-graph.sh: line 17: ngram: command not found
+ ngram -lm data/en-mixp.lm.gz -write-lm data/en-mix-small.lm.gz
/root/kaldi/tools/model/vosk-model-en-us-0.22-compile/compile-graph.sh: line 18: ngram: command not found
+ utils/prepare_lang.sh data/dict '[unk]' data/lang_local data/lang
utils/prepare_lang.sh data/dict [unk] data/lang_local data/lang
Checking data/dict/silence_phones.txt ...
--> reading data/dict/silence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/dict/silence_phones.txt is OK

Checking data/dict/optional_silence.txt ...
--> reading data/dict/optional_silence.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/dict/optional_silence.txt is OK

Checking data/dict/nonsilence_phones.txt ...
--> reading data/dict/nonsilence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/dict/nonsilence_phones.txt is OK

Checking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.

Checking data/dict/lexicon.txt
--> reading data/dict/lexicon.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/dict/lexicon.txt is OK

Checking data/dict/extra_questions.txt ...
--> reading data/dict/extra_questions.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/dict/extra_questions.txt is OK
--> SUCCESS [validating dictionary directory data/dict]

**Creating data/dict/lexiconp.txt from data/dict/lexicon.txt
utils/prepare_lang.sh: line 547: fstaddselfloops: command not found
ERROR: FstHeader::Read: Bad FST header: standard input
+ utils/format_lm.sh data/lang data/en-mix-small.lm.gz data/dict/lexicon.txt data/lang_test
Converting 'data/en-mix-small.lm.gz' to FST
gzip: data/en-mix-small.lm.gz: No such file or directory
utils/format_lm.sh: line 55: arpa2fst: command not found
+ utils/mkgraph.sh --self-loop-scale 1.0 data/lang_test exp/chain/tdnn exp/chain/tdnn/graph
mkgraph.sh: expected data/lang_test/G.fst to exist
+ utils/build_const_arpa_lm.sh data/en-mix.lm.gz data/lang_test data/lang_test_rescore
utils/build_const_arpa_lm.sh: line 45: arpa-to-const-arpa: command not found
+ rnnlm/change_vocab.sh data/lang/words.txt exp/rnnlm exp/rnnlm_out
rnnlm/change_vocab.sh: Copying config directory.
rnnlm/change_vocab.sh: Re-generating words.txt, unigram_probs.txt, word_feats.txt and word_embedding.final.mat.
rnnlm/get_word_features.py: made features for 312336 words.
rnnlm/change_vocab.sh: line 75: rnnlm-get-word-embedding: command not found
+ utils/mkgraph_lookahead.sh --self-loop-scale 1.0 data/lang exp/chain/tdnn data/en-mix-small.lm.gz exp/chain/tdnn/lgraph
utils/mkgraph_lookahead.sh : compiling grammar data/en-mix-small.lm.gz
utils/mkgraph_lookahead.sh : expected data/en-mix-small.lm.gz to exist

Kindly suggest where I can find the en-mix-small.lm.gz file.

nshmyrev commented 2 years ago

It says you are missing the SRILM ngram tools in your PATH.
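In case it helps, a minimal sketch of getting the SRILM tools onto the PATH, assuming a standard Kaldi checkout at ~/kaldi as in the log above (paths and installer prompts may differ on your system):

cd ~/kaldi/tools
extras/install_srilm.sh      # may prompt for registration info or ask for srilm.tgz; records SRILM paths in env.sh
. ./env.sh                   # export the SRILM paths into the current shell
which ngram-count ngram      # both should resolve before re-running compile-graph.sh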

nshmyrev commented 2 years ago

And other Kaldi binaries too; Kaldi is probably not compiled.
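A rough sketch of building Kaldi and then checking the binaries the log complained about, assuming the /root/kaldi tree from the log (build flags may vary on your machine):

cd /root/kaldi/tools && make -j "$(nproc)"
cd ../src && ./configure --shared
make depend -j "$(nproc)" && make -j "$(nproc)"
# after a successful build, these should all be found once the model's path.sh is sourced:
which fstaddselfloops arpa2fst arpa-to-const-arpa rnnlm-get-word-embedding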

makdatascientist commented 2 years ago

Thanks a lot for your reply, really appreciated. Are the steps below (from the model compilation documentation) enough to compile the model? Kindly reply.

Graph compilation
For performance, all the models are compiled into more compact structures: FST graphs. If you want to modify them, for example to add new words or adapt them to a domain, you need to run several steps of graph compilation.

Not every Vosk model allows vocabulary modification of the graph. Some, like US English, big Russian, or German, include all the necessary files (the "tree" file from the model, which contains information about phoneme context dependency). Others don't have the required files; you need to contact Alphacephei to get access to them.

Hardware
Compilation is not very slow, but it still requires significant hardware: a Linux server with at least 32 GB of RAM and 100 GB of disk space. It is unlikely you can compile a big model in a virtual machine. Small models require fewer resources.

Software
The following software must be pre-installed on a server:

Kaldi
SRILM
Phonetisaurus (with pip3 install phonetisaurus)
In the future we might provide a Docker image for model compilation; for now you have to install these yourself.
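A quick sanity check (just a sketch) that the three prerequisites are visible from the shell that will run the scripts; the Python module name here is assumed to match the pip package above:

source path.sh                      # brings Kaldi/OpenFst (and SRILM, if configured) onto PATH
which ngram-count ngram             # SRILM
which fstaddselfloops arpa2fst      # Kaldi / OpenFst binaries
python3 -c "import phonetisaurus"   # Phonetisaurus (module name assumed from the pip package)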

Update process
Download the update package, for example:

Russian - https://alphacephei.com/vosk/models/vosk-model-ru-0.22-compile.zip

US English - https://alphacephei.com/vosk/models/vosk-model-en-us-0.22-compile.zip

German - https://alphacephei.com/vosk/models/vosk-model-de-0.21-compile.zip

French - https://alphacephei.com/vosk/models/vosk-model-fr-0.6-linto-2.2.0-compile.zip

Other language packs are available on request. Please contact us at contact@alphacephei.com

Unpack the archive and point KALDI_ROOT in the path.sh script to your Kaldi checkout
Add your extra texts to db/extra.txt
Optionally add manual word pronunciations (phones) to db/extra.dic
Run compile-graph.sh. The update takes about 15 minutes; watch for errors during the process.
Run decode.sh to check that decoding works. Watch the WER in the decoding folder.
Optionally, check that the g2p properly predicted the phonemes at the end of data/dict/lexicon.txt. If needed, update the g2p model with new words. (These steps are sketched in the example after this list.)
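Put together, the update steps above might look roughly like this. It is only a sketch: the package URL and /root/kaldi come from earlier in this thread, my-domain-texts.txt is a hypothetical file of your own, and the sed line assumes path.sh uses the usual "export KALDI_ROOT=..." convention.

wget https://alphacephei.com/vosk/models/vosk-model-en-us-0.22-compile.zip
unzip vosk-model-en-us-0.22-compile.zip && cd vosk-model-en-us-0.22-compile
sed -i 's|^export KALDI_ROOT=.*|export KALDI_ROOT=/root/kaldi|' path.sh   # point at your Kaldi checkout
cat my-domain-texts.txt >> db/extra.txt          # add extra texts
./compile-graph.sh 2>&1 | tee compile.log        # compile; watch compile.log for errors
./decode.sh                                      # test decoding, then check WER in the decoding folder
tail -n 20 data/dict/lexicon.txt                 # eyeball the g2p-predicted phones for new words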

Outputs
Depending on your needs, you might pick some result files from the compilation folder. Remember that if you changed the graph, you also need to update the rescoring/RNNLM parts; otherwise they will go out of sync and accuracy will be low.

For a large model, pick the following parts:

exp/chain/tdnn/graph
data/lang_test_rescore/G.fst and data/lang_test_rescore/G.carpa into the rescore folder
exp/rnnlm_out into the rnnlm folder (you can delete some unnecessary files from rnnlm too)

If you don't want to use the RNNLM, delete the rnnlm folder from the model.

If you don't want to use rescoring, delete the rescore folder from the model; that will save some runtime memory, but accuracy will be lower.

For a small model, just pick the required files from exp/chain/tdnn/lgraph.
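As a concrete illustration of collecting those outputs (a sketch only; MODEL_DIR is a hypothetical destination laid out like your existing model):

MODEL_DIR=~/my-vosk-model                        # hypothetical destination directory
cp -r exp/chain/tdnn/graph "$MODEL_DIR"/graph    # large model: main graph
mkdir -p "$MODEL_DIR"/rescore
cp data/lang_test_rescore/G.fst data/lang_test_rescore/G.carpa "$MODEL_DIR"/rescore/
cp -r exp/rnnlm_out "$MODEL_DIR"/rnnlm           # optional RNNLM rescoring
# small model instead: take the required files from exp/chain/tdnn/lgraph
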
erdoganensar commented 1 year ago

Hello, when I run ./compile-graph.sh it ends as below. Is this normal?

./compile-graph.sh: line 15: ngram-count: command not found
./compile-graph.sh: line 16: ngram: command not found
utils/prepare_lang.sh data/dict [unk] data/lang_local data/lang
Checking data/dict/silence_phones.txt ...
--> reading data/dict/silence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/dict/silence_phones.txt is OK

Checking data/dict/optional_silence.txt ...
--> reading data/dict/optional_silence.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/dict/optional_silence.txt is OK

Checking data/dict/nonsilence_phones.txt ...
--> reading data/dict/nonsilence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/dict/nonsilence_phones.txt is OK

Checking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.

Checking data/dict/lexicon.txt
--> reading data/dict/lexicon.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/dict/lexicon.txt is OK

Checking data/dict/extra_questions.txt ...
--> data/dict/extra_questions.txt is empty (this is OK)
--> SUCCESS [validating dictionary directory data/dict]

**Creating data/dict/lexiconp.txt from data/dict/lexicon.txt
fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int
prepare_lang.sh: validating output directory
utils/validate_lang.pl data/lang
Checking existence of separator file
separator file data/lang/subword_separator.txt is empty or does not exist, deal in word case.
Checking data/lang/phones.txt ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/phones.txt is OK

Checking words.txt: #0 ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/words.txt is OK

Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
--> silence.txt and nonsilence.txt are disjoint
--> silence.txt and disambig.txt are disjoint
--> disambig.txt and nonsilence.txt are disjoint
--> disjoint property is OK

Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
--> found no unexplainable phones in phones.txt

Checking data/lang/phones/context_indep.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 10 entry/entries in data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.{txt, int, csl} are OK

Checking data/lang/phones/nonsilence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 116 entry/entries in data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.{txt, int, csl} are OK

Checking data/lang/phones/silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 10 entry/entries in data/lang/phones/silence.txt
--> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.{txt, int, csl} are OK

Checking data/lang/phones/optional_silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.{txt, int, csl} are OK

Checking data/lang/phones/disambig.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 7 entry/entries in data/lang/phones/disambig.txt
--> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.{txt, int, csl} are OK

Checking data/lang/phones/roots.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 31 entry/entries in data/lang/phones/roots.txt
--> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
--> data/lang/phones/roots.{txt, int} are OK

Checking data/lang/phones/sets.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 31 entry/entries in data/lang/phones/sets.txt
--> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
--> data/lang/phones/sets.{txt, int} are OK

Checking data/lang/phones/extra_questions.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 9 entry/entries in data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.{txt, int} are OK

Checking data/lang/phones/word_boundary.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 126 entry/entries in data/lang/phones/word_boundary.txt
--> data/lang/phones/word_boundary.int corresponds to data/lang/phones/word_boundary.txt
--> data/lang/phones/word_boundary.{txt, int} are OK

Checking optional_silence.txt ...
--> reading data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.txt is OK

Checking disambiguation symbols: #0 and #1
--> data/lang/phones/disambig.txt has "#0" and "#1"
--> data/lang/phones/disambig.txt is OK

Checking topo ...

Checking word_boundary.txt: silence.txt, nonsilence.txt, disambig.txt ...
--> data/lang/phones/word_boundary.txt doesn't include disambiguation symbols
--> data/lang/phones/word_boundary.txt is the union of nonsilence.txt and silence.txt
--> data/lang/phones/word_boundary.txt is OK

Checking word-level disambiguation symbols...
--> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
Checking word_boundary.int and disambig.int
--> generating a 85 word/subword sequence
--> resulting phone sequence from L.fst corresponds to the word sequence
--> L.fst is OK
--> generating a 10 word/subword sequence
--> resulting phone sequence from L_disambig.fst corresponds to the word sequence
--> L_disambig.fst is OK

Checking data/lang/oov.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/oov.txt
--> data/lang/oov.int corresponds to data/lang/oov.txt
--> data/lang/oov.{txt, int} are OK

--> data/lang/L.fst is olabel sorted
--> data/lang/L_disambig.fst is olabel sorted
--> SUCCESS [validating lang directory data/lang]
utils/mkgraph_lookahead.sh : compiling grammar data/tr-mix.lm.gz
utils/mkgraph_lookahead.sh : expected data/tr-mix.lm.gz to exist