makdatascientist opened this issue 2 years ago
It says the SRILM `ngram` tool is missing from your PATH, and other Kaldi binaries are missing too; Kaldi is probably not compiled.
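A quick way to confirm this diagnosis is to check whether the tools are actually visible on PATH. The tool names below are taken from the error output in this thread, with `fstaddselfloops` standing in for the Kaldi/OpenFst binaries:

```shell
# Report which of the required tools are visible on PATH.
for tool in ngram-count ngram fstaddselfloops; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "ok: $tool -> $(command -v "$tool")"
  else
    echo "MISSING: $tool"
  fi
done
```

If any line says MISSING, fix the PATH (or finish compiling Kaldi/SRILM) before rerunning compile-graph.sh.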
Thanks a lot for your reply, really appreciated. To compile the model package, are the steps below enough? Kindly reply.
Graph compilation
For performance, all the models are compiled into more compact structures: FST graphs. If you want to modify them (add new words or adapt to a domain), you run several steps of graph compilation.
Not every Vosk model allows vocabulary modification of the graph. Some, like US English, big Russian, or German, include all the necessary files (the "tree" file from the model, which contains information about phoneme context dependency). Others lack the required files; you need to contact Alphacephei to get access to them.
Hardware
Compilation is not very slow, but it still requires significant hardware: a Linux server with at least 32 GB of RAM and 100 GB of disk space. It is unlikely you can compile a big model in a virtual machine. Small models require less.
Software
The following software must be pre-installed on a server:
Kaldi
SRILM
Phonetisaurus (installed with pip3 install phonetisaurus)
In the future we might provide a Docker image for model compilation; for now you have to set it up yourself.
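The compile scripts find these binaries through PATH. A sketch of the environment setup, assuming Kaldi lives under /opt/kaldi (a hypothetical location) and SRILM was installed into Kaldi's tools directory (a common layout, not guaranteed on your machine):

```shell
# Hypothetical locations -- adjust to where you actually built things.
export KALDI_ROOT=/opt/kaldi
# Kaldi and OpenFst binaries
export PATH="$KALDI_ROOT/src/bin:$KALDI_ROOT/src/fstbin:$KALDI_ROOT/tools/openfst/bin:$PATH"
# SRILM binaries (ngram, ngram-count); the machine-specific subdir varies
export PATH="$KALDI_ROOT/tools/srilm/bin:$KALDI_ROOT/tools/srilm/bin/i686-m64:$PATH"
# Phonetisaurus, per the list above:
# pip3 install phonetisaurus
```

After this, `command -v ngram-count` should print a path instead of failing.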
Update process
Download the update package, for example:
Russian - https://alphacephei.com/vosk/models/vosk-model-ru-0.22-compile.zip
US English - https://alphacephei.com/vosk/models/vosk-model-en-us-0.22-compile.zip
German - https://alphacephei.com/vosk/models/vosk-model-de-0.21-compile.zip
French - https://alphacephei.com/vosk/models/vosk-model-fr-0.6-linto-2.2.0-compile.zip
Other language packs are available on request. Please contact us at [contact@alphacephei.com](mailto:contact@alphacephei.com)
Unpack it and point KALDI_ROOT in the path.sh script at your Kaldi installation.
Add your extra texts into db/extra.txt
Optionally, add manual word pronunciations to db/extra.dic
Run compile-graph.sh. The update takes about 15 minutes. Watch for errors in the process.
Run decode.sh to confirm that decoding works. Watch the WER in the decoding folder.
Optionally, check that the g2p properly predicted the phonemes at the end of data/dict/lexicon.txt. If needed, update the g2p model with the new words.
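One way to watch for errors in the compile step, as the instructions above suggest, is to keep a log and grep it for the failure patterns seen in this thread. This helper is a sketch (the grep patterns are just the ones quoted in this issue, not an exhaustive list):

```shell
# run_and_check: run a command, keep its output in compile.log, and scan the
# log for known failure patterns. Usage: run_and_check ./compile-graph.sh
run_and_check() {
  "$@" 2>&1 | tee compile.log
  if grep -qiE "error|command not found|expected .* to exist" compile.log; then
    echo "problems found in compile.log:" >&2
    grep -niE "error|command not found|expected .* to exist" compile.log >&2
    return 1
  fi
}
```

Then `run_and_check ./compile-graph.sh`, followed by `./decode.sh` and a look at the WER and at the tail of data/dict/lexicon.txt.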
Outputs
Depending on your needs, you might pick some result files from the compilation folder. Remember that if you changed the graph, you also need to change the rescoring/RNNLM part; otherwise they will go out of sync and accuracy will be low.
For a large model, pick the following parts:
exp/chain/tdnn/graph
data/lang_test_rescore/G.fst and data/lang_test_rescore/G.carpa into the rescore folder
exp/rnnlm_out into the rnnlm folder; you can delete some unnecessary files from rnnlm too.
If you don't want to use the RNNLM, delete the rnnlm folder from the model.
If you don't want to use rescoring, delete the rescore folder from the model; that will save some runtime memory, but accuracy will be lower.
For a small model, just pick the required files from exp/chain/tdnn/lgraph.
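For the large-model case, the file list above can be sketched as a small packaging function. The destination layout (graph/, rescore/, rnnlm/ inside the model directory) mirrors the published Vosk models; the source paths are the ones named above, and the function assumes you run it from the compile package root:

```shell
# package_large: copy the large-model compilation outputs listed above
# into a fresh model directory. Assumes the compile-package layout exists.
package_large() {
  dest=$1
  mkdir -p "$dest/rescore"
  cp -r exp/chain/tdnn/graph "$dest/graph"
  cp data/lang_test_rescore/G.fst data/lang_test_rescore/G.carpa "$dest/rescore/"
  cp -r exp/rnnlm_out "$dest/rnnlm"   # drop this line (and rnnlm/) to skip RNNLM
}
```

For example, `package_large my-updated-model`; then delete rescore/ or rnnlm/ from the result if you don't need them, as noted above.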
Hello, when I run ./compile-graph.sh it ends as below. Is this normal?
```
./compile-graph.sh: line 15: ngram-count: command not found
./compile-graph.sh: line 16: ngram: command not found
utils/prepare_lang.sh data/dict [unk] data/lang_local data/lang
Checking data/dict/silence_phones.txt ...
--> reading data/dict/silence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/dict/silence_phones.txt is OK
Checking data/dict/optional_silence.txt ...
--> reading data/dict/optional_silence.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/dict/optional_silence.txt is OK
Checking data/dict/nonsilence_phones.txt ...
--> reading data/dict/nonsilence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/dict/nonsilence_phones.txt is OK
Checking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.
Checking data/dict/lexicon.txt
--> reading data/dict/lexicon.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/dict/lexicon.txt is OK
Checking data/dict/extra_questions.txt ...
--> data/dict/extra_questions.txt is empty (this is OK)
--> SUCCESS [validating dictionary directory data/dict]
**Creating data/dict/lexiconp.txt from data/dict/lexicon.txt
fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int
prepare_lang.sh: validating output directory
utils/validate_lang.pl data/lang
Checking existence of separator file
separator file data/lang/subword_separator.txt is empty or does not exist, deal in word case.
Checking data/lang/phones.txt ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/phones.txt is OK
Checking words.txt: #0 ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/words.txt is OK
Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
--> silence.txt and nonsilence.txt are disjoint
--> silence.txt and disambig.txt are disjoint
--> disambig.txt and nonsilence.txt are disjoint
--> disjoint property is OK
Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
--> found no unexplainable phones in phones.txt
Checking data/lang/phones/context_indep.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 10 entry/entries in data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.{txt, int, csl} are OK
Checking data/lang/phones/nonsilence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 116 entry/entries in data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.{txt, int, csl} are OK
Checking data/lang/phones/silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 10 entry/entries in data/lang/phones/silence.txt
--> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.{txt, int, csl} are OK
Checking data/lang/phones/optional_silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.{txt, int, csl} are OK
Checking data/lang/phones/disambig.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 7 entry/entries in data/lang/phones/disambig.txt
--> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.{txt, int, csl} are OK
Checking data/lang/phones/roots.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 31 entry/entries in data/lang/phones/roots.txt
--> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
--> data/lang/phones/roots.{txt, int} are OK
Checking data/lang/phones/sets.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 31 entry/entries in data/lang/phones/sets.txt
--> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
--> data/lang/phones/sets.{txt, int} are OK
Checking data/lang/phones/extra_questions.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 9 entry/entries in data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.{txt, int} are OK
Checking data/lang/phones/word_boundary.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 126 entry/entries in data/lang/phones/word_boundary.txt
--> data/lang/phones/word_boundary.int corresponds to data/lang/phones/word_boundary.txt
--> data/lang/phones/word_boundary.{txt, int} are OK
Checking optional_silence.txt ...
--> reading data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.txt is OK
Checking disambiguation symbols: #0 and #1
--> data/lang/phones/disambig.txt has "#0" and "#1"
--> data/lang/phones/disambig.txt is OK
Checking topo ...
Checking word_boundary.txt: silence.txt, nonsilence.txt, disambig.txt ...
--> data/lang/phones/word_boundary.txt doesn't include disambiguation symbols
--> data/lang/phones/word_boundary.txt is the union of nonsilence.txt and silence.txt
--> data/lang/phones/word_boundary.txt is OK
Checking word-level disambiguation symbols...
--> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
Checking word_boundary.int and disambig.int
--> generating a 85 word/subword sequence
--> resulting phone sequence from L.fst corresponds to the word sequence
--> L.fst is OK
--> generating a 10 word/subword sequence
--> resulting phone sequence from L_disambig.fst corresponds to the word sequence
--> L_disambig.fst is OK
Checking data/lang/oov.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/oov.txt
--> data/lang/oov.int corresponds to data/lang/oov.txt
--> data/lang/oov.{txt, int} are OK
--> data/lang/L.fst is olabel sorted
--> data/lang/L_disambig.fst is olabel sorted
--> SUCCESS [validating lang directory data/lang]
utils/mkgraph_lookahead.sh : compiling grammar data/tr-mix.lm.gz
utils/mkgraph_lookahead.sh : expected data/tr-mix.lm.gz to exist
```
Hi, I have downloaded the model compilation package from https://alphacephei.com/vosk/models/vosk-model-en-us-0.22-compile.zip, and while running compile-graph.sh to compile the graph in Kaldi, I got an error like the one above. Kindly suggest where to find the en-mix-small.lm.gz file.