Open hrddsky opened 1 year ago
You need to provide more information: the command you are running and the full output, not just the end. Also the OS type and version.
Hello Nickolay, I'm doing it in the official Kaldi Docker container "kaldiasr/kaldi", on macOS 13.2.1 on an M1 with Rosetta emulation of amd64 turned on. My steps:
add extra.dic and extra.txt to the db folder
They should be already there, I'm not sure how you add them again. Most likely you are doing something wrong.
I just replaced them with ones filled with my data. I also tried it on a Windows 10 PC in the same container, with the same result.
OK, and why is the final.mdl file missing then? It should be there.
Hello, I'm a newbie at VOSK. I'm trying to compile a small model with extra.dic and extra.txt. The process starts fine, but after a few checks it just stops. Here is the log:
bash compile-graph.sh
utils/prepare_lang.sh data/dict [unk] data/lang_local data/lang
Checking data/dict/silence_phones.txt ... --> reading data/dict/silence_phones.txt --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> data/dict/silence_phones.txt is OK
Checking data/dict/optional_silence.txt ... --> reading data/dict/optional_silence.txt --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> data/dict/optional_silence.txt is OK
Checking data/dict/nonsilence_phones.txt ... --> reading data/dict/nonsilence_phones.txt --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> data/dict/nonsilence_phones.txt is OK
Checking disjoint: silence_phones.txt, nonsilence_phones.txt --> disjoint property is OK.
Checking data/dict/lexicon.txt --> reading data/dict/lexicon.txt --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> data/dict/lexicon.txt is OK
Checking data/dict/extra_questions.txt ... --> data/dict/extra_questions.txt is empty (this is OK)
--> SUCCESS [validating dictionary directory data/dict]
Creating data/dict/lexiconp.txt from data/dict/lexicon.txt
/opt/kaldi/egs/wsj/s5/../../../src/fstbin/fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int
prepare_lang.sh: validating output directory
utils/validate_lang.pl data/lang
Checking existence of separator file
separator file data/lang/subword_separator.txt is empty or does not exist, deal in word case.
Checking data/lang/phones.txt ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> data/lang/phones.txt is OK
Checking words.txt: #0 ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> data/lang/words.txt is OK
Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ... --> silence.txt and nonsilence.txt are disjoint --> silence.txt and disambig.txt are disjoint --> disambig.txt and nonsilence.txt are disjoint --> disjoint property is OK
Checking sumation: silence.txt, nonsilence.txt, disambig.txt ... --> found no unexplainable phones in phones.txt
Checking data/lang/phones/context_indep.{txt, int, csl} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 10 entry/entries in data/lang/phones/context_indep.txt --> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt --> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt --> data/lang/phones/context_indep.{txt, int, csl} are OK
Checking data/lang/phones/nonsilence.{txt, int, csl} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 192 entry/entries in data/lang/phones/nonsilence.txt --> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt --> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt --> data/lang/phones/nonsilence.{txt, int, csl} are OK
Checking data/lang/phones/silence.{txt, int, csl} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 10 entry/entries in data/lang/phones/silence.txt --> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt --> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt --> data/lang/phones/silence.{txt, int, csl} are OK
Checking data/lang/phones/optional_silence.{txt, int, csl} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 1 entry/entries in data/lang/phones/optional_silence.txt --> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt --> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt --> data/lang/phones/optional_silence.{txt, int, csl} are OK
Checking data/lang/phones/disambig.{txt, int, csl} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 5 entry/entries in data/lang/phones/disambig.txt --> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt --> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt --> data/lang/phones/disambig.{txt, int, csl} are OK
Checking data/lang/phones/roots.{txt, int} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 50 entry/entries in data/lang/phones/roots.txt --> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt --> data/lang/phones/roots.{txt, int} are OK
Checking data/lang/phones/sets.{txt, int} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 50 entry/entries in data/lang/phones/sets.txt --> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt --> data/lang/phones/sets.{txt, int} are OK
Checking data/lang/phones/extra_questions.{txt, int} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 9 entry/entries in data/lang/phones/extra_questions.txt --> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt --> data/lang/phones/extra_questions.{txt, int} are OK
Checking data/lang/phones/word_boundary.{txt, int} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 202 entry/entries in data/lang/phones/word_boundary.txt --> data/lang/phones/word_boundary.int corresponds to data/lang/phones/word_boundary.txt --> data/lang/phones/word_boundary.{txt, int} are OK
Checking optional_silence.txt ... --> reading data/lang/phones/optional_silence.txt --> data/lang/phones/optional_silence.txt is OK
Checking disambiguation symbols: #0 and #1 --> data/lang/phones/disambig.txt has "#0" and "#1" --> data/lang/phones/disambig.txt is OK
Checking topo ...
Checking word_boundary.txt: silence.txt, nonsilence.txt, disambig.txt ... --> data/lang/phones/word_boundary.txt doesn't include disambiguation symbols --> data/lang/phones/word_boundary.txt is the union of nonsilence.txt and silence.txt --> data/lang/phones/word_boundary.txt is OK
Checking word-level disambiguation symbols... --> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
Checking word_boundary.int and disambig.int --> generating a 96 word/subword sequence --> resulting phone sequence from L.fst corresponds to the word sequence --> L.fst is OK --> generating a 13 word/subword sequence --> resulting phone sequence from L_disambig.fst corresponds to the word sequence --> L_disambig.fst is OK
Checking data/lang/oov.{txt, int} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 1 entry/entries in data/lang/oov.txt --> data/lang/oov.int corresponds to data/lang/oov.txt --> data/lang/oov.{txt, int} are OK
--> data/lang/L.fst is olabel sorted --> data/lang/L_disambig.fst is olabel sorted
--> SUCCESS [validating lang directory data/lang]
utils/mkgraph_lookahead.sh : compiling grammar data/ru-mix.lm.gz
utils/mkgraph_lookahead.sh : expected exp/tdnn/final.mdl to exist
What am I doing wrong? Thanks!
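The last line of the log says that utils/mkgraph_lookahead.sh could not find exp/tdnn/final.mdl, so a quick pre-flight check of the input files may help narrow this down. The following is only a sketch: the exp/tdnn/final.mdl and data/ru-mix.lm.gz paths are taken from the log above, while db/extra.txt and db/extra.dic are assumed locations based on the description of the db folder, and check_inputs is a hypothetical helper, not part of Kaldi or VOSK.

```shell
#!/usr/bin/env bash
# Sketch: report which expected input files are missing or empty
# before running compile-graph.sh. check_inputs is a hypothetical helper.

check_inputs() {
  # Prints one line per missing or empty path; returns 1 if any were found.
  local missing=0
  local f
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f"
      missing=1
    fi
  done
  return "$missing"
}

# Example call, run from the directory that contains compile-graph.sh.
# The db/ paths are assumptions; adjust them to the actual layout.
# check_inputs exp/tdnn/final.mdl data/ru-mix.lm.gz db/extra.txt db/extra.dic \
#   && bash compile-graph.sh
```

If exp/tdnn/final.mdl is reported as missing, the model files in the working directory are probably incomplete, and the graph compilation cannot proceed until they are restored.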