danpovey opened this issue 8 years ago
Pasting an email thread that is relevant to this:
The documentation says:
"Notice that we allow words with empty phonetic representations." [1]
But in validate_dict_dir.pl:
"--> ERROR: lexicon.txt contains word $word with empty pronunciation" [2]
Why does the code not allow for empty pronunciations? Is there a problem with using empty pronunciations at test-time?
Peter
[1] http://kaldi-asr.org/doc/graph_recipe_test.html
[2] https://github.com/kaldi-asr/kaldi/blob/master/egs/wsj/s5/utils/validate_dict_dir.pl#L201-L205
Daniel Povey dpovey@gmail.com Oct 23 (1 day ago)
to kaldi-help
The documentation is out of date: that statement should not be there, and the BOS and EOS symbols should not be present in the example. If words with empty prons are present in the lexicon, it becomes harder to guarantee that determinization of the decoding graph will succeed.
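For anyone who wants to check a lexicon for this before running the validation script, here is a minimal sketch in Python. It is not the logic of validate_dict_dir.pl, only an illustration that assumes the usual lexicon.txt layout of one entry per line, with the word first and its phones after it, separated by whitespace. The function name and the example entries are made up for illustration.

```python
from typing import Iterable, List


def words_with_empty_prons(lines: Iterable[str]) -> List[str]:
    """Return the words whose lexicon entry has no phones after the word.

    Assumes each line looks like "<word> <phone1> <phone2> ...".
    """
    empty = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        word, phones = fields[0], fields[1:]
        if not phones:
            empty.append(word)
    return empty


# Hypothetical lexicon entries; the second one has an empty pronunciation.
example = ["hello hh ah l ow", "<silword>", "world w er l d"]
print(words_with_empty_prons(example))
```

To run it on a real file, pass an open file handle instead of the list, for example `words_with_empty_prons(open("data/local/dict/lexicon.txt", encoding="utf-8"))`; that path is only an assumption about where the lexicon lives.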
@sih4sing5hong5 please note..
I will read them soon.
The paths in the example code are outdated. I will update them to use data/local/dict and data/lang instead of data.
I am not familiar with fst, so I need some time to understand what the commands actually do.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed by a bot strictly because of inactivity. This does not mean that we think that this issue is not important! If you believe it has been closed hastily, add a comment to the issue and mention @kkm000, and I'll gladly reopen it.
The docs at http://kaldi-asr.org/doc/graph_recipe_test.html#graph_lexicon seem to be out of date, as pointed out by Remi Francis. For example, they say "Notice that we allow words with empty phonetic representations.", and the filenames seem to correspond to an older recipe.