daanzu / kaldi-active-grammar

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time
GNU Affero General Public License v3.0

kaldi_active_grammar.KaldiError: cannot generate word pronunciation #62

Closed Timoses closed 2 years ago

Timoses commented 2 years ago

On a fresh Fedora 34 install using Dragonfly and Kaldi, I get:

WARNING:kaldi.compiler:KaldiCompiler(): Word 'brov' not in lexicon (will NOT be recognized; see documentation about user lexicon and auto_add_to_user_lexicon)
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): www.speech.cs.cmu.edu:80
DEBUG:urllib3.connectionpool:http://www.speech.cs.cmu.edu:80 "POST /cgi-bin/tools/logios/lextool.pl HTTP/1.1" 200 321
ERROR:kaldi.compiler:KaldiCompiler(): exception automatically adding word 'novakeen'
Traceback (most recent call last):
  File "/opt/code/voice/venv/lib/python3.9/site-packages/dragonfly/engines/backend_kaldi/compiler.py", line 118, in handle_oov_word
    pronunciations = self.add_word(word, lazy_compilation=True)
  File "/opt/code/voice/venv/lib64/python3.9/site-packages/kaldi_active_grammar/compiler.py", line 333, in add_word
    pronunciations = self.model.add_word(word, phones=phones, lazy_compilation=lazy_compilation)
  File "/opt/code/voice/venv/lib64/python3.9/site-packages/kaldi_active_grammar/model.py", line 269, in add_word
    pronunciations = Lexicon.generate_pronunciations(word)
  File "/opt/code/voice/venv/lib64/python3.9/site-packages/kaldi_active_grammar/model.py", line 159, in generate_pronunciations
    raise KaldiError("cannot generate word pronunciation")
kaldi_active_grammar.KaldiError: cannot generate word pronunciation

Any idea what could be wrong?

bluecamel commented 2 years ago

I've been hitting this all evening. The problem is that the CMU lextool seems to be broken and is returning errors, but with an OK HTTP status (the 200 in the log above). I made a PR that at least shows the error from lextool, but it doesn't fix the real issue.
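
To illustrate why the status code alone is not enough here, a minimal sketch of validating the response body before trusting it. This is not the code from the PR or from kaldi_active_grammar: the function name parse_pronunciations is hypothetical, and the body format (one "WORD PH1 PH2 ..." entry per line, with uppercase phone symbols and no stress digits) is an assumption made purely for illustration.

```python
def parse_pronunciations(status_code, body):
    """Parse 'WORD PH1 PH2 ...' lines (assumed format) from a response body.

    Raises RuntimeError if the status is not 200, or if the status is 200
    but the body contains no line that looks like a pronunciation entry.
    """
    if status_code != 200:
        raise RuntimeError("HTTP error %d" % status_code)
    entries = []
    for line in body.splitlines():
        parts = line.split()
        # Assumed format: a word followed by one or more uppercase phone symbols
        if len(parts) >= 2 and all(p.isalpha() and p.isupper() for p in parts[1:]):
            entries.append((parts[0], parts[1:]))
    if not entries:
        # A 200 status proves nothing by itself: surface what the service said
        raise RuntimeError("no pronunciations in response: %r" % body[:200])
    return entries
```

With a check like this, an error page served with status 200 fails loudly and shows the start of the body, instead of surfacing later as a generic "cannot generate word pronunciation".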

daanzu commented 2 years ago

Well, that's annoying. As a temporary measure, you should be able to install g2p_en==2.0.0 so that pronunciations are generated locally, but it is kind of a pain. I will look for a better solution.
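
For anyone who wants to confirm that a local generator is actually present in their environment, here is a rough sketch. It assumes only that the g2p_en package exposes its usual G2p class; the helper names local_g2p_available and local_phonemes are made up for this example and are not part of kaldi_active_grammar or Dragonfly.

```python
import importlib.util


def local_g2p_available():
    """Return True if the g2p_en package can be found in this environment."""
    return importlib.util.find_spec("g2p_en") is not None


def local_phonemes(word):
    """Return a list of phoneme strings for `word`, or None if g2p_en is missing."""
    if not local_g2p_available():
        return None
    # Imported lazily so this module still loads when g2p_en is not installed
    from g2p_en import G2p
    g2p = G2p()
    # G2p returns a list of symbols; drop any whitespace separators
    return [p for p in g2p(word) if p.strip()]
```

Calling local_g2p_available() inside the same virtualenv that runs Dragonfly shows whether the package is visible there. The phonemes come back in g2p_en's own notation, so mapping them onto the model's phone set is still the library's job.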

daanzu commented 2 years ago

This should be resolved in multiple ways with the pronunciation generation changes in the v3.0.0 release.

Timoses commented 1 year ago

Oddly, I get the same error again after a fresh install on MacOS this time (with 3.1.0 installed):

ERROR:kaldi.compiler:KaldiCompiler(): exception automatically adding word 'prev'
Traceback (most recent call last):
  File "/Users/Timoses/code/voice/venv_caster/lib/python3.10/site-packages/dragonfly/engines/backend_kaldi/compiler.py", line 119, in handle_oov_word
    pronunciations = self.add_word(word, lazy_compilation=True, allow_online_pronunciations=self.allow_online_pronunciations)
  File "/Users/Timoses/code/voice/venv_caster/lib/python3.10/site-packages/kaldi_active_grammar/compiler.py", line 334, in add_word
    pronunciations = self.model.add_word(word, phones=phones, lazy_compilation=lazy_compilation, allow_online_pronunciations=allow_online_pronunciations)
  File "/Users/Timoses/code/voice/venv_caster/lib/python3.10/site-packages/kaldi_active_grammar/model.py", line 286, in add_word
    pronunciations = Lexicon.generate_pronunciations(word, model_dir=self.model_dir, allow_online_pronunciations=allow_online_pronunciations)
  File "/Users/Timoses/code/voice/venv_caster/lib/python3.10/site-packages/kaldi_active_grammar/model.py", line 176, in generate_pronunciations
    raise KaldiError("cannot generate word pronunciation: no generators available")
kaldi_active_grammar.KaldiError: cannot generate word pronunciation: no generators available
WARNING:kaldi.compiler:KaldiCompiler(): Word 'prev' not in lexicon (will NOT be recognized; see documentation about user lexicon and auto_add_to_user_lexicon)

LexiconCode commented 1 year ago

> Oddly, I get the same error again after a fresh install on MacOS this time (with 3.1.0 installed):

Try running pip install g2p-en

Timoses commented 1 year ago

> > Oddly, I get the same error again after a fresh install on MacOS this time (with 3.1.0 installed):
>
> Try running pip install g2p-en

I did, and it works.