daanzu / kaldi-active-grammar

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time
GNU Affero General Public License v3.0

Parse lextool response and handle errors. #63

Closed: bluecamel closed this 2 years ago

bluecamel commented 2 years ago

All evening, lextool has been returning errors, but it wasn't initially clear what the problem was because the HTTP status code was still 200. This PR parses the response body and raises an error when lextool reports a failure. I'm not up to date on Python 3, so this may be overkill or not current best practice; I'm happy to change it if there's a better way.
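For readers who want to see the general idea, here is a minimal sketch of detecting a failure that arrives with a 200 status by inspecting the response body. It is not the code from this PR: the `LextoolResponseParser` class, the `check_lextool_response` helper, the `ERROR_MARKER` phrase, and the sample HTML are all assumptions made for illustration, based only on the error text visible in the log output in this thread.

```python
from html.parser import HTMLParser


class LextoolError(Exception):
    """Raised when the response body reports a failure despite HTTP 200."""


class LextoolResponseParser(HTMLParser):
    """Hypothetical parser: collects the non-empty text chunks of the page."""

    def __init__(self):
        super().__init__()
        self.text = []

    def handle_data(self, data):
        data = data.strip()
        if data:
            self.text.append(data)


# Assumption: a failure page contains this phrase (taken from the log output).
ERROR_MARKER = "could not be constructed"


def check_lextool_response(body):
    """Return the page's text chunks, or raise LextoolError on a failure page."""
    parser = LextoolResponseParser()
    parser.feed(body)
    parser.close()
    if any(ERROR_MARKER in chunk for chunk in parser.text):
        raise LextoolError('  '.join(parser.text))
    return parser.text


# Assumed shape of a failure page; the real markup may differ.
sample = (
    "<html><body>"
    "<p>You dictionary could not be constructed because could not upload word file</p>"
    "<p>Please try again.</p>"
    "</body></html>"
)

try:
    check_lextool_response(sample)
except LextoolError as exc:
    print("lextool failed:", exc)
```

Matching on a fixed phrase is brittle if the service changes its wording, so a real implementation would probably also verify that the expected success content (for example, a link to the generated dictionary) is present.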

bluecamel commented 2 years ago

Output with this PR:

INFO:engine:Loading grammar g1
DEBUG:kaldi.compiler:KaldiCompiler(): Compiling grammar g1.
DEBUG:kaldi.compiler:KaldiCompiler(): Compiling rule grammar manager grammar activator rule [EXPORTED].
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): www.speech.cs.cmu.edu:80
DEBUG:urllib3.connectionpool:http://www.speech.cs.cmu.edu:80 "POST /cgi-bin/tools/logios/lextool.pl HTTP/1.1" 200 321
ERROR:kaldi.model:generate_pronunciations exception accessing www.speech.cs.cmu.edu
Traceback (most recent call last):
  File "/home/bluecamel/.local/lib/python3.8/site-packages/kaldi_active_grammar/model.py", line 183, in generate_pronunciations
    raise LextoolError('  '.join(response_parser.error_text))
kaldi_active_grammar.model.LextoolError: You dictionary could not be constructed because could not upload word file  Please try again.
ERROR:kaldi.compiler:KaldiCompiler(): exception automatically adding word 'css'
Traceback (most recent call last):
  File "/home/bluecamel/.local/lib/python3.8/site-packages/dragonfly/engines/backend_kaldi/compiler.py", line 118, in handle_oov_word
    pronunciations = self.add_word(word, lazy_compilation=True)
  File "/home/bluecamel/.local/lib/python3.8/site-packages/kaldi_active_grammar/compiler.py", line 333, in add_word
    pronunciations = self.model.add_word(word, phones=phones, lazy_compilation=lazy_compilation)
  File "/home/bluecamel/.local/lib/python3.8/site-packages/kaldi_active_grammar/model.py", line 315, in add_word
    pronunciations = Lexicon.generate_pronunciations(word)
  File "/home/bluecamel/.local/lib/python3.8/site-packages/kaldi_active_grammar/model.py", line 205, in generate_pronunciations
    raise KaldiError("cannot generate word pronunciation")
kaldi_active_grammar.KaldiError: cannot generate word pronunciation
WARNING:kaldi.compiler:KaldiCompiler(): Word 'css' not in lexicon (will NOT be recognized; see documentation about user lexicon and auto_add_to_user_lexicon)
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): www.speech.cs.cmu.edu:80
DEBUG:urllib3.connectionpool:http://www.speech.cs.cmu.edu:80 "POST /cgi-bin/tools/logios/lextool.pl HTTP/1.1" 200 321
ERROR:kaldi.model:generate_pronunciations exception accessing www.speech.cs.cmu.edu
Traceback (most recent call last):
  File "/home/bluecamel/.local/lib/python3.8/site-packages/kaldi_active_grammar/model.py", line 183, in generate_pronunciations
    raise LextoolError('  '.join(response_parser.error_text))
kaldi_active_grammar.model.LextoolError: You dictionary could not be constructed because could not upload word file  Please try again.
ERROR:kaldi.compiler:KaldiCompiler(): exception automatically adding word 'css'
Traceback (most recent call last):
  File "/home/bluecamel/.local/lib/python3.8/site-packages/dragonfly/engines/backend_kaldi/compiler.py", line 118, in handle_oov_word
    pronunciations = self.add_word(word, lazy_compilation=True)
  File "/home/bluecamel/.local/lib/python3.8/site-packages/kaldi_active_grammar/compiler.py", line 333, in add_word
    pronunciations = self.model.add_word(word, phones=phones, lazy_compilation=lazy_compilation)
  File "/home/bluecamel/.local/lib/python3.8/site-packages/kaldi_active_grammar/model.py", line 315, in add_word
    pronunciations = Lexicon.generate_pronunciations(word)
  File "/home/bluecamel/.local/lib/python3.8/site-packages/kaldi_active_grammar/model.py", line 205, in generate_pronunciations
    raise KaldiError("cannot generate word pronunciation")
kaldi_active_grammar.KaldiError: cannot generate word pronunciation
WARNING:kaldi.compiler:KaldiCompiler(): Word 'css' not in lexicon (will NOT be recognized; see documentation about user lexicon and auto_add_to_user_lexicon)

daanzu commented 2 years ago

I think this should now be resolved with the reworked pronunciation generation.