synesthesiam / voice2json

Command-line tools for speech and intent recognition on Linux
MIT License

training for German Kaldi fails #4

Closed by johanneskropf 4 years ago

johanneskropf commented 4 years ago

When I run voice2json train-profile, I get this error:

-- grammars
-- grammar_dependencies:Wetter_dependencies
-- grammar_fsts:Wetter_fst
-- intent_fst
-- language_model:intent_counts
-- language_model:intent_model
-- language_model:intent_arpa
-- vocab
.  vocab_dict
TaskError - taskid:vocab_dict
PythonAction Error
Traceback (most recent call last):
  File "doit/action.py", line 424, in execute
  File "voice2json/train/__init__.py", line 446, in do_dict
  File "voice2json/train/vocab_dict.py", line 49, in make_dict
  File "voice2json/utils.py", line 157, in read_dict
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 439: ordinal not in range(128)

########################################
vocab_dict <stdout>:

It's on a Raspberry Pi 4, and sentences.ini contains only this:

[Wetter]
wie ist das wetter
wie wird das wetter
wie (ist | wird) das wetter (heute | morgen)

This only happens with the German Kaldi profile. The English profile and the German Pocketsphinx Python one work fine. Any ideas what I could do to solve this?

Johannes
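
For readers who hit the same traceback: byte 0xc3 is the first byte of the UTF-8 encoding of characters such as ä, ö, and ü, so the error suggests that a UTF-8 pronunciation dictionary was opened with the ASCII codec. On Python 3.6, open() without an explicit encoding falls back to the locale's preferred encoding, which is ASCII under a C/POSIX locale. The sketch below reproduces that failure and shows an explicit-encoding read. The read_dict function here is a hypothetical stand-in, not the actual voice2json/utils.py code, and the dictionary entries and phonemes are made up for illustration.

```python
import os
import tempfile


def read_dict(path, encoding="utf-8"):
    """Hypothetical reader (not voice2json's): word -> list of pronunciations."""
    entries = {}
    # Passing encoding explicitly avoids depending on the system locale.
    with open(path, "r", encoding=encoding) as dict_file:
        for line in dict_file:
            line = line.strip()
            if not line:
                continue
            word, *phonemes = line.split()
            entries.setdefault(word, []).append(" ".join(phonemes))
    return entries


# Write a tiny UTF-8 dictionary with one non-ASCII word (illustrative phonemes).
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", encoding="utf-8", delete=False
) as sample:
    sample.write("wetter v E t 6\nmögen m 2: g @ n\n")
    sample_path = sample.name

# Reading as ASCII mimics what happens under a C/POSIX locale on Python 3.6.
try:
    read_dict(sample_path, encoding="ascii")
    ascii_error = None
except UnicodeDecodeError as err:
    ascii_error = err

# Reading as UTF-8 succeeds regardless of the locale.
entries = read_dict(sample_path)
os.remove(sample_path)

print("ASCII read failed:", ascii_error is not None)
print("UTF-8 read words:", sorted(entries))
```

Whether this was the actual cause in this thread is not confirmed. If the reading code cannot be changed, running the command under a UTF-8 locale is a commonly suggested workaround for this class of error.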

johanneskropf commented 4 years ago

Formatting issue on my side for some reason.