rhasspy / wyoming-vosk

Wyoming protocol server for the vosk speech to text system
MIT License

vosk add-on does not start if a file with sentence templates is added. #6

Closed mitrokun closed 7 months ago

mitrokun commented 7 months ago

HAOS 2024.2.5

The add-on works without templates; only the language setting has been changed.

But if I add ru.yaml, the add-on fails at startup with an error that repeats in a loop.

sentences:
  - включи[ть] {lght}
  - выключи[ть] {lght}
  - уровень диоксида углерода
  - (включи|установи|запусти)[ть] {preset} плейлист
  - (включи|установи|запусти)[ть] плейлист [номер] {preset}
lists:
  lght:
    values:
      - фонарь
      - квадрат
      - торшер
      - зеркало
      - дневной свет
      - луч
  preset:
    values:
      - in: (один|первый)
        out: 1
      - in: (два|второй)
        out: 2
      - in: (три|третий)
        out: 3
      - in: (четыре|четвертый)
        out: 4
      - in: (пять|пятый)
        out: 5
      - in: (шесть|шестой)
        out: 6
      - in: (семь|седьмой)
        out: 7
      - in: (восемь|восьмой)
        out: 8
      - in: (девять|девятый)
        out: 9
      - in: (десять|десятый)
        out: 10
LOG (VoskAPI:ReadDataFiles():model.cc:213) Decoding params beam=10 max-active=3000 lattice-beam=2
LOG (VoskAPI:ReadDataFiles():model.cc:216) Silence phones 1:2:3:4:5:6:7:8:9:10
LOG (VoskAPI:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 0 orphan nodes.
LOG (VoskAPI:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 0 orphan components.
LOG (VoskAPI:ReadDataFiles():model.cc:248) Loading i-vector extractor from /data/vosk-model-small-ru-0.22/ivector/final.ie
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (VoskAPI:ReadDataFiles():model.cc:282) Loading HCL and G from /data/vosk-model-small-ru-0.22/graph/HCLr.fst /data/vosk-model-small-ru-0.22/graph/Gr.fst
LOG (VoskAPI:ReadDataFiles():model.cc:308) Loading winfo /data/vosk-model-small-ru-0.22/graph/phones/word_boundary.int
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/usr/local/lib/python3.11/dist-packages/wyoming_vosk/__main__.py", line 403, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/wyoming_vosk/__main__.py", line 226, in main
    load_sentences_for_language(args.sentences_dir, language, args.database_dir)
  File "/usr/local/lib/python3.11/dist-packages/wyoming_vosk/sentences.py", line 61, in load_sentences_for_language
    sentences_yaml = yaml.safe_load(sentences_file)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/yaml/__init__.py", line 125, in safe_load
    return load(stream, SafeLoader)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/yaml/__init__.py", line 79, in load
    loader = Loader(stream)
             ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/yaml/loader.py", line 34, in __init__
    Reader.__init__(self, stream)
  File "/usr/local/lib/python3.11/dist-packages/yaml/reader.py", line 85, in __init__
    self.determine_encoding()
  File "/usr/local/lib/python3.11/dist-packages/yaml/reader.py", line 124, in determine_encoding
    self.update_raw()
  File "/usr/local/lib/python3.11/dist-packages/yaml/reader.py", line 178, in update_raw
    data = self.stream.read(size)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 15: invalid continuation byte
[22:15:33] INFO: Service exited with code 1 (by signal 0)

What could be the cause?

synesthesiam commented 7 months ago

It looks like the file encoding isn't quite right. The add-on expects the file to be UTF-8 encoded, but Windows will almost always get this wrong and use a different encoding. You may need an editor like Notepad++ to save the file correctly.
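
The traceback is consistent with this diagnosis. As a minimal sketch, assuming the file was saved as Windows-1251 (the usual "ANSI" code page on Russian-language Windows, which is a guess and not something the log states) and with LF line endings, the first Cyrillic letter of the posted template lands at byte position 15 and encodes to 0xe2, which a UTF-8 decoder rejects. The helper name `first_utf8_error` is made up for this illustration.

```python
# Start of the ru.yaml posted above, assuming LF line endings.
TEXT = "sentences:\n  - включи[ть] {lght}\n"

# Assumption: the editor saved the file as Windows-1251 (cp1251).
RAW = TEXT.encode("cp1251")


def first_utf8_error(data):
    """Return (position, byte, reason) for the first UTF-8 decoding error, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start], err.reason
    return None


position, byte, reason = first_utf8_error(RAW)
print(position, hex(byte), reason)
```

Under those assumptions the reported position, byte, and reason match the last line of the traceback. The same text encoded as UTF-8 decodes without error.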

mitrokun commented 7 months ago

Thanks for solving the problem. Indeed, Windows Notepad uses ANSI encoding by default, but you can also choose UTF-8 when saving. Now everything starts without any problems.
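
For anyone who would rather convert an existing file than re-save it from an editor, the following is a rough sketch of one way to do it. The function name `ensure_utf8` and the `cp1251` fallback are assumptions made for this example: the fallback must match whatever encoding the file was actually saved in, and the script is not part of wyoming-vosk.

```python
from pathlib import Path


def ensure_utf8(path, fallback_encoding="cp1251"):
    """Rewrite the file as UTF-8 if it is not already valid UTF-8.

    Returns True if the file was converted, False if it was left untouched.
    fallback_encoding is an assumption about how the file was originally saved.
    """
    file_path = Path(path)
    raw = file_path.read_bytes()

    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        pass

    # Reinterpret the bytes using the assumed legacy encoding, then save as UTF-8.
    text = raw.decode(fallback_encoding)
    file_path.write_bytes(text.encode("utf-8"))
    return True
```

Calling `ensure_utf8("ru.yaml")` on a copy of the sentences file (keep a backup first) should leave a file that `yaml.safe_load` can read. A wrong fallback encoding will not raise an error in most cases but will produce garbled text, so it is worth opening the result to check it.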