alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

Properly escape lang for regex #1242

Open tobias-silva opened 1 year ago

tobias-silva commented 1 year ago

I'm trying to load a language model from a downloaded file (provided by Vosk at https://alphacephei.com/vosk/models/vosk-model-small-pt-0.3.zip). I've already downloaded it and unzipped it into a local folder: C:\Users\user.here\PycharmProjects\ASR_Custom\models\pt-small

I then tried to load it with:

path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "models", "pt-small")
model = Model(lang=path)

I'm receiving this error:

Traceback (most recent call last):
  File "C:\Users\user.here\PycharmProjects\ASR_Custom\main.py", line 14, in <module>
    model = Model(lang="C:\\Users\\user.here\.cache\\vosk\\vosk-model-small-pt-0.3")
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user.here\PycharmProjects\ASR_Custom\venv\Lib\site-packages\vosk\__init__.py", line 54, in __init__
    model_path = self.get_model_path(model_name, lang)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user.here\PycharmProjects\ASR_Custom\venv\Lib\site-packages\vosk\__init__.py", line 67, in get_model_path
    model_path = self.get_model_by_lang(lang)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user.here\PycharmProjects\ASR_Custom\venv\Lib\site-packages\vosk\__init__.py", line 94, in get_model_by_lang
    model_file = [model for model in model_file_list if
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user.here\PycharmProjects\ASR_Custom\venv\Lib\site-packages\vosk\__init__.py", line 95, in <listcomp>
    match(r"vosk-model(-small)?-{}".format(lang), model)]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files (x86)\Python311-32\Lib\re\__init__.py", line 166, in match
    return _compile(pattern, flags).match(string)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files (x86)\Python311-32\Lib\re\__init__.py", line 294, in _compile
    p = _compiler.compile(pattern, flags)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files (x86)\Python311-32\Lib\re\_compiler.py", line 743, in compile
    p = _parser.parse(p, flags)
        ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files (x86)\Python311-32\Lib\re\_parser.py", line 980, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files (x86)\Python311-32\Lib\re\_parser.py", line 455, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files (x86)\Python311-32\Lib\re\_parser.py", line 539, in _parse
    code = _escape(source, this, state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files (x86)\Python311-32\Lib\re\_parser.py", line 393, in _escape
    raise source.error("incomplete escape %s" % escape, len(escape))
re.error: incomplete escape \U at position 22
Exception ignored in: <function Model.__del__ at 0x03568208>
Traceback (most recent call last):
  File "C:\Users\user.here\PycharmProjects\ASR_Custom\venv\Lib\site-packages\vosk\__init__.py", line 60, in __del__
    _c.vosk_model_free(self._handle)
                       ^^^^^^^^^^^^
AttributeError: 'Model' object has no attribute '_handle'

Is this a bug, or am I doing something wrong?

tobias-silva commented 1 year ago

I forgot to mention: the vosk version is 0.3.44, installed from pip, on Python 3.11.

nshmyrev commented 1 year ago

You are trying to load the model by language with lang=, which should be just "pt". If you want to load by path, use model_path=; to load by name, use model_name=.
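To make the three keyword arguments concrete, here is a minimal sketch. The helper names `model_kwargs` and `load_model` are hypothetical and not part of vosk, and the rule "names start with vosk-model" is only an assumption based on the model names seen in this thread. Only the `model_path`, `model_name` and `lang` keywords come from the discussion above.

```python
import os


def model_kwargs(spec):
    """Choose the Model keyword that fits `spec` (hypothetical helper)."""
    if os.path.isdir(spec):
        # An existing directory is treated as an unpacked model folder.
        return {"model_path": spec}
    if spec.startswith("vosk-model"):
        # Assumption: model names look like "vosk-model-small-pt-0.3".
        return {"model_name": spec}
    # Anything else is treated as a language code such as "pt".
    return {"lang": spec}


def load_model(spec):
    """Build a vosk Model from a path, a model name, or a language code."""
    from vosk import Model  # imported lazily so the helper above works without vosk

    return Model(**model_kwargs(spec))
```

For the snippet in the original report, the direct fix would be to pass the folder as `Model(model_path=path)` instead of `Model(lang=path)`.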

tobias-silva commented 1 year ago

Thank you very much, sorry for the silly mistake.

nshmyrev commented 1 year ago

We need to fix the regex so that the error message is more meaningful.
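One possible shape for that fix, as a sketch only: escape `lang` with `re.escape` before formatting it into the pattern shown in the traceback. The function name `find_models_by_lang` and the sample list are illustrative assumptions, not the actual vosk code or the eventual patch.

```python
import re


def find_models_by_lang(model_file_list, lang):
    """Return the model names that match a language code.

    `lang` is escaped first, so backslashes or dots in the value are
    matched literally instead of being parsed as regex syntax.
    """
    pattern = r"vosk-model(-small)?-{}".format(re.escape(lang))
    return [model for model in model_file_list if re.match(pattern, model)]


models = ["vosk-model-small-pt-0.3", "vosk-model-en-us-0.22"]
windows_path = r"C:\Users\user.here\.cache\vosk\vosk-model-small-pt-0.3"

# Without escaping, the "\U" in the path is read as a regex escape and fails.
try:
    re.match(r"vosk-model(-small)?-{}".format(windows_path), models[0])
    unescaped_raises = False
except re.error:
    unescaped_raises = True

# With escaping, the same input simply matches nothing.
by_lang = find_models_by_lang(models, "pt")
by_path = find_models_by_lang(models, windows_path)
```

With the escaped pattern, a path passed as `lang` yields an empty result, which the caller could turn into a clear "no model found for this language" message rather than a `re.error`.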