elan-ev / vosk-cli

Apache License 2.0

More robust model location autodetection? #26

Open will-ca opened 1 year ago

will-ca commented 1 year ago

https://github.com/elan-ev/vosk-cli/blob/81b9f1ca27efc9ea3fa316be620b41a69edd6b4c/voskcli/transcribe.py#L249-L252

https://github.com/elan-ev/vosk-cli/blob/81b9f1ca27efc9ea3fa316be620b41a69edd6b4c/voskcli/transcribe.py#L337-L340

Searching in $XDG_DATA_DIRS would be nice (see the XDG Base Directory Specification). You don't always want to install models system-wide.

Since that variable is not always set, the search should also fall back to the default locations: /usr/share/ (as it does currently), /usr/local/share/, and $HOME/.local/share/.

In addition to exposing manually installed models in non-root locations, this would also allow automatic use of models installed via, e.g., Flatpak or Nix, which set $XDG_DATA_DIRS.
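
As a rough illustration of the lookup order suggested above, here is a minimal sketch. It is not code from vosk-cli: the function name `xdg_data_dirs` and its parameters are hypothetical. It also consults $XDG_DATA_HOME, on the assumption that this is the variable whose default is $HOME/.local/share/.

```python
import os
from pathlib import Path


def xdg_data_dirs(environ=None, home=None):
    """Return XDG data directories in priority order, without duplicates.

    Hypothetical helper: `environ` and `home` are injectable only so the
    function can be exercised without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)

    # An unset or empty variable falls back to its default value.
    data_home = environ.get("XDG_DATA_HOME") or str(home / ".local/share")
    data_dirs = environ.get("XDG_DATA_DIRS") or "/usr/local/share:/usr/share"

    ordered = []
    for entry in [data_home, *data_dirs.split(":")]:
        if not entry:
            continue  # skip empty segments such as a trailing ":"
        path = Path(entry)
        if path not in ordered:
            ordered.append(path)
    return ordered
```

The user-level directory is listed first so that a model installed without root access would take precedence over a system-wide one.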

Also, from the AUR, the share subdirectory seems to be vosk-models, not vosk/models:

$ pacman -Ql vosk-api-bin
vosk-api-bin /usr/
vosk-api-bin /usr/include/
vosk-api-bin /usr/include/vosk_api.h
vosk-api-bin /usr/lib/
vosk-api-bin /usr/lib/libvosk.so
vosk-api-bin /usr/local/
vosk-api-bin /usr/local/share/
vosk-api-bin /usr/local/share/vosk-models/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/README
vosk-api-bin /usr/local/share/vosk-models/small-en-us/am/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/am/final.mdl
vosk-api-bin /usr/local/share/vosk-models/small-en-us/conf/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/conf/mfcc.conf
vosk-api-bin /usr/local/share/vosk-models/small-en-us/conf/model.conf
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/Gr.fst
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/HCLr.fst
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/disambig_tid.int
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/phones/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/phones/word_boundary.int
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/final.dubm
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/final.ie
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/final.mat
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/global_cmvn.stats
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/online_cmvn.conf
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/splice.conf

But that may be a packaging issue in the vosk-api-bin package, as vosk-api does use vosk/models.
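
If both layouts had to be tolerated, the lookup could try each subdirectory name in turn. This is only a sketch: `find_model` and `MODEL_SUBDIRS` are hypothetical names, and the two subdirectory names are simply the ones mentioned in this issue.

```python
from pathlib import Path

# Subdirectory names seen in this issue: "vosk/models" (vosk-api package)
# and "vosk-models" (vosk-api-bin package).
MODEL_SUBDIRS = ("vosk/models", "vosk-models")


def find_model(name, data_dirs, subdirs=MODEL_SUBDIRS):
    """Return the first existing model directory called `name`, else None."""
    for base in data_dirs:
        for subdir in subdirs:
            candidate = Path(base) / subdir / name
            if candidate.is_dir():
                return candidate
    return None
```

For example, `find_model("small-en-us", ["/usr/local/share", "/usr/share"])` would return the vosk-models path from the listing above if that directory exists on the machine.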

Just a thought/convenience that might be nice to have.

will-ca commented 1 year ago
  • What is the canonical place to put models? Do the docs give a recommendation?

nshmyrev commented on Apr 29, 2022:

Like I wrote before, we need to look for models in 3 places: /usr/share/vosk, /home/user/.cache/vosk, os.env('VOSK_MODEL_PATH').

MODEL_DIRS = [os.getenv("VOSK_MODEL_PATH"), Path("/usr/share/vosk"),
        Path.home() / "AppData/Local/vosk", Path.home() / ".cache/vosk"]

https://github.com/alphacep/vosk-api/blob/4c720974788cfe3b985b2bf228899cba265afde4/python/vosk/__init__.py#L18-L19

    def get_model_by_name(self, model_name):
        for directory in MODEL_DIRS:

https://github.com/alphacep/vosk-api/blob/4c720974788cfe3b985b2bf228899cba265afde4/python/vosk/__init__.py#L72-L73

    def get_model_by_lang(self, lang):
        for directory in MODEL_DIRS:

https://github.com/alphacep/vosk-api/blob/4c720974788cfe3b985b2bf228899cba265afde4/python/vosk/__init__.py#L89-L90

…Or, actually, since it looks like class vosk.Model() already includes search path logic anyway, why does vosk-cli implement its own model_path() resolution order?
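
To make the question concrete, delegating to the library might look roughly like the sketch below. It assumes that `vosk.Model` accepts `model_path`, `model_name` and `lang` keyword arguments, as the linked revision appears to. `load_model` and its `factory` parameter are hypothetical and exist only for illustration.

```python
def load_model(path=None, name=None, lang=None, factory=None):
    """Build a model, leaving name/language lookup to the factory.

    Hypothetical wrapper: `factory` defaults to vosk.Model and is
    injectable so the dispatch logic can be tried without vosk installed.
    """
    if path is None and name is None and lang is None:
        raise ValueError("give a model path, a model name, or a language")

    if factory is None:
        # Imported lazily; assumes the vosk package is installed.
        from vosk import Model as factory

    if path is not None:
        # An explicit path bypasses any search logic.
        return factory(model_path=str(path))
    if name is not None:
        # Assumed keyword: lets the library search its own MODEL_DIRS.
        return factory(model_name=name)
    return factory(lang=lang)
```

With something like this, vosk-cli would only need to handle an explicit path itself, and the search order would stay in one place.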