daanzu / kaldi-active-grammar

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time
GNU Affero General Public License v3.0

Model and working directories should be separate #51

Open p-e-w opened 3 years ago

p-e-w commented 3 years ago

KaldiAG currently writes several files to the model directory (such as file_cache.json and align_lexicon.int), even when a separate temp directory is specified.

I think this breaks the standard expectation that the model dir is a data repository, rather than a working directory for KaldiAG. It might also lead to hard-to-debug issues if multiple instances of KaldiAG are using the same model directory. When KaldiAG models are installed for all users on a Linux system (e.g. using a package manager), they will likely be located under /usr/share and will be read-only for unprivileged users, so any attempt to write there will fail.

The best approach IMO would be to allow the user to specify a "working directory" when constructing a Compiler object (the default could be the model directory as it is now). This will enable a clean separation of immutable model data and mutable working cache if the application or the installation environment requires it.
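To make the proposal concrete, here is a minimal sketch of how an optional working directory could fall back to the model directory. It does not use KaldiAG's actual API: the names `resolve_work_dir`, `mutable_path`, and the `work_dir` parameter are hypothetical, and only the file name file_cache.json comes from this issue.

```python
import os
from pathlib import Path
from typing import Optional, Union

PathLike = Union[str, Path]


def resolve_work_dir(model_dir: PathLike, work_dir: Optional[PathLike] = None) -> Path:
    """Return the directory that mutable files should be written to.

    Hypothetical helper: falls back to the model directory when no
    working directory is given, which mirrors the current behavior.
    """
    target = Path(work_dir) if work_dir is not None else Path(model_dir)
    # Create the working directory if needed; a no-op for an existing one.
    target.mkdir(parents=True, exist_ok=True)
    # Fail early with a clear message instead of failing later mid-compile.
    if not os.access(target, os.W_OK):
        raise PermissionError(f"working directory is not writable: {target}")
    return target


def mutable_path(model_dir: PathLike, filename: str,
                 work_dir: Optional[PathLike] = None) -> Path:
    """Build the path for a mutable file such as file_cache.json."""
    return resolve_work_dir(model_dir, work_dir) / filename
```

With a scheme like this, a system-wide install could keep the model under /usr/share and pass a per-user directory as `work_dir`, while existing callers that pass nothing would see no change.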

daanzu commented 3 years ago

Yes, the current implementation is not something I am happy with. There are three categories of files: those that are completely static for a given model, those that only need to be rebuilt when the lexicon is changed, and those that are grammar-specific. On the one hand, I don't want to make things too complicated with extra directories. On the other hand, as you said, it can be problematic to mix the various categories of files. I'm planning on reorganizing the model structure to make it better suited and to separate the categories more cleanly. It will probably work like how the latest version handles the cache, but with all of the non-static lexicon files moved into the cache directory as well.
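The three categories and the planned split could be sketched as follows. This is only an illustration of the idea described above, not KaldiAG code: `FileCategory` and `directory_for` are hypothetical names, and the assignment of any particular file to a category is left open.

```python
from enum import Enum
from pathlib import Path
from typing import Union

PathLike = Union[str, Path]


class FileCategory(Enum):
    """Hypothetical labels for the three kinds of files described above."""
    STATIC = "static"    # never changes for a given model
    LEXICON = "lexicon"  # rebuilt only when the lexicon changes
    GRAMMAR = "grammar"  # specific to a compiled grammar


def directory_for(category: FileCategory, model_dir: PathLike,
                  cache_dir: PathLike) -> Path:
    """Static files stay with the model; everything mutable goes to the cache."""
    if category is FileCategory.STATIC:
        return Path(model_dir)
    return Path(cache_dir)
```

Under this split, the model directory would only ever be read, so it could be shared between instances or installed read-only.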