Model and working directories should be separate

daanzu / kaldi-active-grammar

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

GNU Affero General Public License v3.0

KaldiAG currently writes several files into the model directory (for example file_cache.json and align_lexicon.int), even when a separate temp directory is specified.

I think this breaks the standard expectation that the model dir is a data repository rather than a working directory for KaldiAG. It might also lead to hard-to-debug issues if multiple instances of KaldiAG use the same model directory at the same time. And when KaldiAG models are installed for all users on a Linux system (e.g. using a package manager), they will likely be located under /usr/share and be read-only for unprivileged users, so any attempt to write there will fail.

The best approach IMO would be to let the user specify a "working directory" when constructing a Compiler object (the default could be the model directory, as it is now). This would enable a clean separation of immutable model data from the mutable working cache whenever the application or the installation environment requires it.
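To make the proposal concrete, here is a rough sketch of the path resolution I have in mind. It is not KaldiAG's actual code: `resolve_work_dir`, `cache_path` and the `work_dir` parameter are hypothetical names used only for illustration, and only the file name file_cache.json comes from the current behaviour described above.

```python
import os


def resolve_work_dir(model_dir, work_dir=None):
    """Hypothetical helper: choose the directory that receives mutable files."""
    # Fall back to the model directory, which matches the current behaviour.
    chosen = os.path.abspath(work_dir if work_dir is not None else model_dir)
    # Create the directory if needed; an existing directory is left untouched.
    os.makedirs(chosen, exist_ok=True)
    # Fail early with a clear message instead of failing later on first write.
    if not os.access(chosen, os.W_OK):
        raise PermissionError("working directory is not writable: " + chosen)
    return chosen


def cache_path(model_dir, work_dir=None, name="file_cache.json"):
    """Hypothetical helper: build the path of a mutable cache file."""
    return os.path.join(resolve_work_dir(model_dir, work_dir), name)
```

With something like this, a caller who passes no `work_dir` gets today's layout, while a system-wide install could keep the model read-only and point `work_dir` at a per-user cache location.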

Model and working directories should be separate #51