This pull request is the result of some of the points debated at today's Stichting Openspraaktechnologie meeting.
It contains:
- Various edits and improvements in all of the configuration scripts.
- A generic way to download contributed models and contributed decode scripts. Contributions are stored in the `contrib/` directory and are kept separate from the generic core of kaldi NL.
- Integration of the work done at CLST, Radboud University Nijmegen (by @schemreier) as one of those optional contributed models. This pull request effectively ends our need to keep a separate (private) fork, as we have done until now.
- Various fixes and improvements.
- Better dependency checks.
- Improved documentation that reflects the new situation (`README.md`):
  - with an explicit licensing paragraph
  - with information on how to contribute your own models (`CONTRIBUTING.md`)
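
As a rough illustration of what a dependency check in a configuration script can look like, here is a minimal sketch. The function name `check_dependencies` and the commands passed to it are hypothetical and not taken from the actual scripts in this pull request; the sketch only relies on the standard `command -v` shell builtin.

```shell
# Hypothetical helper, not taken from the kaldi NL scripts.
# Reports every missing command on stderr and returns non-zero
# if at least one of the given commands cannot be found.
check_dependencies() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "Missing dependency: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A configure script could call this early on with whatever tools it needs, for example `check_dependencies sh ls || exit 1`, so that all missing tools are reported in one go rather than failing on the first one.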
More technical details:
- Moved the model download logic out of `configure_basic.sh` into `configure_download.sh` (which was empty).
- Added some extra sanity checks and hardening to the `decode.sh` template.
- `path.sh` no longer forces a `KALDI_ROOT`. It now always reuses the one from the environment if available, and calls the appropriate host-specific scripts or a `path.custom.sh` if it can't find kaldi yet. This allows for more host-independent setups and for integration into other deployment/distribution mechanisms such as LaMachine.
- Note: some of the earlier commits were superseded by later ones. I could have squashed or rebased them, but I kept the history as is, so it may be a bit verbose. It is best to judge the end result rather than the individual commits.
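
The `KALDI_ROOT` handling described for `path.sh` above could be sketched roughly as follows. This is not the actual content of `path.sh`: the function name `resolve_kaldi_root`, the host-specific file naming (`path.<hostname>.sh`), and the exact lookup order are assumptions made for illustration. Only `KALDI_ROOT` and `path.custom.sh` come from the description itself.

```shell
# Hypothetical sketch of the lookup order; not the actual path.sh.
# $1: directory that may contain override scripts (assumed layout).
resolve_kaldi_root() {
    conf_dir="$1"

    # 1. Reuse KALDI_ROOT from the environment if it is already set.
    if [ -n "${KALDI_ROOT:-}" ]; then
        export KALDI_ROOT
        return 0
    fi

    # 2. Otherwise try a host-specific script (file name is an assumption).
    host_script="$conf_dir/path.$(uname -n).sh"
    if [ -f "$host_script" ]; then
        . "$host_script"
    # 3. Fall back to a user-provided path.custom.sh.
    elif [ -f "$conf_dir/path.custom.sh" ]; then
        . "$conf_dir/path.custom.sh"
    fi

    # 4. Fail loudly if none of the above set KALDI_ROOT.
    if [ -z "${KALDI_ROOT:-}" ]; then
        echo "KALDI_ROOT is not set; export it or create path.custom.sh" >&2
        return 1
    fi
    export KALDI_ROOT
}
```

The point of this ordering is that an environment that already provides `KALDI_ROOT` (for instance a distribution that ships kaldi) is never overridden, while a bare checkout can still be pointed at a local kaldi build through a small `path.custom.sh` containing a single `KALDI_ROOT=...` assignment.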
There are no functional changes to the actual decoding scripts or ASR configurations (those are outside my area of expertise anyway).
I hope @roelandordelman and @henkvdheuvel, or Louis ten Bosch, can review some of these changes and check whether they are in line with what we discussed. Input from @laurensw75 would of course be highly appreciated, as it's primarily his work in the first place!