The acoustic model format has diverged somewhat from CMU Sphinx, and is expected to diverge further. Supporting multiple model formats is not consistent with the goal of making the smallest possible library, so a converter is required in order to use publicly available models. Currently this means:
Convert text to binary model definition
Convert mixture_weights to sendump
Rename text files to include ".txt" extension
Convert feat_params to JSON
Include default dictionary
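As a rough illustration of the feat_params step, here is a minimal Python sketch. It assumes feat_params is a plain-text file of whitespace-separated `-name value` pairs, as in CMU Sphinx models. The function name `feat_params_to_json` and the output layout (leading dash stripped, all values kept as strings) are hypothetical choices for this example, not the format the converter actually has to emit.

```python
import json


def feat_params_to_json(text: str) -> str:
    """Turn "-name value" pairs into a JSON object string.

    Assumptions (not taken from the issue): the leading "-" is dropped
    from each name, and values are left as strings rather than being
    coerced to numbers or booleans.
    """
    tokens = text.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected name/value pairs, got an odd token count")
    params = {}
    # Tokens alternate: name, value, name, value, ...
    for name, value in zip(tokens[0::2], tokens[1::2]):
        if not name.startswith("-"):
            raise ValueError(f"expected an option name, got {name!r}")
        params[name[1:]] = value
    return json.dumps(params, indent=2, sort_keys=True)


if __name__ == "__main__":
    sample = "-lowerf 130\n-upperf 6800\n-cmn current\n"
    print(feat_params_to_json(sample))
```

Keeping values as strings avoids guessing at types; a real converter would likely need a schema to decide which parameters are numeric.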
In the future it may mean (but this is not in the scope of this issue):
Dictionary is an FST and may be a G2P model
Model definition is also an FST (i.e. the "HC" in "HCLG")