Open kendonB opened 4 years ago
That should be trivial. I will add it soon. Thanks for the suggestion!
Upon reflection, some of these are more complicated than I thought, particularly the multi-word phrases.
Also, I forgot to mention, adding them to the language model for dictation is significantly complex. How important is that for your usage?
> adding them to the language model for dictation is significantly complex
I'm not sure what this means. How do you use these in the vocabulary without adding them to the language model?
You can use them in commands. The dictation language model is only used in dictation elements.
Being able to add vocabulary to the language model is a really useful feature of DPI/DNS. The workflow is super simple as well:
Sorry about the delay in implementing this. Getting it integrated with the dictation language model is nontrivial. I've had a hacky version working for a while but am still working on cleaning it up.
An example of a hacky partial hopefully-temporary solution to the problem can be seen here: https://github.com/daanzu/kaldi-grammar-simple/blob/master/_dictation.py
Experimental implementation: https://github.com/dictation-toolbox/dragonfly/pull/284
I could easily write an importer from Dragon's file format, if desired.
Awesome! A Dragon file format importer would certainly help adoption of Kaldi.
Thinking forward, would the intention be to replace the `Dictation` element in dragonfly? So rather than calling `from dragonfly.engines.backend_kaldi.dictation import UserDictation as Dictation`, we would call `from dragonfly import Dictation`?
Would be neat to be able to import vocabularies from DNS. When exported, they are of the format:
If the spoken phrase isn't specified, it is implied that the text and the spoken phrase are the same.
Here's a snippet from my real one:
Is this feasible in Kaldi?
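For what it's worth, the parsing side of such an importer could look roughly like the sketch below. The exact export format isn't reproduced in this thread, so the tab separator, the function name `parse_vocabulary`, and the sample entries are all placeholders of mine rather than anything taken from Dragon or Kaldi. The only rule it relies on is the one stated above: an entry without a spoken phrase is spoken as written.

```python
from typing import Dict, Iterable


def parse_vocabulary(lines: Iterable[str], separator: str = "\t") -> Dict[str, str]:
    """Map each written form to its spoken form.

    `separator` is a hypothetical delimiter between the written text and the
    spoken phrase; adjust it to whatever the real export file uses.
    """
    vocabulary: Dict[str, str] = {}
    for raw_line in lines:
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        written, found, spoken = line.partition(separator)
        written = written.strip()
        spoken = spoken.strip()
        if not written:
            continue  # ignore entries with no written form
        # No spoken phrase given: the text is spoken exactly as written.
        vocabulary[written] = spoken if found and spoken else written
    return vocabulary


# Made-up entries, purely for illustration.
sample = ["kaldi", "dplyr\tdee plier", "", "   "]
vocab = parse_vocabulary(sample)
```

Getting the resulting pairs into the engine (and into the dictation language model in particular) is the hard part discussed earlier in the thread, and is not covered by this sketch.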