daanzu / kaldi-active-grammar

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time
GNU Affero General Public License v3.0

Import vocabulary #9

Open kendonB opened 4 years ago

kendonB commented 4 years ago

Would be neat to be able to import vocabularies from DNS. When exported, they are of the format:

<text_to_print>\\<spoken phrase>
<text_to_print>

If the spoken phrase isn't specified, the text and the spoken phrase are assumed to be the same.

Here's a snippet from my real one:

Guo & Costello\\Guo and Costello
Hoel & Sterner\\whole and sterner
Hsiang\\Solomon Chung last name
htop
iffae
insolation\\word insulation
kia ora\\kyah order
kia ora koutou
kmbell56@gmail.com\\my gmail
knitr\\knitter
Kupe
Landcare Research
LaTeX
Lebesque
LINZ
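
A minimal sketch of how an export in this format could be read in Python. It assumes one entry per line and a separator of two literal backslashes, as shown in the snippet above. The name `parse_vocabulary` is hypothetical and is not part of kaldi-active-grammar or Dragon.

```python
from typing import Iterable, List, Tuple

# Assumed separator: two literal backslashes, as they appear in the export above.
SEPARATOR = r"\\"


def parse_vocabulary(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (written, spoken) pairs from exported vocabulary lines."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first separator only; any later ones stay in the spoken part.
        written, _sep, spoken = line.partition(SEPARATOR)
        # With no separator (or an empty spoken part), fall back to the written text.
        entries.append((written, spoken or written))
    return entries
```

Calling `parse_vocabulary([r"knitr\\knitter", "htop"])` would give one pair whose spoken form is `knitter` and one whose spoken form is the written text `htop`. This only reads the file. Getting the entries into a recognizer's lexicon or language model is a separate step.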

Is this feasible in Kaldi?

daanzu commented 4 years ago

That should be trivial. I will add it soon. Thanks for the suggestion!

daanzu commented 4 years ago

Upon reflection, there are some complexities for some of these, with multi-word phrases.

Also, I forgot to mention, adding them to the language model for dictation is significantly complex. How important is that for your usage?

kendonB commented 4 years ago

> adding them to the language model for dictation is significantly complex

I'm not sure what this means. How do you use these in the vocabulary without adding them to the language model?

daanzu commented 4 years ago

> > adding them to the language model for dictation is significantly complex
>
> I'm not sure what this means. How do you use these in the vocabulary without adding them to the language model?

You can use them in commands. The dictation language model is only used in dictation elements.

kendonB commented 4 years ago

Being able to add vocabulary to the language model is a really useful feature of DPI/DNS. The workflow is super simple as well:

  1. "Open Vocabulary Editor"
  2. "Add" then fill in the actual word as well as the pronunciation.

daanzu commented 4 years ago

Sorry about the delay in implementing this. Getting it integrated with the dictation language model is nontrivial. I've had a hacky version working for a while but am still working on cleaning it up.

daanzu commented 3 years ago

An example of a hacky partial hopefully-temporary solution to the problem can be seen here: https://github.com/daanzu/kaldi-grammar-simple/blob/master/_dictation.py

daanzu commented 3 years ago

Experimental implementation: https://github.com/dictation-toolbox/dragonfly/pull/284

I could easily write an importer from Dragon's file format, if desired.

kendonB commented 3 years ago

Awesome! A Dragon file format importer would certainly help adoption of Kaldi.

Thinking forward, would the intention be to replace the Dictation element in dragonfly? So rather than calling

from dragonfly.engines.backend_kaldi.dictation import UserDictation as Dictation

we would call

from dragonfly import Dictation

?