daanzu / kaldi-active-grammar

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

GNU Affero General Public License v3.0

336 stars, 50 forks

Kaldi is too fast #36

Open kendonB opened 4 years ago

kendonB commented 4 years ago

I know that this is a very good problem to have! Several times (at least since the last update) I have found Kaldi will interpret my speech well before I'm done uttering a phrase. For example, in Caster we can say "go line sixteen" to navigate to line sixteen.

Sometimes the system will interpret this as "go line six" and then interpret "teen" as something completely different, since Kaldi almost always tries to interpret any real speech. For example, I get "teen" interpreted as "doon" (my word for page down).

Is there a setting we can tweak to prevent this from happening?

dwks commented 4 years ago

Try vad_padding_end_ms in kaldi_module_loader_plus.py. I have set it to 200 ms, up from 100, to avoid this problem.

daanzu commented 4 years ago

Yeah, adjusting vad_padding_end_ms will help. Perhaps I was overly aggressive in setting the default to 150. You can also try lowering vad_aggressiveness from the default of 3 to 2 or 1, which should make it less likely to cut off quiet sounds. I should probably have a section for "parameters you will likely want to adjust for preference" like this.

kendonB commented 4 years ago

@daanzu where do these parameters sit in the kaldi_model folder? Or are those settings fixed in the releases?

daanzu commented 4 years ago

They are engine parameters for dragonfly, so they are set in the get_engine() call, which should be in your loader. So something like this: get_engine('kaldi', vad_padding_end_ms=200). How are you running things?
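For anyone writing their own loader, here is a minimal sketch of how that call might look with both parameters mentioned in this thread. The names KALDI_ENGINE_OPTIONS and load_kaldi_engine are hypothetical, and the values are only the ones suggested above, not recommended defaults. It assumes dragonfly is installed with the Kaldi backend.

```python
# Hypothetical loader fragment: only get_engine() and the two vad_* parameter
# names come from this thread; everything else is illustrative.
KALDI_ENGINE_OPTIONS = {
    "vad_padding_end_ms": 200,  # wait longer after speech before ending the utterance
    "vad_aggressiveness": 2,    # below the stated default of 3, to keep quiet sounds
}


def load_kaldi_engine(options=None):
    """Create the Kaldi engine, passing the options as keyword arguments."""
    # Imported lazily so this module can be inspected without dragonfly installed.
    from dragonfly import get_engine

    if options is None:
        options = KALDI_ENGINE_OPTIONS
    return get_engine("kaldi", **options)
```

Calling load_kaldi_engine() in place of a bare get_engine('kaldi') keeps the tunable values in one place, so they are easy to adjust by preference.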

kendonB commented 4 years ago

@daanzu I think the calls in Caster are https://github.com/dictation-toolbox/Caster/blob/master/castervoice/lib/ctrl/mgr/engine_manager.py and https://github.com/dictation-toolbox/Caster/blob/master/castervoice/lib/ctrl/configure_engine.py

@LexiconCode do you know how we might expose this option in Caster?

LexiconCode commented 4 years ago

Yep, it's something I've been trying to work out how best to implement.

daanzu commented 4 years ago

https://github.com/dictation-toolbox/dragonfly/pull/302 may help until Caster has a good way to pass engine parameters.

LexiconCode commented 3 years ago

Just modify the bat file for now: python -m dragonfly load _*.py --engine kaldi --no-recobs-messages --engine-options "model_dir=kaldi_model, vad_padding_end_ms=300"
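If you tweak several options, a small helper can build that --engine-options string from a dict. This is only a sketch: format_engine_options is a hypothetical name, and the "key=value, key=value" format is simply copied from the command above, not taken from dragonfly's documentation.

```python
def format_engine_options(options):
    """Join a dict into the 'key=value, key=value' form used in the command above."""
    return ", ".join("{}={}".format(key, value) for key, value in options.items())


# Values taken from the command quoted above.
options = {"model_dir": "kaldi_model", "vad_padding_end_ms": 300}
engine_options = format_engine_options(options)

# Print the full line so it can be pasted into the bat file.
command = (
    "python -m dragonfly load _*.py --engine kaldi "
    '--no-recobs-messages --engine-options "{}"'.format(engine_options)
)
print(command)
```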

List of kaldi engine parameters for configuration.