Open kendonB opened 4 years ago
Try vad_padding_end_ms in kaldi_module_loader_plus.py. I myself have set it to 200ms up from 100 to avoid this problem.
Yeah, adjusting vad_padding_end_ms will help. Perhaps I was overaggressive in setting the default to 150. Also, you can try adjusting vad_aggressiveness from the default 3 down to 2 or 1, which should make it less likely to cut off quiet sounds. I should probably have a section for "parameters you will likely want to adjust for preference" like this.
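To illustrate why a longer end padding helps, here is a toy sketch of end-of-utterance detection. It is not the engine's actual VAD code: the function name split_utterances, the 30 ms frame size, and the frame pattern are all assumptions made up for the example. It only shows the general idea that an utterance is closed once trailing silence reaches the padding value.

```python
def split_utterances(frames, frame_ms=30, padding_end_ms=150):
    """Toy endpointing model (hypothetical, not the real engine logic).

    frames: sequence of booleans, True where speech was detected.
    Returns a list of (first_frame, last_frame) index pairs, one per utterance.
    """
    utterances = []
    current = []      # indices of speech frames in the utterance being built
    silence_ms = 0    # trailing silence since the last speech frame
    for i, is_speech in enumerate(frames):
        if is_speech:
            current.append(i)
            silence_ms = 0
        elif current:
            silence_ms += frame_ms
            if silence_ms >= padding_end_ms:
                # Enough silence: close the utterance.
                utterances.append((current[0], current[-1]))
                current = []
                silence_ms = 0
    if current:
        utterances.append((current[0], current[-1]))
    return utterances


# Assumed pattern: "go line six", a 180 ms hesitation, then "teen".
frames = [True] * 10 + [False] * 6 + [True] * 4

print(split_utterances(frames, padding_end_ms=150))  # split into two utterances
print(split_utterances(frames, padding_end_ms=200))  # kept as one utterance
```

In this toy model, the 180 ms hesitation exceeds a 150 ms padding, so the phrase is split in two, while a 200 ms padding keeps it together. The trade-off is that every command then waits a little longer before it is recognized.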
@daanzu where do these parameters sit in the kaldi_model folder? Or are those settings fixed in the releases?
They are engine parameters for dragonfly, so they are set in the get_engine() call, which should be in your loader. So something like this: get_engine('kaldi', vad_padding_end_ms=200). How are you running things?
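A minimal sketch of what that could look like in a loader, collecting the options in one place before calling get_engine(). The helper names build_kaldi_options and load_kaldi_engine are hypothetical, the chosen values are only examples, and the 0 to 3 range check on vad_aggressiveness is an assumption based on the values mentioned in this thread.

```python
def build_kaldi_options(model_dir="kaldi_model",
                        vad_padding_end_ms=200,
                        vad_aggressiveness=2):
    """Return keyword arguments for get_engine('kaldi', ...) (hypothetical helper)."""
    if vad_padding_end_ms < 0:
        raise ValueError("vad_padding_end_ms must be non-negative")
    # Assumed valid range; the thread mentions a default of 3 and suggests 2 or 1.
    if vad_aggressiveness not in (0, 1, 2, 3):
        raise ValueError("vad_aggressiveness is expected to be 0, 1, 2 or 3")
    return {
        "model_dir": model_dir,
        "vad_padding_end_ms": vad_padding_end_ms,
        "vad_aggressiveness": vad_aggressiveness,
    }


def load_kaldi_engine(options):
    """Create the engine; needs dragonfly installed with its Kaldi backend."""
    from dragonfly import get_engine  # imported here so the helper above works standalone
    return get_engine("kaldi", **options)


options = build_kaldi_options(vad_padding_end_ms=200)
print(options)
# engine = load_kaldi_engine(options)  # uncomment inside a real loader
```

Keeping the values in a single dictionary makes it easier to experiment with different padding and aggressiveness settings without touching the rest of the loader.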
@daanzu I think the calls in Caster are in https://github.com/dictation-toolbox/Caster/blob/master/castervoice/lib/ctrl/mgr/engine_manager.py and https://github.com/dictation-toolbox/Caster/blob/master/castervoice/lib/ctrl/configure_engine.py
@LexiconCode do you know how we might expose this option in Caster?
Yep, it's something I've been trying to figure out how best to implement.
https://github.com/dictation-toolbox/dragonfly/pull/302 may help until Caster has a good way to pass engine parameters.
Just modify the bat file for now:
python -m dragonfly load _*.py --engine kaldi --no-recobs-messages --engine-options "model_dir=kaldi_model, vad_padding_end_ms=300"
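If you would rather generate that command line than edit it by hand, a small sketch like the following could build the --engine-options string from a dictionary. The function names format_engine_options and build_command are hypothetical; the flags and the "key=value, key=value" format are copied from the command above, not from any other documentation.

```python
def format_engine_options(options):
    """Join options as 'key=value, key=value', matching the command in this thread."""
    return ", ".join("{}={}".format(key, value) for key, value in options.items())


def build_command(options, pattern="_*.py"):
    """Build the argument list for the dragonfly CLI call (hypothetical helper)."""
    return [
        "python", "-m", "dragonfly", "load", pattern,
        "--engine", "kaldi",
        "--no-recobs-messages",
        "--engine-options", format_engine_options(options),
    ]


options = {"model_dir": "kaldi_model", "vad_padding_end_ms": 300}
print(format_engine_options(options))
print(build_command(options))
```

The sketch only builds and prints the arguments; it does not start dragonfly itself.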
List of kaldi engine parameters for configuration.
I know that this is a very good problem to have! Several times (at least since the last update) I have found Kaldi will interpret my speech well before I'm done uttering a phrase. For example, in Caster we can say "go line sixteen" to navigate to line sixteen.
Sometimes the system will interpret this as "go line six" and then interpret "teen" as something completely different, as Kaldi almost always tries to interpret any real speech. For me, "teen" gets interpreted as "doon" (my word for page down), for example.
Is there a setting we can tweak to prevent this from happening?