Open kerolos opened 8 months ago
Do all those words ("one zero one two three oh", "four five six", "seven eight nine zero", "[unk]") need to be present in the lexicon HCL.fst (words.txt)?
Yes
Should we create a phone sequence to enable the recognizer to handle these unknown words?
Only as a separate compilation step.
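Since every word in the grammar has to be in the model vocabulary, it can help to check the phrases against words.txt before passing them to the recognizer. The following is a minimal sketch, not part of the Vosk API: the helper names `load_vocabulary` and `missing_words` are made up for illustration, and it assumes words.txt uses the usual Kaldi layout of one "word id" pair per line. Where words.txt lives depends on the model package, so the path is left to the caller.

```python
from pathlib import Path


def load_vocabulary(words_txt):
    """Read a Kaldi-style words.txt (assumed: one 'word id' pair per line)
    and return the set of words it lists."""
    vocab = set()
    for line in Path(words_txt).read_text(encoding="utf-8").splitlines():
        parts = line.split()
        if parts:  # skip blank lines
            vocab.add(parts[0])
    return vocab


def missing_words(phrases, vocab):
    """Return the words used in `phrases` that are not in `vocab`,
    in the order they are first seen."""
    missing = []
    for phrase in phrases:
        for word in phrase.split():
            if word not in vocab and word not in missing:
                missing.append(word)
    return missing
```

Any word reported by `missing_words` would need to be added through the separate compilation step mentioned above, or dropped from the grammar.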
Is it possible to feed this list of sequence sentences or words from a file rather than hard-coding them as a list?
No, you have to pass them as a JSON string.
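The recognizer only accepts the JSON string, but the calling code can still build that string from a file. A minimal sketch, assuming a plain text file with one phrase per line; the helper name `grammar_from_file` and the file layout are illustrative choices, not something Vosk provides:

```python
import json
from pathlib import Path


def grammar_from_file(path, add_unk=True):
    """Build a JSON grammar string from a text file with one phrase per line.

    Blank lines are skipped and surrounding whitespace is stripped.
    If add_unk is true, "[unk]" is appended when it is not already listed.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    phrases = [line.strip() for line in lines if line.strip()]
    if add_unk and "[unk]" not in phrases:
        phrases.append("[unk]")
    # ensure_ascii=False keeps non-ASCII words as plain UTF-8 in the string
    return json.dumps(phrases, ensure_ascii=False)
```

The result can then be used in place of the hard-coded literal, for example `rec.SetGrammar(grammar_from_file("phrases.txt"))`, where `phrases.txt` is a hypothetical file name.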
How many phrases can be added in real time without significantly affecting the overall recognition accuracy (roughly what is the optimal number)?
Usually several hundred. For a bigger vocabulary update it is better to recompile the LM as described at https://alphacephei.com/vosk/lm
Thanks for your quick response :) A few more questions:
Is there any way to add new words on the fly during recognition (instead of creating a new graph)?
If not, is there any way to implement that?
Is there any way to add new words on the fly during recognition (instead of creating a new graph)?
No
If not, is there any way to implement that?
No easy way in Vosk, unfortunately. Otherwise, there are publications like https://arxiv.org/abs/2003.09024
Issue Title:
Questions about Adding Phrases in the Dynamic Graph Update
Issue Description:
The Vosk API examples show how to add phrases through the dynamic graph update with the following code snippets:
Python
rec.SetGrammar('["one zero one two three oh", "four five six", "seven eight nine zero", "[unk]"]')
or:
rec = KaldiRecognizer(model, wf.getframerate(), '["one zero one two three oh", "four five six", "seven eight nine zero", "[unk]"]')
C++
void Recognizer::UpdateGrammarFst(char const *grammar)
My first question is: Do all those words ("one zero one two three oh", "four five six", "seven eight nine zero", "[unk]") need to be present in the lexicon HCL.fst (words.txt)? Should we create a phone sequence to enable the recognizer to handle these unknown words?
Is it possible to feed this list of sequence sentences or words from a file rather than hard-coding them as a list?
My second question is: How many phrases can be added in real time without significantly affecting the overall recognition accuracy (roughly what is the optimal number)?