alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

remove vocabulary from the model #1553

Closed: baptxste closed 2 months ago

baptxste commented 2 months ago

Hi, in order to improve the model on a specific task, I have updated the vocabulary and the language model, but I still get mispredictions on a few words. For instance, I often get 'udr' instead of 'edr'. In my case I will never need the word 'udr', so I was wondering whether it is possible to delete it from the dictionary to avoid this kind of misprediction. Do I simply need to remove it from fr.dic and fr.vocab and then recompile the graph, or is there another way to do this?

nshmyrev commented 2 months ago

If you remove words from vocab and dictionary and recompile the package, it should work, yes.
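For anyone doing this by script rather than by hand, here is a minimal sketch of the removal step only. It assumes that every entry in fr.dic starts with the word followed by whitespace and its pronunciation, and that fr.vocab holds one word per line; neither format is confirmed in this thread, so check your files first. The helper names `drop_words` and `filter_file` are hypothetical and not part of Vosk. Pronunciation variants written with a suffix on the word are not handled, and the graph still has to be recompiled afterwards with whatever procedure you already use.

```python
from pathlib import Path


def drop_words(lines, banned):
    """Return the lines whose first whitespace-separated token is not in `banned`."""
    kept = []
    for line in lines:
        tokens = line.split()
        # Assumption: the head token of each entry is the word itself.
        if tokens and tokens[0] in banned:
            continue
        kept.append(line)  # blank lines and all other entries are kept as-is
    return kept


def filter_file(path, banned, encoding="utf-8"):
    """Rewrite `path` in place without the banned entries; return the number of lines removed."""
    path = Path(path)
    lines = path.read_text(encoding=encoding).splitlines(keepends=True)
    kept = drop_words(lines, banned)
    path.write_text("".join(kept), encoding=encoding)
    return len(lines) - len(kept)


if __name__ == "__main__":
    banned = {"udr"}
    # File names taken from the question; adjust the paths to your package layout.
    for name in ("fr.dic", "fr.vocab"):
        if Path(name).exists():
            print(name, filter_file(name, banned), "lines removed")
```

Matching on the whole first token rather than on a substring means longer words that merely begin with the same letters are left untouched. Back up both files before running it, since they are rewritten in place.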