Reduce the number of recognizable words

alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Apache License 2.0

Reduce the number of recognizable words #1451

Open laiyuzhi opened 8 months ago

laiyuzhi commented 8 months ago

Dear author, I have a small project that only needs to recognize a limited number of words. Is there any way to change the Vosk word list? I'm hoping to reduce the vocabulary so that several similar-sounding words produce the same output word (for example, the pronunciations of both "poor" and "pour" would produce the text "pour").

nshmyrev commented 8 months ago

You can modify the language model (LM) as described in https://alphacephei.com/vosk/lm

You can also pass a grammar to the recognizer constructor, as in test_words.py
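A minimal sketch of the grammar approach, assuming the Python bindings and a model that supports runtime vocabulary changes (to my understanding, the small models). The helper names `build_grammar` and `make_recognizer`, the 16000 Hz default and the model path are illustrative assumptions, not part of Vosk. The grammar is a JSON list of phrases passed as the third argument to `KaldiRecognizer`; `"[unk]"` lets the recognizer report out-of-grammar speech as unknown instead of forcing it onto a listed word.

```python
import json


def build_grammar(words, allow_unknown=True):
    """Return the JSON phrase list that restricts recognition to `words`.

    Hypothetical helper: it only builds the string that is handed to
    KaldiRecognizer as its grammar argument.
    """
    phrases = [" ".join(words)]
    if allow_unknown:
        # "[unk]" absorbs speech that matches none of the listed words.
        phrases.append("[unk]")
    return json.dumps(phrases)


def make_recognizer(model_path, words, sample_rate=16000):
    """Create a recognizer limited to `words` (hypothetical helper).

    vosk is imported lazily so build_grammar can be used without it.
    Assumes every word is already in the model's vocabulary.
    """
    from vosk import Model, KaldiRecognizer

    model = Model(model_path)
    return KaldiRecognizer(model, sample_rate, build_grammar(words))
```

Listing only "pour" and leaving out "poor" should make audio for either word more likely to come back as "pour", since "poor" is no longer a candidate, although it may also be reported as `[unk]`. For example, `make_recognizer("model", ["pour", "stop", "start"])` would build such a recognizer, assuming a model directory named `model`.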

laiyuzhi commented 8 months ago

Thanks for your reply. Is there any way to make the model output only a single word at a time? For example:

Before:
one
one two
one two three
text: one two three

Now:
text: one
text: two
text: three

nshmyrev commented 8 months ago

It's easier to post-process the output than to force the recognizer to recognize something other than what is being said.
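One way to do that post-processing is sketched below, assuming the JSON strings returned by `PartialResult()` (key `"partial"`) and `Result()` (key `"text"`) in the Python bindings. `WordEmitter` is a hypothetical helper, not part of Vosk. It remembers how many words of the current utterance it has already reported and returns only the new ones. It also assumes partial hypotheses only grow; if the recognizer revises an earlier word, that revision is not reported.

```python
import json


class WordEmitter:
    """Turn growing recognizer hypotheses into one-word-at-a-time output."""

    def __init__(self):
        self._emitted = 0  # words already reported for the current utterance

    def feed_partial(self, partial_json):
        """Return words that are new since the last call."""
        words = json.loads(partial_json).get("partial", "").split()
        new_words = words[self._emitted:]
        self._emitted = max(self._emitted, len(words))
        return new_words

    def feed_final(self, result_json):
        """Return any words not yet reported, then reset for the next utterance."""
        words = json.loads(result_json).get("text", "").split()
        new_words = words[self._emitted:]
        self._emitted = 0
        return new_words


if __name__ == "__main__":
    emitter = WordEmitter()
    for partial in ('{"partial": "one"}',
                    '{"partial": "one two"}',
                    '{"partial": "one two three"}'):
        for word in emitter.feed_partial(partial):
            print("text:", word)
    for word in emitter.feed_final('{"text": "one two three"}'):
        print("text:", word)
```

In a real loop, `feed_partial` would be called with `rec.PartialResult()` while `rec.AcceptWaveform(data)` returns false, and `feed_final` with `rec.Result()` once it returns true.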