alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

grammar in Kaldi #1534

Open YangangCao opened 3 months ago

YangangCao commented 3 months ago

Hi, dear author. Setting a grammar in Vosk is very useful, so I copied UpdateGrammarFst() into Kaldi and tested it with an open-source chain model (http://kaldi-asr.org/models/m13), but the results are bad. I set faster_decodeopts.beam = 1000 and also tried other decoders and models, with equally bad results: background noise is recognized as words, and some extra words appear. For example:

The speaker said "magnets can be found on a can opener".

start end word
WORDS: 0.02 0.03
WORDS: 0.12 0.24 can
WORDS: 0.24 0.52
WORDS: 1.4 2.06 magnets
WORDS: 2.06 2.33 can
WORDS: 2.33 2.45 be
WORDS: 2.45 2.5 a
WORDS: 2.5 2.93 found
WORDS: 3 3.19 on
WORDS: 3.19 3.26 a
WORDS: 3.26 3.64 can
WORDS: 3.64 4.2 opener
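To make the "extra words" concrete, here is a small sketch that aligns the decoder output against the spoken sentence and lists the rows that look like insertions. It uses only the Python standard library; the function name `inserted_words` and the tuple layout are made up for illustration and are not part of Kaldi or Vosk.

```python
from difflib import SequenceMatcher

# What the speaker actually said.
REFERENCE = "magnets can be found on a can opener".split()

# (start, end, word) rows copied from the output in the post;
# "" marks a row whose word label is empty.
HYPOTHESIS = [
    (0.02, 0.03, ""),
    (0.12, 0.24, "can"),
    (0.24, 0.52, ""),
    (1.40, 2.06, "magnets"),
    (2.06, 2.33, "can"),
    (2.33, 2.45, "be"),
    (2.45, 2.50, "a"),
    (2.50, 2.93, "found"),
    (3.00, 3.19, "on"),
    (3.19, 3.26, "a"),
    (3.26, 3.64, "can"),
    (3.64, 4.20, "opener"),
]


def inserted_words(reference, rows):
    """Return the (start, end, word) rows that align as pure insertions."""
    spoken = [row for row in rows if row[2]]  # ignore empty labels
    words = [word for _, _, word in spoken]
    matcher = SequenceMatcher(a=reference, b=words, autojunk=False)
    extra = []
    for tag, _, _, j1, j2 in matcher.get_opcodes():
        # "insert" means the hypothesis has words with no counterpart
        # in the reference at this position.
        if tag == "insert":
            extra.extend(spoken[j1:j2])
    return extra


if __name__ == "__main__":
    for start, end, word in inserted_words(REFERENCE, HYPOTHESIS):
        print(f"extra word {word!r} at {start:.2f}-{end:.2f}s")
```

On this output the two flagged rows are the early "can" (0.12 to 0.24) and the "a" between "be" and "found" (2.45 to 2.5). Substitutions are not reported by this sketch, only insertions.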

But I get the correct result in Vosk. Is there a way to make the model less sensitive?

Or maybe Kaldi already supports setting a grammar? Could you please give me some tips? Thanks!
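For comparison, this is roughly what the Vosk side of the setup looks like from Python: a minimal sketch, assuming the `vosk` package is installed and a model with a dynamic graph is available (Vosk grammars need one). The paths `test.wav` and `model` and the helper names `grammar_json` and `recognize` are placeholders for illustration; the input is assumed to be a mono 16-bit PCM WAV file.

```python
import json
import wave


def grammar_json(phrases):
    """Build the JSON list of phrases that KaldiRecognizer takes as a grammar.

    "[unk]" is appended so that out-of-grammar speech has somewhere to go.
    """
    return json.dumps(list(phrases) + ["[unk]"])


def recognize(wav_path, model_dir, phrases):
    """Decode a WAV file with a phrase grammar and return the final result."""
    # Imported here so grammar_json() can be used without vosk installed.
    from vosk import Model, KaldiRecognizer

    with wave.open(wav_path, "rb") as wf:
        model = Model(model_dir)
        rec = KaldiRecognizer(model, wf.getframerate(), grammar_json(phrases))
        rec.SetWords(True)  # include per-word start/end times in the result
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            rec.AcceptWaveform(data)
        return json.loads(rec.FinalResult())


if __name__ == "__main__":
    # Placeholder paths; replace with a real recording and model directory.
    result = recognize("test.wav", "model", ["magnets can be found on a can opener"])
    print(result.get("text"))
```

The grammar is passed as the third argument when the recognizer is constructed, so the restriction to the listed phrases is applied by Vosk itself rather than by post-processing the output.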

nshmyrev commented 3 months ago

Feels like you have the wrong self-loop-scale and probably the wrong acoustic weight.

YangangCao commented 3 months ago

I think I found the solution: the problem was that I wasn't using lookahead. #1509