ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License
35.38k stars 3.61k forks

Grammar Bug: Sometimes words are only partially recognized #2496

Open LazoCoder opened 2 weeks ago

Problem

If I have a word like escape in my grammar, whisper will sometimes output only the first few letters (esc) instead of the whole word. The expected behavior is that only the entire word is recognized.

How to Reproduce (example 1)

Go into examples/command and make a simple single-line grammar: root ::= " escape". Now if you say "escape", it will sometimes print esc instead of the whole word escape. You can also say "essk", which also prints esc; the expected behavior is to print nothing, because it is an invalid command.

How to Reproduce (example 2)

Another example is to set the grammar to root ::= " caps". If you say "cap", it will print cap (without the s). The expected behavior is to print nothing, because cap is an invalid command; only caps (with the s) should be accepted.

My Setup

I'm running examples/command with my custom grammar on a Windows 10 machine via GPU/CUDA, and I get the same problem whether I use ggml-small or ggml-large-v2.

Temporary Workaround Issue

I can remove invalid words in post-processing, but the problem is that these erroneous words prematurely cut off recognition of any commands that should come after them. For example, if I say a long sequence of commands like "please escape and log out" and escape is incorrectly output as esc, then everything that comes after that command is omitted from the output.
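For reference, the post-processing step could look roughly like the sketch below. It assumes the transcript is plain text that can be split on whitespace; filter_commands and the allowed-word set are hypothetical names for illustration and are not part of whisper.cpp.

```cpp
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not a whisper.cpp API): keep only the whole words
// that appear in the set of valid command words, in their original order.
std::vector<std::string> filter_commands(const std::string & transcript,
                                         const std::unordered_set<std::string> & allowed) {
    std::vector<std::string> kept;
    std::istringstream iss(transcript);
    std::string word;
    // operator>> skips leading whitespace, so the leading space that the
    // grammar puts before each word is handled here.
    while (iss >> word) {
        if (allowed.count(word) > 0) {
            kept.push_back(word);
        }
    }
    return kept;
}
```

With the allowed words please, escape, and, log, out, a truncated transcript such as " please esc" is reduced to just please. The filter can drop the bad esc, but it cannot bring back the "and log out" part that was never output, which is why this is not a real fix.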

Notes

I noticed that user @ulatekh also experienced this problem (see https://github.com/ggerganov/whisper.cpp/pull/2127#issuecomment-2148493982 and https://github.com/ggerganov/whisper.cpp/discussions/2047#discussion-6496710 ). I created this issue in response to this comment: https://github.com/ggerganov/whisper.cpp/pull/2127#issuecomment-2154363819