ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

Decoding Repetitions #2446

Open ineiti opened 4 weeks ago

ineiti commented 4 weeks ago

Since #612 is closed, I'm opening a new issue because I ran into the same problem. I'm using the latest main branch as of 2024-10-02 with the large-v3 model. I experimented with the parameters and found the following:

I had some success adjusting the maximum context (-mc). Sometimes -mc 50 worked OK-ish and -mc 100 gave better quality, but -mc 1000 often fails on my audio.
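For anyone who wants to repeat this comparison, here is a minimal sketch that builds one command line per -mc value. The binary path `./main`, the model path `models/ggml-large-v3.bin`, the file name `conference.wav`, and the helper names are assumptions for illustration and are not taken from my setup; only the -m, -f and -mc flags are the ones whisper.cpp actually provides.

```python
import shlex
from typing import List, Sequence


def build_command(
    audio: str,
    max_context: int,
    binary: str = "./main",                     # assumed path to the whisper.cpp CLI
    model: str = "models/ggml-large-v3.bin",    # assumed model location
) -> List[str]:
    """Return the argument list for one transcription run with a given -mc value."""
    return [binary, "-m", model, "-f", audio, "-mc", str(max_context)]


def sweep(audio: str, contexts: Sequence[int] = (0, 50, 100, 1000)) -> List[List[str]]:
    """Build one command per max-context value to compare."""
    return [build_command(audio, mc) for mc in contexts]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be inspected first.
    for cmd in sweep("conference.wav"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run` as-is once the paths match the local checkout.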

What has mostly worked on all my files so far (4h of conference recordings with speakers and round-tables, all in English) is:

-mc 50 -bs 6 -bo 6 -tp 0.2

The quality is definitely lower than with -mc 0, but at least it doesn't have that many repetitions. Some were still there :(
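To put a number on "how many repetitions" when comparing settings, a rough sketch like the following could count consecutive duplicate lines in a transcript. This is a simple heuristic of my own and not something whisper.cpp provides; the function names and the normalization (lowercasing, dropping punctuation) are assumptions, and it only catches exact line-level repeats.

```python
import re
from typing import List, Optional


def normalize(line: str) -> str:
    """Lowercase the line and drop punctuation so near-identical lines compare equal."""
    return re.sub(r"[^\w\s]", "", line).lower().strip()


def count_repetitions(lines: List[str]) -> int:
    """Count lines that match the previous non-empty line after normalization."""
    repeats = 0
    previous: Optional[str] = None
    for line in lines:
        current = normalize(line)
        if not current:
            continue  # skip blank lines so they do not break up a run of repeats
        if current == previous:
            repeats += 1
        previous = current
    return repeats


if __name__ == "__main__":
    sample = ["Thank you.", "thank you", "Thank you!", "Next speaker.", "", "Next speaker."]
    print(count_repetitions(sample))  # prints 3
```

Running it on the text output of each parameter combination gives a crude count to compare, though it says nothing about transcription quality itself.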