Open RomanLeo2003 opened 6 months ago
And silence... same problem here.
Have you managed to understand the cause of the problem? Thank you.
Hi! Yes, I've searched for similar issues in many other repositories and found out that it's just a bug in Whisper. Whisper sometimes "forgets" to add punctuation and capitalization, and we can "remind" it by passing a prompt that itself contains punctuation and capitalization.
However, using a prompt can make the final result less stable (hallucinations and other issues), so I decided against this approach. You may still want to try it and experiment!
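For anyone who wants to experiment with the prompt workaround, here is a minimal sketch. It assumes the `openai-whisper` Python package, whose `transcribe` method accepts an `initial_prompt` argument; the library used in this thread may expose it differently. The helper name `transcribe_with_prompt` and the prompt text are made up for illustration, and as noted above a prompt may introduce hallucinations.

```python
# Hypothetical prompt text: any short, well-punctuated sentence in the
# audio's language should work as a "reminder" (this wording is an assumption).
PUNCTUATED_PROMPT = (
    "Hello. This is a transcript with proper punctuation and capitalization."
)


def transcribe_with_prompt(model, audio_path, prompt=PUNCTUATED_PROMPT):
    """Transcribe audio_path, passing a punctuated prompt to the model.

    `model` is assumed to behave like an openai-whisper model: its
    transcribe() accepts `initial_prompt` and returns a dict with a
    "text" key.
    """
    result = model.transcribe(audio_path, initial_prompt=prompt)
    return result["text"].strip()


# Usage sketch (requires openai-whisper and a real audio file):
#   import whisper
#   model = whisper.load_model("medium")
#   print(transcribe_with_prompt(model, "audio.wav"))
```

Comparing the output with and without the prompt on the same file is probably the easiest way to see whether the trade-off is acceptable for your content.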
After transcribing several audio files using the medium model, I have noticed that the transcriptions lack capitalization and punctuation. For example:
Transcribed text with punctuation and capitalization: "Produces, for example, a Renault headlight. They say, yes, yes, we produce it."
Transcribed text without punctuation and capitalization: "produces for example a renault headlight they say yes yes we produce it"
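To find where in a file the output degrades like the second example, a rough heuristic can flag the affected segments. This is only a sketch: the function names are hypothetical, and it assumes segments are dicts with `start`, `end`, and `text` keys, which is the shape `openai-whisper` returns in `result["segments"]`.

```python
def looks_unpunctuated(text):
    """Heuristic: True when text has no uppercase letters and no
    common sentence punctuation. Empty text is not flagged."""
    stripped = text.strip()
    if not stripped:
        return False
    has_upper = any(ch.isupper() for ch in stripped)
    has_punct = any(ch in ".,!?;:" for ch in stripped)
    return not has_upper and not has_punct


def flag_segments(segments):
    """Return (start, end, text) for every segment that looks unpunctuated.

    `segments` is assumed to be a list of dicts with "start", "end"
    and "text" keys.
    """
    return [
        (seg["start"], seg["end"], seg["text"])
        for seg in segments
        if looks_unpunctuated(seg["text"])
    ]
```

The timestamps of the flagged segments show when the problem starts and when it goes away again. Very short segments (a single lowercase word, for instance) will also be flagged, so treat the result as a hint rather than a verdict.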
I suspect that this issue might be due to some accumulated cache in the model (or something similar). This problem seems to occur with certain types of content, but I am not sure. BTW, sometimes the problem fixes itself after a few minutes of audio. Therefore, my questions are:
Why does this happen? How can I fix it?
I use this configuration of parameters: