Open mrnemosaa opened 3 days ago
OK, this is a Whisper hallucination. Can you upload a sample audio file that reproduces it so I can test with some settings?
No, I can't. The audio does have a lot of conversation, but it has more "Ah" and "Oh". Thank you for your answer, though. I will look for a way to reduce the hallucination.
@mrnemosaa The most helpful approach would be to use VAD and BGM separation. Most hallucinations are caused by noise and background music in your audio.
Just turning on the BGM separation filter alone will help a lot.
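For anyone who wants to try the VAD part outside the UI, here is a minimal sketch. It assumes the faster-whisper package as the backend, which this thread does not confirm; the helper names `build_transcribe_options` and `transcribe`, the "small" model size, and the 500 ms silence threshold are only illustrative. It also sets `condition_on_previous_text=False`, a commonly suggested mitigation for repetition loops that is an addition here, not something tested in this thread. BGM separation is a separate preprocessing step and is not shown.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_transcribe_options(language: Optional[str] = "de",
                             min_silence_ms: int = 500) -> Dict[str, Any]:
    """Keyword arguments for faster-whisper's WhisperModel.transcribe().

    The values are illustrative assumptions, not settings from this thread.
    """
    return {
        "language": language,                 # None lets the model auto-detect
        "vad_filter": True,                   # drop non-speech before decoding
        "vad_parameters": {"min_silence_duration_ms": min_silence_ms},
        "condition_on_previous_text": False,  # do not feed prior output back in
    }


def transcribe(path: str, model_size: str = "small") -> List[Tuple[float, float, str]]:
    """Transcribe one file and return (start, end, text) tuples."""
    # Imported lazily so the options helper works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(path, **build_transcribe_options())
    return [(seg.start, seg.end, seg.text.strip()) for seg in segments]
```

Calling `transcribe("video_audio.wav")` (a hypothetical file name) would return the segments with VAD applied, which makes it easy to compare against a run without it.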
I installed Whisper AI locally and tried using the program to recognize the speech in a 1-hour-30-minute video in German, but after about 7 minutes it only outputs 'Oh.' Even though there is a lot of conversation after that point, it keeps outputting the same 'Oh' at one-second intervals. The same issue occurs with different models. Of course, there are many instances of 'Oh' in the video, but there is still a fair amount of dialogue, so it's strange that none of it is captured. Changing the model to a smaller size doesn't make any difference.
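To pin down where the output degrades into the repeated 'Oh', a small check over the transcript segments can help when comparing settings. This is only a sketch: it assumes the segments are available as (start, end, text) tuples, and the function name `find_repetition_runs` and the `min_run` threshold are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, text)


def find_repetition_runs(segments: List[Segment],
                         min_run: int = 5) -> List[Tuple[float, float, str, int]]:
    """Return (start, end, text, count) for each run of identical consecutive segments.

    Text is compared after stripping whitespace and lowercasing, so "Oh." and
    " oh. " count as the same segment. Runs shorter than min_run are ignored.
    """
    runs = []
    i = 0
    while i < len(segments):
        text = segments[i][2].strip().lower()
        j = i + 1
        # Extend the run while the following segments carry the same text.
        while j < len(segments) and segments[j][2].strip().lower() == text:
            j += 1
        if text and j - i >= min_run:
            runs.append((segments[i][0], segments[j - 1][1], text, j - i))
        i = j
    return runs
```

A long run that starts around the 7-minute mark and never ends would match the behavior described above, while short scattered runs are more likely genuine interjections in the audio.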