EtienneAb3d / WhisperHallu

Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts
Use FFMPEG silenceremove and VAD #5

Closed by sorgfresser 1 year ago

sorgfresser commented 1 year ago

Wonderful pipeline, thanks a lot for your great work! It was a bit painful to set up in my case, but now it works like a charm. Still, I am curious about the use of silenceremove and VAD together. Shouldn't silero-vad be capable of doing silenceremove's job too? I am surely missing something and there is a good reason; I just want to know what it is.

EtienneAb3d commented 1 year ago

Hi @sorgfresser, Thanks for your positive feedback! Yes, VAD is supposed to be able to remove pure silences, but it doesn't. Using both ensures better removal of silence and noise. This is especially important since hallucinations often occur on silences. ;-)
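
For readers who want to see the two-stage idea concretely, here is a rough sketch, not the actual WhisperHallu code. It assumes `ffmpeg` is on the PATH and that silero-vad is loaded through `torch.hub`. The function names (`build_silenceremove_cmd`, `keep_speech_only`, `preprocess`), the -50 dB threshold and the 0.5 s minimum silence are illustrative assumptions, not the project's settings.

```python
import subprocess


def build_silenceremove_cmd(src, dst, threshold_db=-50, min_silence_s=0.5):
    """Build an ffmpeg command that strips leading and in-between silences.

    The threshold and duration are assumed example values, not WhisperHallu's.
    """
    flt = (
        "silenceremove="
        f"start_periods=1:start_threshold={threshold_db}dB:"
        f"stop_periods=-1:stop_duration={min_silence_s}:"
        f"stop_threshold={threshold_db}dB"
    )
    # Resample to 16 kHz mono, the format silero-vad is used with below.
    return ["ffmpeg", "-y", "-i", src, "-af", flt, "-ar", "16000", "-ac", "1", dst]


def keep_speech_only(src, dst, sampling_rate=16000):
    """Second pass: keep only the chunks silero-vad labels as speech."""
    import torch  # imported lazily so the command builder works without torch

    model, utils = torch.hub.load(repo_or_dir="snakers4/silero-vad", model="silero_vad")
    get_speech_timestamps, save_audio, read_audio, _, collect_chunks = utils
    wav = read_audio(src, sampling_rate=sampling_rate)
    timestamps = get_speech_timestamps(wav, model, sampling_rate=sampling_rate)
    if not timestamps:
        return False  # no speech detected, nothing to write
    save_audio(dst, collect_chunks(timestamps, wav), sampling_rate=sampling_rate)
    return True


def preprocess(src, tmp, dst):
    """Run silenceremove first, then VAD on its output."""
    subprocess.run(build_silenceremove_cmd(src, tmp), check=True)
    return keep_speech_only(tmp, dst)
```

The first pass removes stretches that are quiet by level; the second pass drops whatever remains that the VAD does not classify as speech, such as steady background noise above the threshold.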

sorgfresser commented 1 year ago

Thanks a lot!