EtienneAb3d / WhisperHallu

Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts

The "segment" times no longer match the real audio times when silence is removed #18

Open NguyenDucLamK63 opened 1 year ago

NguyenDucLamK63 commented 1 year ago

I ran the code the same way you did and the transcription results are good, but the "segment" times are wrong compared to the real times in the audio. Is there a way to run the code so that the silence is still removed, but the text segment times still match the original audio? I hope you can help. Thank you very much.

EtienneAb3d commented 1 year ago

@NguyenDucLamK63

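  1. Use the WhisperHallu options to produce both transcriptions, with and without silence/noise removal. The run without removal keeps timestamps on the original audio timeline; the run with removal gives the better text.
  2. Use WhisperTimeSync to transfer the good timestamps onto the good text: https://github.com/EtienneAb3d/WhisperTimeSync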
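
To make the idea behind step 2 concrete, here is a minimal Python sketch of transferring timestamps from one transcription onto the text of another by aligning their words. It is only a conceptual illustration and not WhisperTimeSync's actual algorithm or API. The function names (`words_with_times`, `transfer_timestamps`) and the `(start, end, text)` segment tuples are assumptions made up for this example, and spreading a segment's time evenly over its words is a simplification.

```python
from difflib import SequenceMatcher


def words_with_times(segments):
    """Hypothetical helper: spread each segment's time span evenly over its words."""
    out = []
    for start, end, text in segments:
        words = text.lower().split()
        if not words:
            continue
        step = (end - start) / len(words)
        for i, word in enumerate(words):
            out.append((word, start + i * step, start + (i + 1) * step))
    return out


def transfer_timestamps(timed_segments, clean_segments):
    """Give clean_segments the timestamps of the matching words in timed_segments.

    timed_segments: (start, end, text) from the run WITHOUT silence removal
                    (timestamps on the original audio timeline).
    clean_segments: (start, end, text) from the run WITH silence removal
                    (better text, shifted timestamps that get replaced here).
    Segments with no matching word come back with (None, None) times.
    """
    timed = words_with_times(timed_segments)
    timed_words = [word for word, _, _ in timed]

    # Flatten the clean text into words, remembering each word's segment index.
    clean_words, seg_of_word = [], []
    for idx, (_, _, text) in enumerate(clean_segments):
        for word in text.lower().split():
            clean_words.append(word)
            seg_of_word.append(idx)

    matcher = SequenceMatcher(None, clean_words, timed_words, autojunk=False)
    spans = [None] * len(clean_segments)
    # Matching blocks are returned in increasing order, so the first matched
    # word of a segment sets its start and later matches extend its end.
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            seg = seg_of_word[block.a + k]
            _, w_start, w_end = timed[block.b + k]
            if spans[seg] is None:
                spans[seg] = [w_start, w_end]
            else:
                spans[seg][1] = w_end

    result = []
    for (_, _, text), span in zip(clean_segments, spans):
        start, end = span if span else (None, None)
        result.append((start, end, text))
    return result
```

A small usage example with made-up segments:

```python
timed = [(5.0, 7.0, "hello world"), (20.0, 22.0, "good morning")]
clean = [(0.0, 2.0, "Hello world"), (2.0, 4.0, "good morning")]
print(transfer_timestamps(timed, clean))
```

For real subtitle files, WhisperTimeSync itself is the tool suggested above; this sketch only shows why having both transcriptions is enough to recover the original timing.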