m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Failed to align segment (" you"): backtrack failed, resorting to original... #404

Open YueBCM opened 1 year ago

YueBCM commented 1 year ago

Hi, I just started using WhisperX on a short (~5 min) HCP movie to get word-level timestamps. I followed the installation steps on GitHub, installed it in a Python environment, and ran the following command:

whisperx $HOME/Desktop/HCPmovie.wav --compute_type int8 --align_model WAV2VEC2_ASR_LARGE_LV60K_960H --language en

And got the error:

Performing transcription...
Performing alignment...
Failed to align segment (" you"): backtrack failed, resorting to original...

Please help!!!

Thanks!

-Yue

sorgfresser commented 1 year ago

That's not an error, just a warning, so you can simply ignore it. It happens sometimes because the wav2vec2 alignment model isn't as good as Whisper; as the message says, the affected segment just falls back to its original timing. You could try a better alignment model, but it's not really something to worry about.
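
If you want to see which segments were affected, here is a minimal sketch. It assumes the aligned result is a list of segment dicts with "start", "end", "text" and a "words" list, and that a segment which fell back to the original has no word entries carrying "start"/"end". That layout is an assumption about the output, not something confirmed in this thread, so check it against your own results. The helper name and the example data are hypothetical.

```python
from typing import Any


def segments_without_word_timings(segments: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the segments that have no word with both a start and an end time."""
    fallback = []
    for seg in segments:
        # Assumed layout: "words" is a list of dicts, possibly empty or missing.
        words = seg.get("words") or []
        has_timing = any("start" in w and "end" in w for w in words)
        if not has_timing:
            fallback.append(seg)
    return fallback


# Hypothetical aligned output, made up for illustration only.
example_segments = [
    {
        "start": 0.0,
        "end": 1.2,
        "text": " Hello there.",
        "words": [
            {"word": "Hello", "start": 0.1, "end": 0.5},
            {"word": "there.", "start": 0.6, "end": 1.1},
        ],
    },
    # A segment like the one in the warning: segment-level timing only.
    {"start": 1.2, "end": 1.6, "text": " you", "words": []},
]

for seg in segments_without_word_timings(example_segments):
    print(f"{seg['start']:.2f}-{seg['end']:.2f}s {seg['text']!r}: segment-level timing only")
```

If only a handful of very short segments (like " you") show up here, the word-level timestamps for the rest of the file should be unaffected.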