m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License

Whisperx error : --highlight_words requires word_timestamps True #267

Open akashmamun610 opened 1 year ago

akashmamun610 commented 1 year ago

WhisperX cannot use highlight_words during translate. It says it requires word_timestamps True. (Screenshot_20230520-191809)

When I pass word_timestamps True (Screenshot_20230520-192239), it does not work at all. It says: whisperx: error: unrecognized arguments: --word_timestamps True

sorgfresser commented 1 year ago

Did you use --no_align?

akashmamun610 commented 1 year ago

Thanks for your message. My input was:

!whisperx "/content/drive/MyDrive/Dvd" --language "ru" --model large-v2 --align_model "WAV2VEC2_ASR_LARGE_LV60K_960H" --task translate --max_line_count 2 --max_line_width 42 --highlight_words True

sorgfresser commented 1 year ago

I think I got it. Translate can't align as of now (which I don't think is documented), so we set the no_align boolean to true, and no_align is itself incompatible with highlight_words (that's why I asked about it earlier). As a result, highlight_words can only be used with transcribe right now. Sadly, you have to use transcribe and then translate the result afterwards.
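To make the interaction concrete, here is a minimal sketch of how such a flag conflict can arise in an argparse-based CLI. It is not the actual whisperX source: the parser, the `resolve_options` helper, and the error wording are hypothetical and only mirror the behaviour described above (translate forces no_align, and no_align rules out highlight_words).

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse "True"/"False" style command-line values."""
    lowered = value.lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the three options discussed in this thread.
    parser = argparse.ArgumentParser(prog="whisperx-sketch")
    parser.add_argument("--task", choices=["transcribe", "translate"],
                        default="transcribe")
    parser.add_argument("--no_align", action="store_true")
    parser.add_argument("--highlight_words", type=str2bool, default=False)
    return parser


def resolve_options(argv: list[str]) -> dict:
    parser = build_parser()
    args = parser.parse_args(argv)

    # Assumption taken from the comment above: translated text cannot be
    # aligned, so alignment is switched off implicitly for that task.
    no_align = args.no_align or args.task == "translate"

    # Highlighting needs word-level timestamps, which only alignment provides.
    if args.highlight_words and no_align:
        parser.error("--highlight_words needs word-level timestamps, "
                     "which are unavailable when alignment is skipped")

    return {"task": args.task, "no_align": no_align,
            "highlight_words": args.highlight_words}
```

With this sketch, `resolve_options(["--task", "translate", "--highlight_words", "True"])` exits with an error even though `--no_align` was never passed, which matches the confusing behaviour reported here. Passing a flag the parser does not define, such as `--word_timestamps True`, makes argparse stop with "unrecognized arguments", which is the second error in the original report.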

akashmamun610 commented 1 year ago

Thanks for your reply, @sorgfresser. I think WhisperX has issues with translate. Hopefully this will be solved in the future.

tophee commented 8 months ago

Can highlight_words somehow be used from Python?
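One possible workaround, sketched under assumptions rather than taken from the whisperX API: if you already have word-level timestamps in Python, you can build the highlighted subtitles yourself. The sketch below assumes each segment is a dict with a "words" list whose entries carry "word", "start" and "end" keys, as aligned output is commonly shaped. The helper names `format_timestamp` and `highlighted_srt` are hypothetical, and underlining with `<u>` tags is only one way to mark the current word.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the HH:MM:SS,mmm form used in SRT files."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def highlighted_srt(segments: list[dict]) -> str:
    """Emit one SRT cue per timed word, underlining the word being spoken."""
    cues = []
    for segment in segments:
        words = segment.get("words", [])
        for i, current in enumerate(words):
            # Words without timestamps stay in the text but get no cue.
            if "start" not in current or "end" not in current:
                continue
            text = " ".join(
                f"<u>{w['word']}</u>" if j == i else w["word"]
                for j, w in enumerate(words)
            )
            cues.append((current["start"], current["end"], text))

    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{number}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up sample data, only to show the expected input shape.
    sample = [{"words": [
        {"word": "Hello", "start": 0.0, "end": 0.5},
        {"word": "world", "start": 0.5, "end": 1.25},
    ]}]
    print(highlighted_srt(sample))
```

This leaves gaps between cues wherever there is silence between words, and whether `<u>` is rendered depends on the subtitle player, so treat it as a starting point.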