m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License

Limit the length of my subtitles #651

Open Root-FTW opened 9 months ago

Root-FTW commented 9 months ago

How can I make my subtitles 1 line long and no more than 5 words?

This is the command I use on Windows with an Nvidia GTX 1080 graphics card. The result is multi-line subtitles with many words each, which makes for a very long paragraph to read.

whisperx video.mp4 --model large-v3 --highlight_words False --align_model WAV2VEC2_ASR_LARGE_LV60K_960H --compute_type int8 --max_line_width 42 --max_line_count 2 --output_format srt

jim60105 commented 9 months ago

Maybe --chunk_size is what you need. https://github.com/m-bain/whisperX/blob/f8cc46c6f7fa3b8509bc6aa04cdf4a62a702bb42/whisperx/transcribe.py#L44

stri8ed commented 7 months ago

> Maybe --chunk_size is what you need. https://github.com/m-bain/whisperX/blob/f8cc46c6f7fa3b8509bc6aa04cdf4a62a702bb42/whisperx/transcribe.py#L44

Wouldn't this cause worse transcription results, since context is lost with smaller chunks of audio? It seems like it would make more sense to split the output after transcription.
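
For what it's worth, splitting after transcription could look roughly like the sketch below. It is not part of whisperX: the helper names (`split_words_into_cues`, `format_timestamp`, `to_srt`) are made up for illustration, and it assumes you already have a flat list of word dicts with `word`, `start` and `end` keys, which is the shape I would expect from the aligned output. Words that could not be aligned may be missing timestamps, so the sketch takes the cue times from the words that do have them.

```python
def split_words_into_cues(words, max_words=5):
    """Group word dicts into one-line cues of at most `max_words` words.

    Assumption: each item looks like {"word": str, "start": float, "end": float};
    "start"/"end" may be missing for words that could not be aligned.
    """
    cues = []
    for i in range(0, len(words), max_words):
        group = words[i:i + max_words]
        # Take cue timing from the words in this group that carry timestamps.
        timed = [w for w in group if "start" in w and "end" in w]
        if not timed:
            # No timing information at all for this group; this sketch skips it.
            continue
        cues.append({
            "start": timed[0]["start"],
            "end": timed[-1]["end"],
            "text": " ".join(w["word"].strip() for w in group),
        })
    return cues


def format_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(cues):
    """Render cues as SRT text: index, time range, then a single line of text."""
    blocks = []
    for n, cue in enumerate(cues, start=1):
        time_range = f"{format_timestamp(cue['start'])} --> {format_timestamp(cue['end'])}"
        blocks.append(f"{n}\n{time_range}\n{cue['text']}\n")
    return "\n".join(blocks)
```

This way the transcription still runs on full-length chunks and only the subtitle grouping changes. A fixed word count will cut across sentence boundaries, so you may prefer to also break on punctuation or on long pauses between words.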