Open alundgard opened 1 month ago
This Whisper output contains caption segments that are too long. Although they stay on screen long enough to be read, accessibility guidelines recommend a maximum of 42 characters per line and a maximum of 2 lines per segment.
Run different parameter combinations on a media test item and observe the vtt output formatting, attending to subtitle accessibility guidelines and readability.
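To make the comparison between runs less manual, a small checker along these lines could flag cues in a vtt file that exceed the limits mentioned above. This is only a sketch: the function name `find_violations` and the simple blank-line cue parsing are assumptions for illustration, not part of Whisper, and the parser does not strip inline tags or handle every WebVTT feature.

```python
import re

# Limits taken from the guideline quoted in this issue.
MAX_CHARS_PER_LINE = 42
MAX_LINES_PER_CUE = 2

# A cue timing line looks like "00:01.000 --> 00:04.000" (hours are optional).
TIMING = re.compile(r"^\S+\s+-->\s+\S+")


def find_violations(vtt_text, max_chars=MAX_CHARS_PER_LINE, max_lines=MAX_LINES_PER_CUE):
    """Return (timing, problem) tuples for cues that break the limits."""
    violations = []
    # Blocks in a vtt file are separated by blank lines.
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.splitlines()
        # Blocks without a timing line (the WEBVTT header, NOTE blocks) are skipped.
        timing_idx = next((i for i, line in enumerate(lines) if TIMING.match(line)), None)
        if timing_idx is None:
            continue
        timing = lines[timing_idx].strip()
        text_lines = lines[timing_idx + 1:]
        if len(text_lines) > max_lines:
            violations.append((timing, f"{len(text_lines)} lines (max {max_lines})"))
        for line in text_lines:
            if len(line) > max_chars:
                violations.append((timing, f"{len(line)} chars (max {max_chars})"))
    return violations
```

Running this over the vtt file from each parameter combination gives a rough count of offending cues per run. It does not replace reading the captions for readability.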
Whisper.writer parameters
Subtitle formatting parameters are input to the writer (obtained by whisper.utils.get_writer), and not the model (model.transcribe). NB: To use these writer parameters, word_timestamps must be set to True as input to model.transcribe.
max_line_width: the maximum number of characters in a line before breaking the line (default: None)
max_line_count: the maximum number of lines in a segment (default: None)
max_words_per_line: the maximum number of words in a segment (default: None)
Preliminary parameter testing and vtt output: Pre-pilot parameter testing (Local).
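For reference, the writer parameters listed above might be wired together roughly as follows. This is a sketch, not the code used in the pre-pilot testing: the helper names `build_writer_options` and `write_vtt`, the `"base"` model name, and the paths are placeholders. The `highlight_words` key is included only on the assumption that some Whisper releases expect it to be present in the options dict.

```python
import os


def build_writer_options(max_line_width=42, max_line_count=2, max_words_per_line=None):
    """Collect the subtitle formatting options passed to the writer call."""
    return {
        "max_line_width": max_line_width,
        "max_line_count": max_line_count,
        "max_words_per_line": max_words_per_line,
        # Assumption: kept explicit because some Whisper versions read this key directly.
        "highlight_words": False,
    }


def write_vtt(media_path, output_dir, model_name="base", **formatting):
    """Transcribe media_path and write a vtt file into output_dir."""
    # Imported lazily so build_writer_options works without Whisper installed.
    import whisper
    from whisper.utils import get_writer

    os.makedirs(output_dir, exist_ok=True)
    model = whisper.load_model(model_name)
    # word_timestamps=True is required for the writer parameters to take effect.
    result = model.transcribe(media_path, word_timestamps=True)
    writer = get_writer("vtt", output_dir)
    writer(result, media_path, build_writer_options(**formatting))
```

A call such as `write_vtt("test_item.mp4", "out", max_line_width=42, max_line_count=2)` would then produce one vtt file per run (the file name here is hypothetical). Whisper's command-line help describes max_words_per_line as having no effect when max_line_width is set, so the two are probably best tested in separate runs. That behavior is worth confirming against the installed version.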
Questions
max_words_per_line may improve readability by breaking caption segments on complete sentences (see the Relevant links below). Should we use this parameter instead of max_line_width=42 and max_line_count=2?
Relevant links