ubanning closed this issue 1 year ago
Thank you @ubanning for the suggestion.
Just to make sure I got your idea: you are suggesting using the word timestamps to cut long segments into several smaller ones (based on the number of characters per segment), right? For instance, instead of having
36
00:00:10,220 --> 00:00:18,260
Giving away to Grealish, whose camp is clear of Hoibier, and goes over Romero, who is going to walk.
we could have
36
00:00:10,220 --> 00:00:14,840
Giving away to Grealish, whose camp is clear of Hoibier,
37
00:00:14,840 --> 00:00:18,260
and goes over Romero, who is going to walk.
I'm thinking that the best way to achieve this would be to do it in a separate script that would take as input the .words.json files generated by whisper_timestamped, and produce SRT / VTT files (using a max_line_length option).
Would you feel comfortable with that option?
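To make the idea concrete, here is a minimal sketch of such a split. It is not the code of any existing script: the function name `split_segment` and the word layout (dicts with "text", "start" and "end" keys) are assumptions, and the timings in the example are made up.

```python
def split_segment(words, max_length):
    """Greedily group words into pieces whose text fits in max_length characters.

    `words` is assumed to be a list of dicts with "text", "start" and "end" keys.
    A single word longer than max_length is kept as its own piece.
    """
    chunks, current = [], []
    for word in words:
        candidate = " ".join(w["text"] for w in current + [word])
        if current and len(candidate) > max_length:
            # The next word would overflow: close the current piece.
            chunks.append(current)
            current = [word]
        else:
            current.append(word)
    if current:
        chunks.append(current)
    # Each piece starts with its first word and ends with its last word.
    return [
        {
            "start": chunk[0]["start"],
            "end": chunk[-1]["end"],
            "text": " ".join(w["text"] for w in chunk),
        }
        for chunk in chunks
    ]


# Illustrative input with made-up timings.
words = [
    {"text": "and", "start": 0.0, "end": 0.2},
    {"text": "goes", "start": 0.2, "end": 0.5},
    {"text": "over", "start": 0.5, "end": 0.8},
    {"text": "Romero,", "start": 0.8, "end": 1.4},
]
pieces = split_segment(words, max_length=10)
```

Because each piece takes its start and end from its first and last word, the new cues stay aligned with the audio without any re-transcription.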
Exactly, this is what I would like. Your example is perfect.
It can be done whichever way you feel most comfortable with and think is best 😊 I think one thing that could complicate it is the punctuation: ideally a phrase should not get cut in half. (That is not the case in your example, but it may happen in the future.)
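One possible way to handle the punctuation concern, sketched under assumptions: among the prefixes that fit within the limit, prefer the longest one ending in a punctuation mark, and only fall back to a plain length-based cut when there is none. The helper name `choose_cut` and the punctuation set are illustrative, not taken from any existing tool.

```python
# Illustrative set of marks after which a cut is preferred (an assumption).
PUNCTUATION = (",", ".", ";", ":", "!", "?")


def choose_cut(texts, max_length):
    """Return how many leading words to keep in the first piece.

    `texts` is a list of word strings. Words are assumed to be joined with
    single spaces when measuring the length.
    """
    length, last_fit, last_punct = 0, 0, 0
    for i, text in enumerate(texts, start=1):
        length += len(text) + (1 if i > 1 else 0)
        if length > max_length:
            break
        last_fit = i
        if text.endswith(PUNCTUATION):
            last_punct = i
    if last_fit == len(texts):
        return last_fit  # everything fits, no cut needed
    # Prefer a cut after punctuation, then any cut that fits; always keep
    # at least one word so that repeated cutting makes progress.
    return last_punct or last_fit or 1


sentence = ("Giving away to Grealish, whose camp is clear of Hoibier, "
            "and goes over Romero, who is going to walk.")
texts = sentence.split()
cut = choose_cut(texts, max_length=70)
first_piece = " ".join(texts[:cut])
```

With a limit of 70 characters, the first 13 words would fit, but the cut falls back to the comma after "Hoibier," so the piece ends on a natural pause.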
Thanks for the help and the answer.
OK, I made a script in whisper_timestamped/make_subtitles.py (which is called whisper_timestamped_make_subtitles after "setup install") that takes the words.json files produced by "whisper_timestamped" and produces SRT and/or VTT files with a maximum character length for all the segments, and a preference to cut after punctuation marks (as you suggest):
# whisper_timestamped_make_subtitles -h
usage: whisper_timestamped_make_subtitles [-h] [--max_length MAX_LENGTH] [--format {srt,vtt,all}] input output

Convert .word.json transcription files (output of whisper_timestamped) to srt or vtt, being able to cut long segments

positional arguments:
  input                 Input json file, or input folder
  output                Output srt or vtt file, or output folder

optional arguments:
  -h, --help            show this help message and exit
  --max_length MAX_LENGTH
                        Maximum length of a segment in characters (default: 200)
  --format {srt,vtt,all}
                        Output format (if the output is a folder, i.e. not a file with an explicit extension) (default: all)
Feel free to give any feedback.
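For readers who want to see what the SRT output step involves, here is a rough sketch of rendering segments as SRT cues like the ones quoted earlier in this thread. The helpers `format_timestamp` and `to_srt` are hypothetical names for illustration and are not claimed to match make_subtitles.py.

```python
def format_timestamp(seconds, decimal_marker=","):
    """Format seconds as HH:MM:SS,mmm (SRT style); VTT uses "." as the marker."""
    milliseconds = round(seconds * 1000)
    hours, milliseconds = divmod(milliseconds, 3_600_000)
    minutes, milliseconds = divmod(milliseconds, 60_000)
    secs, milliseconds = divmod(milliseconds, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{milliseconds:03d}"


def to_srt(segments, first_index=1):
    """Render segments (dicts with "start", "end", "text") as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=first_index):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}\n")
    # Cues are separated by one blank line.
    return "\n".join(blocks)


segments = [
    {"start": 10.22, "end": 14.84,
     "text": "Giving away to Grealish, whose camp is clear of Hoibier,"},
    {"start": 14.84, "end": 18.26,
     "text": "and goes over Romero, who is going to walk."},
]
srt_text = to_srt(segments, first_index=36)
```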
Thanks @ubanning
Hello @Jeronymous,
I'm trying to create shorter subtitles with your instructions but I'm currently failing.
Here's my input:
whisper_timestamped_make_subtitles --max_length 43 example.words.json ./vtt
And here's the result:
Traceback (most recent call last):
  File "C:\Users\dorian.baret\AppData\Local\Programs\Python\Python39\Scripts\whisper_timestamped_make_subtitles-script.py", line 33, in <module>
    sys.exit(load_entry_point('whisper-timestamped==1.9.1', 'console_scripts', 'whisper_timestamped_make_subtitles')())
  File "c:\users\dorian.baret\appdata\local\programs\python\python39\lib\site-packages\whisper_timestamped\make_subtitles.py", line 121, in cli
    segments = split_long_segments(segments, args.max_length, use_space=use_space)
  File "c:\users\dorian.baret\appdata\local\programs\python\python39\lib\site-packages\whisper_timestamped\make_subtitles.py", line 22, in split_long_segments
    assert len(words) == len(meta_words)
AssertionError
Is there something wrong with my input, or is this a bug perhaps?
Thanks for reporting with all the details.
Your command line seems to be correct, so it seems to be a bug (a corner case that is not handled well).
Is there a chance that you can provide the example.words.json? (You can attach it here in a zip.)
Bonjour Jérôme,
Thank you for your answer, unfortunately I can't share the files as it's non-public corporate content.
However, I think I've identified the issue: the script fails every time I feed it a video in French. The videos from my company are in French, and I also tried using some videos from YouTube in French (three different ones, from different channels), and they all fail.
Spanish works fine, so it doesn't seem to be all videos that are not in English.
I am French and tested the stuff quite thoroughly in French (with things like "Dis-moi, est-ce que l'avion vole?"), so that does not help... I would be surprised if it failed for you on every video in French...
Anyway, I've just pushed a fix that should also print a "WARNING: xxx != yyy" for these corner cases that I don't understand. If you can share one of these corner cases (anonymizing some parts if necessary), I would appreciate it, so that I can understand.
"I also tried using some videos from YouTube in French (three different ones, from different channels)"
Or maybe you can share the "words.json" files for these videos that do not seem to be non-public
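If sharing the files is not possible, a small diagnostic along these lines might help narrow down which segments are affected. This is only a sketch: it assumes the words.json layout is a top-level "segments" list whose items carry a "text" string and a "words" list of dicts with a "text" key, and it assumes the failing assertion relates to the whitespace-split segment text not lining up with the word list. The function name `find_mismatched_segments` is made up.

```python
import json


def find_mismatched_segments(path):
    """List segments whose whitespace-split text differs from their word list."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    mismatches = []
    for segment in data.get("segments", []):
        from_text = segment.get("text", "").split()
        from_words = [w["text"] for w in segment.get("words", [])]
        if from_text != from_words:
            mismatches.append((segment.get("id"), from_text, from_words))
    return mismatches
```

Running it on a failing file and posting only the reported segments (anonymized if needed) would avoid sharing the whole transcription.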
I've just updated and it does seem to work now, except that there now seems to be an issue with the encoding of accents.
Here's a corner case (with wonky accents):
WARNING: Je peux cliquer dessus pour les masquer, ou je peux directement à partir de là , sélectionner tout ce qui est usinable ou pas usinable. != Je peux cliquer dessus pour les masquer, ou je peux directement à partir de là , sélectionner tout ce qui est usinable ou pas usinable.
And here are two of the JSON files that failed this morning: words_whisper_timestamped_french.zip
Oh I see: your default encoding is not UTF-8, and I was not explicitly setting the file encoding when reading/writing in make_subtitles.py.
It should be fixed now.
(and it was not failing for me on the json you sent, so I guess it was also an encoding issue)
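For anyone hitting something similar in their own scripts, here is a generic sketch of the kind of fix described: pass the encoding explicitly instead of relying on the platform default (which on Windows is often a legacy code page). The helper names are hypothetical and this is not the actual patch to make_subtitles.py.

```python
import json


def load_words(path):
    """Read a JSON file as UTF-8, regardless of the platform default encoding."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def write_subtitles(path, content):
    """Write subtitle text as UTF-8, regardless of the platform default encoding."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)


# How accents get garbled: UTF-8 bytes decoded with a Windows code page.
garbled = "sélectionner".encode("utf-8").decode("cp1252")
```

The last line reproduces the effect on purpose: the two UTF-8 bytes of "é" are read back as two separate characters.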
Many thanks!
Works like a charm now, thank you!
Works perfectly, many thanks 👍
Hello, first of all thanks for your work, I'm here to give a suggestion.
With word-level timestamps, I think it would be possible to add a character limit per line/cue in SRT and VTT subtitle files, rather than just inserting a simple line break.
Depending on the audio, a line can exceed 200 characters, and I believe this implementation would fix that problem.
If it's not possible to add this parameter, could you, when you have time, provide me with some code that would make this idea work? (I don't have a programming background, so this is a bit difficult for me.)
Here's a discussion on the subject on Whisper so you can understand a little better: Improve default line lengths in subtitle files
Thanks.