faissaloo / SponSkrub

Strip advertisements from downloaded YouTube videos
GNU General Public License v3.0

Subtitles are misaligned after cutting #21

Open faissaloo opened 3 years ago

faissaloo commented 3 years ago

After cutting a video, the subtitles end up misaligned. We'll need to extract them like so: https://superuser.com/a/927507/889417, then figure out how to cut them and add them to the new video.
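As a rough illustration of the "figure out how to cut them" step, here is a minimal Python sketch of the timestamp arithmetic, independent of any subtitle format. The function name `remap_cue` and the representation of cuts as sorted, non-overlapping `(start, end)` pairs in seconds are assumptions made for this example; none of this is SponSkrub code.

```python
from typing import List, Optional, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def remap_cue(start: float, end: float, cuts: List[Segment]) -> Optional[Segment]:
    """Map a cue's time range onto the timeline left after removing `cuts`.

    `cuts` is assumed to be sorted and non-overlapping. Returns None when
    the cue lies entirely inside removed material.
    """

    def shift(t: float) -> float:
        removed = 0.0
        for cut_start, cut_end in cuts:
            if t >= cut_end:
                # The whole cut lies before t, so t moves back by its length.
                removed += cut_end - cut_start
            elif t > cut_start:
                # t falls inside the cut: snap it to the point where the cut begins.
                removed += t - cut_start
        return t - removed

    new_start, new_end = shift(start), shift(end)
    if new_end <= new_start:
        return None  # nothing of the cue survives the cut
    return new_start, new_end
```

For example, with a single cut from 10 s to 20 s, a cue spanning 25 s to 30 s would move to 15 s to 20 s, while a cue spanning 12 s to 18 s would be dropped.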

Example subtitle file, downloaded via youtube-dl --all-subs --skip-download https://www.youtube.com/watch\?v\=Ye8mB6VsUHw

WEBVTT
Kind: captions
Language: en

00:00:00.832 --> 00:00:04.480
COOKIE MONSTER: Now, what starts
with the letter C?

00:00:04.480 --> 00:00:08.230
Cookie starts with C. Let's
think of other things that

pukkandan commented 3 years ago

This may be quite difficult to implement fully, since youtube-dl supports embedding multiple subtitle formats (srt, vtt, ttml, srv, etc.).

faissaloo commented 3 years ago

Yeah, either I'll have to write parsers for each format or see if I can get youtube-dl to force them into a single format.
Based on what I've been thinking regarding #22 we probably shouldn't need to be parsing subtitles directly and it seems like ffmpeg can cut subs? https://stackoverflow.com/questions/21554541/cut-parts-of-subtitle-file-using-ffmpeg
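A sketch of what that ffmpeg approach might look like, building the command line in Python. It relies only on ffmpeg's standard `-i`, `-ss` and `-to` options; the helper name `ffmpeg_cut_subs_cmd` and the file names are hypothetical, and this keeps only a single time range, so it has not been checked against the multi-cut case SponSkrub needs.

```python
from typing import List


def ffmpeg_cut_subs_cmd(src: str, dst: str, start: float, end: float) -> List[str]:
    """Build an ffmpeg command that keeps only [start, end] of a subtitle file.

    -ss and -to are placed after -i, so they apply to the output file.
    Times are passed as seconds with millisecond precision.
    """
    return [
        "ffmpeg",
        "-i", src,
        "-ss", f"{start:.3f}",
        "-to", f"{end:.3f}",
        dst,
    ]


# Hypothetical usage; pass the list to subprocess.run(cmd, check=True) to execute it.
cmd = ffmpeg_cut_subs_cmd("subs.en.vtt", "subs.cut.en.vtt", 0.0, 4.48)
```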

pukkandan commented 3 years ago

> I can get youtube-dl to force them into a single format.

Yes, you can. There is --convert-subs. But the user might not want to.

Also, something I forgot to mention in my previous comment: the video may also contain multiple subs, each possibly in a different format.
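To see which subtitle streams and formats a given file actually carries, one option is to ask ffprobe for JSON and read the codec names. The following is only a sketch: the ffprobe options used are standard ones, but the helper name `parse_subtitle_streams` and the sample JSON are made up for illustration.

```python
import json
from typing import List, Tuple

# Append the video path to this list and run it to get JSON describing
# only the subtitle streams of the file.
FFPROBE_SUBS_CMD = [
    "ffprobe", "-v", "error",
    "-select_streams", "s",
    "-show_entries", "stream=index,codec_name",
    "-of", "json",
]


def parse_subtitle_streams(ffprobe_json: str) -> List[Tuple[int, str]]:
    """Return (stream index, codec name) for each stream in ffprobe's JSON output."""
    data = json.loads(ffprobe_json)
    return [(s["index"], s["codec_name"]) for s in data.get("streams", [])]


# Hypothetical output for a file with two subtitle tracks in different formats.
sample = '{"streams": [{"index": 2, "codec_name": "webvtt"}, {"index": 3, "codec_name": "subrip"}]}'
streams = parse_subtitle_streams(sample)
```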

> we probably shouldn't need to be parsing subtitles directly and it seems like ffmpeg can cut subs? https://stackoverflow.com/questions/21554541/cut-parts-of-subtitle-file-using-ffmpeg

Huh, interesting. Though that thread doesn't mention more complex filters, it seems worth looking into. That would help avoid a lot of the headaches of trying to implement a full subtitle parser.