spaam / svtplay-dl

Small command-line program to download videos from some streaming sites.
https://svtplay-dl.se
MIT License

Subtitles downloaded from TV4 Play will sometimes be merged when they should not be #1589

Open andersm1019 opened 10 months ago

andersm1019 commented 10 months ago

svtplay-dl versions:

4.69

Operating system and Python version:

PRETTY_NAME="Raspbian GNU/Linux 10 (buster)"
NAME="Raspbian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=raspbian
ID_LIKE=debian

Python 2.7.16
Python3 3.7.3

What is the issue:

When downloading subtitles from TV4 Play, two consecutive subtitles with the same text are sometimes combined into one when they should remain separate.

Example: Episode 5 of Tinka och själens spegel

svtplay-dl -S "https://www.tv4play.se/video/791ac92deef1e71c220e/flora"

What it should be like:

44
00:05:32,480 --> 00:05:35,680
-Hallå?
-Det är jag.

45
00:05:57,920 --> 00:06:01,560
-Hallå?
-Det är jag.

What svtplay-dl produces (note that this subtitle stays on screen for 29 seconds!):

44
00:05:32,480 --> 00:06:01,560
-Hallå?
-Det är jag.

spaam commented 10 months ago

Yeah, I see the issue. 🤔 It is related to the split subtitle files: we look at the time code, then the text. If the text is the same, we just update the end time code (as you can see).
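The behaviour described above can be illustrated with a minimal sketch. This is not svtplay-dl's actual code; the `Cue` class and the `merge_same_text` function are hypothetical names made up for the illustration, and times are given in seconds.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """Hypothetical subtitle cue; times are in seconds."""
    start: float
    end: float
    text: str


def merge_same_text(cues):
    """Merge consecutive cues whenever their text matches, ignoring timing."""
    merged = []
    for cue in cues:
        if merged and merged[-1].text == cue.text:
            # Same text as the previous cue: only the end time is updated.
            merged[-1].end = cue.end
        else:
            merged.append(Cue(cue.start, cue.end, cue.text))
    return merged


# The two cues from the report (00:05:32,480 and 00:05:57,920).
cues = [
    Cue(332.48, 335.68, "-Hallå?\n-Det är jag."),
    Cue(357.92, 361.56, "-Hallå?\n-Det är jag."),
]
result = merge_same_text(cues)
```

With this logic the two cues collapse into a single cue running from 00:05:32,480 to 00:06:01,560, which matches the output in the report.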

andersm1019 commented 10 months ago

I think that the solution would be to look at the end time of the first subtitle, and the start time of the next. Only combine them when the difference is zero or very small.
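A rough sketch of that idea, not a patch against svtplay-dl: the `TimedCue` type, the `merge_adjacent` function, and the 100 ms threshold are all assumptions chosen for illustration. Times are integer milliseconds to avoid floating-point comparisons.

```python
from typing import List, NamedTuple


class TimedCue(NamedTuple):
    """Hypothetical subtitle cue; times are in milliseconds."""
    start_ms: int
    end_ms: int
    text: str


MAX_GAP_MS = 100  # assumed threshold for "zero or very small"


def merge_adjacent(cues: List[TimedCue], max_gap_ms: int = MAX_GAP_MS) -> List[TimedCue]:
    """Merge consecutive cues only if the text matches AND the gap is small."""
    merged: List[TimedCue] = []
    for cue in cues:
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.text == cue.text
            and cue.start_ms - prev.end_ms <= max_gap_ms
        ):
            # Continuation of the same subtitle: extend the end time.
            merged[-1] = prev._replace(end_ms=max(prev.end_ms, cue.end_ms))
        else:
            merged.append(cue)
    return merged


text = "-Hallå?\n-Det är jag."

# The cues from the report: a gap of about 22 seconds, so they stay separate.
separate = merge_adjacent([
    TimedCue(332480, 335680, text),
    TimedCue(357920, 361560, text),
])

# A cue split across two subtitle files with no gap: still merged.
joined = merge_adjacent([
    TimedCue(332480, 334000, text),
    TimedCue(334000, 335680, text),
])
```

The right threshold depends on how the split subtitle files overlap at segment boundaries, so the value above is only a placeholder.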