xhlove / dash-subtitle-extractor

Dash Embeded Stream Subtitle Extractor

ValueError: substring not found #5

Open VCDEV1 opened 1 year ago

VCDEV1 commented 1 year ago

Hi again,

First of all, thanks again for your tool! Since the last error was fixed, the tool has worked perfectly with all the subtitles I needed to download, but another subtitle from the same website is producing an error again.

The mpd link of the dash subtitle I am trying to download is: https://streaming-vod-aha.akamaized.net/content/pubcontent/vol/A68ACF9B-AEB6-4A6F-8E93-3DEE76881479/1658819319562-output_cenc_dash.ism/index.mpd?pipeline=ttml_removal&filter=%28%28type%3D%3D%22video%22%26%26FourCC%3D%3D%22AVC1%22%29%7C%7C%28type%3D%3D%22audio%22%26%26FourCC%3D%3D%22AACL%22%29%7C%7C%28type%3D%3D%22textstream%22%29%29 (Link is still valid)

I tried the same two methods as before:

  1. By making a folder (after splitting the mp4 container) with the init file and the segments. I tried with both ttml and wvtt formats. The result is shown below.

image

  2. By making a folder with only the init file and the mp4 container (without splitting the mp4 container). I again tried both the ttml and wvtt formats, but an error message was still shown, and this time the error messages were slightly different. The result is shown below.

image

I was using the latest code and the files I used will be attached below!

aha-with-segments.zip aha-with-mp4-container.zip

If you could look into this again and add support for it, that would be really helpful. Thank you.

xhlove commented 1 year ago

try https://github.com/nilaoda/Mp4SubtitleParser

VCDEV1 commented 1 year ago

@xhlove It seems to be working. But I mainly use your tool directly from my script on Linux. Since that tool only ships as an exe file, I am unable to add it to my script, which does not run on Windows. If you have time, could you please take a look and fix it if possible? The main error seems to be the one in the first screenshot, which is the ValueError: substring not found.

VCDEV1 commented 1 year ago
  1. By making a folder (after splitting the mp4 container) with the init file and the segments. I tried with both ttml and wvtt formats. The result is shown below.

image

It seems the error occurs in pyshaka\text\VttTextParser.py, line 91, when the subtitle payload contains a "/" as a normal character instead of as part of a colour tag. For example, here it seems to happen because the subtitle I am trying to download has the sentence "take 300/-." in line 365. Here is the converted subtitle produced by Mp4SubtitleParser: https://transfer.sh/lmptGJ/output.vtt
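
To illustrate the kind of failure described above, here is a minimal sketch in Python. It is not the actual code from VttTextParser.py: the functions `naive_close_tag_end` and `strip_tags` and the pattern `TAG_RE` are hypothetical names made up for this example. It assumes the parser treats any "/" as the start of a closing tag and then looks for ">" with `str.index`, which raises ValueError: substring not found when no ">" follows.

```python
import re

# Hypothetical reproduction, NOT the real pyshaka code: it assumes the
# parser treats every "/" as the start of a closing tag.
def naive_close_tag_end(payload: str) -> int:
    slash = payload.index("/")
    # str.index raises ValueError("substring not found") when ">" is absent
    return payload.index(">", slash)

# One possible guard (an assumption, not the fix that was committed):
# only strip text that actually looks like a tag, e.g. <c.yellow> or </c>.
TAG_RE = re.compile(r"</?[A-Za-z][^<>]*>")

def strip_tags(payload: str) -> str:
    """Remove tag-like spans and leave a bare "/" in the text untouched."""
    return TAG_RE.sub("", payload)

if __name__ == "__main__":
    try:
        naive_close_tag_end("take 300/-.")
    except ValueError as err:
        print("naive version failed:", err)
    print(strip_tags("<c.yellow>take 300/-.</c>"))
```

With a payload such as "take 300/-.", the naive lookup fails, while the pattern-based version leaves the slash alone because it is not preceded by "<".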

xhlove commented 1 year ago

fixed