Closed bnw42 closed 1 year ago
A bug report has been posted to the youtube-dl site
link?
I can reproduce. Am I misinterpreting something, or is it strange that youtubedl_smuggle
is in the url it tries to extract?
yt-dlp https://www.sbs.com.au/ondemand/watch/1823195203548 --verbose
[debug] Command-line config: ['https://www.sbs.com.au/ondemand/watch/1823195203548', '--verbose']
[debug] User config "C:\Users\jaybu\AppData\Roaming\yt-dlp\config.txt": ['--ffmpeg-location', 'C:\\Users\\jaybu\\ffmpeg\\bin', '-P', 'C:\\Users\\jaybu\\youtube.dl']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.03.04 [392389b7d] (win_exe)
[debug] Python 3.8.10 (CPython AMD64 64bit) - Windows-10-10.0.22621-SP0 (OpenSSL 1.1.1k 25 Mar 2021)
[debug] exe versions: ffmpeg N-106498-g854615adf2-20220405 (setts), ffprobe N-106498-g854615adf2-20220405, phantomjs 2.1.1
[debug] Optional libraries: Cryptodome-3.17, brotli-1.0.9, certifi-2022.12.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1786 extractors
[SBS] Extracting URL: https://www.sbs.com.au/ondemand/watch/1823195203548
[SBS] 1823195203548: Downloading JSON metadata
[ThePlatform] Extracting URL: http://link.theplatform.com/s/Bgtm9B/uOlBz65O39ey?feed=Video%20-%20Single&mbr=true&manifest=m3u&ord=4446663&policy=11929623&dfptag=sz%3D530x298%26iu%3D%2F4117%2Fvideo.entertainment.sbs.com.au%2Fsec30htmlweb%26ciu_szs%26impl%3Ds%26gdfp_req%3D1%26env%3Dvp%26output%3Dxml_vast2%26unviewed_position_start%3D1%26url%3Dwww.sbs.com.au%26description_url%3DSBS%26cust_params%3Dtype%253Dpreroll%26ad_rule%3D0%26cmsid%3D531%26nofb%3D1%26url%3Dhttp%253A%252F%252Fwww.sbs.com.au%252Fondemand%252Fvideo%252Fsingle%252F1823195203548%26description_url%3DSBS%26correlator%3D--ORD--%26vid%3D1823195203548&dfpmidtag=sz%3D530x298%26iu%3D%2F4117%2Fvideo.entertainment.sbs.com.au%2Fsec30midrollhtmlweb%26ciu_szs%26impl%3Ds%26gdfp_req%3D1%26env%3Dvp%26output%3Dxml_vast2%26unviewed_position_start%3D1%26url%3Dwww.sbs.com.au%26description_url%3DSBS%26cust_params%3Dtype%253Dmidroll%26ad_rule%3D0%26cmsid%3D531%26nofb%3D1%26url%3Dhttp%253A%252F%252Fwww.sbs.com.au%252Fondemand%252Fvideo%252Fsingle%252F1823195203548%26description_url%3DSBS%26correlator%3D--ORD--%26vid%3D1823195203548#__youtubedl_smuggle=%7B%22force_smil_url%22%3A+true%7D
[ThePlatform] uOlBz65O39ey: Downloading SMIL data
[ThePlatform] uOlBz65O39ey: Downloading MPD manifest
WARNING: [ThePlatform] Failed to download MPD manifest: HTTP Error 403: Forbidden
[ThePlatform] uOlBz65O39ey: Downloading JSON metadata
ERROR: [ThePlatform] uOlBz65O39ey: No video formats found!; please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
Traceback (most recent call last):
File "yt_dlp\YoutubeDL.py", line 1518, in wrapper
File "yt_dlp\YoutubeDL.py", line 1615, in __extract_info
File "yt_dlp\YoutubeDL.py", line 1727, in process_ie_result
File "yt_dlp\YoutubeDL.py", line 1674, in process_ie_result
File "yt_dlp\YoutubeDL.py", line 2615, in process_video_result
File "yt_dlp\YoutubeDL.py", line 1046, in raise_no_formats
yt_dlp.utils.ExtractorError: [ThePlatform] uOlBz65O39ey: No video formats found!; please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
Workaround: use the android HLS
diff --git a/yt_dlp/extractor/sbs.py b/yt_dlp/extractor/sbs.py
index 45320339d..eb4b918f5 100644
--- a/yt_dlp/extractor/sbs.py
+++ b/yt_dlp/extractor/sbs.py
@@ -91,12 +91,11 @@ def _real_extract(self, url):
             raise ExtractorError('%s said: %s' % (self.IE_NAME, error_message), expected=True)
 
         urls = player_params['releaseUrls']
-        theplatform_url = (urls.get('progressive') or urls.get('html')
+        theplatform_url = (urls.get('progressive') or urls.get('htmlandroid')
                            or urls.get('standard') or player_params['relatedItemsURL'])
 
         return {
             '_type': 'url_transparent',
-            'ie_key': 'ThePlatform',
             'id': video_id,
             'url': smuggle_url(self._proto_relative_url(theplatform_url), {'force_smil_url': True}),
             'is_live': player_params.get('streamType') == 'live',
I don't expect this to be committed, since the Android HLS could be lower quality (I haven't verified this). As such, I didn't put any effort into this code. If my workaround is deemed acceptable, feel free to ping me (on Discord), and I will happily make a proper PR
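For readers unfamiliar with the `a or b or c` idiom in the patch, here is a minimal standalone sketch of the same fallback chain. The helper name pick_release_url and the sample dictionary are illustrative assumptions and are not part of sbs.py; only the key names are taken from the patch above.

```python
def pick_release_url(urls, preferred=("progressive", "htmlandroid", "standard")):
    """Return the first non-empty URL among the preferred keys, else None."""
    for key in preferred:
        value = urls.get(key)
        if value:  # skips missing keys, None and empty strings, just like `or`
            return value
    return None


# Made-up sample data; the real 'releaseUrls' values are theplatform links.
sample = {"html": "//example.invalid/html", "htmlandroid": "//example.invalid/android"}
print(pick_release_url(sample))
```

With the patched key order, the 'html' entry is never consulted, which is why the Android HLS link is the one that ends up being passed on.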
gamer191 - I don't know where "smuggle" came from but "youtubedl" appears to have come from your user configuration file, viz:
[debug] User config "C:\Users\jaybu\AppData\Roaming\yt-dlp\config.txt": ['--ffmpeg-location', 'C:\Users\jaybu\ffmpeg\bin', '-P', 'C:\Users\jaybu\youtube.dl']
I can't check ATM (at work behind a heavy duty corp firewall), but when I played something last night in a browser (after logging in) and then using a video stream detector, I was able to get the master manifest which yt-dlp was able to happily download. From memory it was via akamai. So the issue isn't actually downloading SBS videos, it is in converting the SBS url for the video into the actual URL, hence the comment on the youtube-dl forum, as in the OP.
Whether related or not, since about 10-14 days ago there has been a mismatch between the reported filesize, data & bit rates for SBS videos and what actually downloads. Since SBS has evidently downgraded the quality of their videos quite a bit by reducing the bit/data rates (as clear via watching them in a browser), this doesn't appear to be a yt-dlp bug so I've not posted an issue. I'm just noting it here as evidence of further changes at SBS and something that may or may not be related to the current issue.
Check the penultimate line of the patch code above to see where the contraband fragment comes from.
This mechanism is used to pass options with a URL from a user-facing extractor (eg SBS) to a video host extractor (eg ThePlatform) through a url or url_transparent extractor result.
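To make the "smuggling" less mysterious, here is a stdlib-only sketch that reproduces the fragment seen in the verbose log. The function names smuggle and unsmuggle and the example URL are assumptions for illustration; this is not the yt-dlp utility code, only an approximation of the format it emits.

```python
import json
import urllib.parse

MARKER = "__youtubedl_smuggle"


def smuggle(url, data):
    """Append `data` to `url` as a URL-encoded JSON fragment."""
    return url + "#" + urllib.parse.urlencode({MARKER: json.dumps(data)})


def unsmuggle(smuggled, default=None):
    """Split a smuggled URL back into (plain_url, data)."""
    if "#" + MARKER not in smuggled:
        return smuggled, default
    url, _, fragment = smuggled.rpartition("#")
    data = json.loads(urllib.parse.parse_qs(fragment)[MARKER][0])
    return url, data


# Hypothetical URL; the option name comes from the patch above.
smuggled = smuggle("http://link.theplatform.com/s/example", {"force_smil_url": True})
print(smuggled)
print(unsmuggle(smuggled))
```

The fragment is never sent to the server, so it is a convenient place to carry extractor-to-extractor options.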
For devs out of region, note that the API used by the extractor isn't geo-blocked, and so the various media links tested in the patch code can easily be inspected.
player_params['relatedItemsURL']
This wouldn't be a useful substitute to judge from the one I checked: just a JSON playlist of similar shows.
player_params['relatedItemsURL']
my patch doesn't touch that line, only the line above it. EDIT: it was added in https://github.com/yt-dlp/yt-dlp/commit/3c283a381e4f7a69bf57c3ea85aab3c85ce0e309 By the way, my patch is a workaround. I doubt that using the Android hls is an acceptable fix. What do you think?
I downloaded an episode of a daily show using the android work-around and compared it to previous 'normal' ones and it's about 1/3rd smaller in filesize. Whether that's due to the android 'version' or the SBS video downgrade noted above I can't say though. I can't say I noticed it being hugely lower quality when I watched it.
From the yt-dl thread, a new API should be used.
The htmlandroid links include a sz query parameter that implies the size would be ~ 550x300.
my patch doesn't touch that line, only the line above it. EDIT: it was added in 3c283a3
Sure, I guess the suspect alternative has never been reached.
Thanks for the "Workaround: use the android HLS" patch (above). I am relatively new to Python (Windows 10 and Ubuntu 22 via WSL). Is there a guide as to how I can implement this patch? I can use Git on both of my platforms. Thanks
Ok, I worked it out. Followed the official steps to regenerate the .exe from the source tree (after changing the sbs.py code). Worked! Thanks for the patch
And for those of us not up to python coding and compiling? :)
Confirming the issue -
yt-dlp -v https://www.sbs.com.au/ondemand/watch/2170807363789
[debug] Command-line config: ['-v', 'https://www.sbs.com.au/ondemand/watch/2170807363789']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.03.04 [392389b7d]
[debug] Python 3.9.2 (CPython x86_64 64bit) - Linux-6.2.5-x64v3-xanmod1-x86_64-with-glibc2.31 (OpenSSL 1.1.1n 15 Mar 2022, glibc 2.31)
[debug] exe versions: ffmpeg 4.3.5-0, ffprobe 4.3.5-0, rtmpdump 2.4
[debug] Optional libraries: Cryptodome-3.9.7, brotli-1.0.9, certifi-2022.09.24, mutagen-1.45.1, pyxattr-0.7.2, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1786 extractors
[SBS] Extracting URL: https://www.sbs.com.au/ondemand/watch/2170807363789
[SBS] 2170807363789: Downloading JSON metadata
[ThePlatform] Extracting URL: http://link.theplatform.com/s/Bgtm9B/gQ6Y3PTQpRro?feed=Video%20-%20Single&mbr=true&manifest=m3u&ord=6908369&policy=11929623&dfptag=sz%3D530x298%26iu%3D%2F4117%2Fvideo.factual.sbs.com.au%2Fsec30htmlweb%26ciu_szs%26impl%3Ds%26gdfp_req%3D1%26env%3Dvp%26output%3Dxml_vast2%26unviewed_position_start%3D1%26url%3Dwww.sbs.com.au%26description_url%3DSBS%26cust_params%3Dtype%253Dpreroll%26ad_rule%3D0%26cmsid%3D531%26nofb%3D1%26url%3Dhttp%253A%252F%252Fwww.sbs.com.au%252Fondemand%252Fvideo%252Fsingle%252F2170807363789%26description_url%3DSBS%26correlator%3D--ORD--%26vid%3D2170807363789&dfpmidtag=sz%3D530x298%26iu%3D%2F4117%2Fvideo.factual.sbs.com.au%2Fsec30midrollhtmlweb%26ciu_szs%26impl%3Ds%26gdfp_req%3D1%26env%3Dvp%26output%3Dxml_vast2%26unviewed_position_start%3D1%26url%3Dwww.sbs.com.au%26description_url%3DSBS%26cust_params%3Dtype%253Dmidroll%26ad_rule%3D0%26cmsid%3D531%26nofb%3D1%26url%3Dhttp%253A%252F%252Fwww.sbs.com.au%252Fondemand%252Fvideo%252Fsingle%252F2170807363789%26description_url%3DSBS%26correlator%3D--ORD--%26vid%3D2170807363789#__youtubedl_smuggle=%7B%22force_smil_url%22%3A+true%7D
[ThePlatform] gQ6Y3PTQpRro: Downloading SMIL data
[ThePlatform] gQ6Y3PTQpRro: Downloading m3u8 information
WARNING: [ThePlatform] Failed to download m3u8 information: HTTP Error 403: Forbidden
[ThePlatform] gQ6Y3PTQpRro: Downloading JSON metadata
ERROR: [ThePlatform] gQ6Y3PTQpRro: No video formats found!; please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/yt_dlp/YoutubeDL.py", line 1518, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/yt_dlp/YoutubeDL.py", line 1615, in __extract_info
    return self.process_ie_result(ie_result, download, extra_info)
  File "/usr/lib/python3/dist-packages/yt_dlp/YoutubeDL.py", line 1727, in process_ie_result
    return self.process_ie_result(
  File "/usr/lib/python3/dist-packages/yt_dlp/YoutubeDL.py", line 1674, in process_ie_result
    ie_result = self.process_video_result(ie_result, download=download)
  File "/usr/lib/python3/dist-packages/yt_dlp/YoutubeDL.py", line 2615, in process_video_result
    self.raise_no_formats(info_dict)
  File "/usr/lib/python3/dist-packages/yt_dlp/YoutubeDL.py", line 1046, in raise_no_formats
    raise ExtractorError(msg, video_id=info['id'], ie=info['extractor'],
yt_dlp.utils.ExtractorError: [ThePlatform] gQ6Y3PTQpRro: No video formats found!; please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
I'm marking this as patch-available, since Dirkf has PRed https://github.com/ytdl-org/youtube-dl/pull/31880 in youtube-dl. I hope I am using the patch-available label correctly
And for those of us not up to python coding and compiling? :)
Lucky for all of you, me and Dirkf discovered that running yt-dlp "https://www.sbs.com.au/api/v3/video_smil?context=tv&id=VIDEOID"
works (replace VIDEOID with the number at the end of the sbs url)
Ok, I worked it out. Followed the official steps to regenerate the .exe from the source tree (after changing the sbs.py code) Worked! Thanks for the patch
By the way, you didn't need to, you can just open command prompt inside the repo and run yt-dlp.cmd URL
(the .cmd isn't strictly necessary, I don't think)
I tried the tailored url option and it repeatedly failed with several different shows:
yt-dlp -v https://www.sbs.com.au/api/v3/video_smil?context=tv&id=2176228931740
[debug] Command-line config: ['-v', 'https://www.sbs.com.au/api/v3/video_smil?context=tv']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8 (No VT), error utf-8 (No VT), screen utf-8 (No VT)
[debug] yt-dlp version stable@2023.03.04 [392389b7d] (win_exe)
[debug] Python 3.8.10 (CPython AMD64 64bit) - Windows-7-6.1.7601-SP1 (OpenSSL 1.1.1k 25 Mar 2021)
[debug] exe versions: ffmpeg git-2020-06-10-9dfb19b, ffprobe git-2020-06-10-9dfb19b
[debug] Optional libraries: Cryptodome-3.17, brotli-1.0.9, certifi-2022.12.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1786 extractors
[generic] Extracting URL: https://www.sbs.com.au/api/v3/video_smil?context=tv
[generic] video_smil?context=tv: Downloading webpage
ERROR: [generic] None: Unable to download webpage: HTTP Error 400: Bad Request (caused by <HTTPError 400: 'Bad Request'>); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
File "yt_dlp\extractor\common.py", line 694, in extract
File "yt_dlp\extractor\generic.py", line 2371, in _real_extract
File "yt_dlp\extractor\common.py", line 839, in _request_webpage
File "yt_dlp\extractor\common.py", line 821, in _request_webpage
File "yt_dlp\YoutubeDL.py", line 3742, in urlopen
File "urllib\request.py", line 531, in open
File "urllib\request.py", line 640, in http_response
File "urllib\request.py", line 569, in error
File "urllib\request.py", line 502, in _call_chain
File "urllib\request.py", line 649, in http_error_default
urllib.error.HTTPError: HTTP Error 400: Bad Request
'id' is not recognized as an internal or external command, operable program or batch file.
EDIT: Using yt-dlp https://www.sbs.com.au/api/v3/video_smil?id=xxxxxxxx does work. Where xxxxxxxx is the ID.
That URL actually links to a SMIL manifest playable in many media players. Try putting it in your FF or Chrome-alike URL bar.
I tried the tailored url option and it repeatedly failed with several different shows:
Fixed! Sorry about that
I'm marking this as patch-available, since Dirkf has PRed ytdl-org/youtube-dl#31880 in youtube-dl. I hope I am using the patch-available label correctly
Yes, and solved-upstream once it's merged, signifying I can close the issue during the periodic upstream merge
Thanks for the video_smil?id tip. Some powershell semi-pseudo-code :-) to take an SBS url ($url) and use this tip:
$url -match "/(\d+)$" > $null # get the ID of the item
$id = $Matches[1]
$newurl = "https://www.sbs.com.au/api/v3/video_smil?id=" + $id # build new URL
$jdata = yt-dlp --no-warnings -j $newurl | ConvertFrom-Json #get some info on the item
# title NOT in that json, so build our own filename and ext:
$newName = "SBS_" + $id + "_" + $jdata.format_id + "_" + $jdata.resolution + "." + $jdata.ext
# and finally ...
yt-dlp --no-warnings --quiet --progress $newurl -o $newName
Quick and dirty, I'm sure it can be cleaned up.
I tried the tailored url option and it repeatedly failed with several different shows:
I note your fixed comment, but for benefit of other readers not sure what the fix was, it looks like URL must be in quotes, else it's split at the ampersand and you get the Bad Request 'id' is not recognized as an internal or external command, operable program or batch file message.
Vidiot720, just leave out "context=tv&" and it works fine. That is, use: https://www.sbs.com.au/api/v3/video_smil?id=xxxxxxxx where xxxxxxxx is the ID.
I tried it on several shows and it worked fine.
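For anyone scripting this in Python instead, a minimal sketch of the same substitution follows. The helper name smil_url is made up for illustration; the API URL pattern is the one quoted in this thread, and the sketch assumes the video ID is the trailing run of digits in the page URL.

```python
import re

SMIL_API = "https://www.sbs.com.au/api/v3/video_smil?id={}"


def smil_url(page_url):
    """Build the video_smil workaround URL from an SBS On Demand page URL."""
    match = re.search(r"/(\d+)/?$", page_url)  # trailing numeric ID
    if match is None:
        raise ValueError(f"no numeric video ID at the end of {page_url!r}")
    return SMIL_API.format(match.group(1))


print(smil_url("https://www.sbs.com.au/ondemand/watch/1823195203548"))
```

The resulting URL can then be handed to yt-dlp as in the comments above.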
Confirming the id swapperoo works - just change the filename or -o
I tried it on several shows and it worked fine.
Thanks. I thought from the earlier discussion on the context=tv parameter on the youtube-dl issue that it helped with login bypass, and leaves out the ads. I assume you're still getting the highest quality and no ads in the resulting file.
I haven't tested this method directly as I'm running dirkf's patch to sbs.py to test with yt-dlp. It's working OK; there's:
an open question about metadata handling for episodes, and
a call to self._sort_formats(formats) that's deprecated on yt-dlp.
Ask away - oh, I see you did.
a call to self._sort_formats(formats) that's deprecated on yt-dlp.
But necessary in yt-dl ATM. Feel free to delete it for yt-dlp.
Feel free to delete it for yt-dlp.
Thanks, understood; I was really just flagging this for downstream (here).
Leaving out context=tv I still get the highest available quality by default and no adverts. There was a discussion elsewhere last year relating to Australia's other non-commercial streaming TV service, which was introducing login requirements; from the PR material they released, their software would query the viewer's platform to determine which browser and OS you were using. If it could not get an answer (which would be the case for TVs or youtube-dl & its forks) then the requirement for login creds would be bypassed. Since leaving out context=tv works, I assume something similar may be the case here as well. SBS does require you to login to watch in a browser.
Just another SBS quirk - on an SBS forum some have noted that using "https://www.sbs.com.au/api/v3/video_smil?id=xxxxxxxx" the downloaded subtitles are incomplete and/or the timings are noticeably off.
I note your fixed comment, but for benefit of other readers not sure what the fix was, it looks like URL must be in quotes, else it's split at the ampersand and you get the Bad Request 'id' is not recognized as an internal or external command, operable program or batch file message.
Yes. URLs containing an ampersand (&) must always be quoted
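If the command is launched from a script, one way to sidestep the quoting problem is to pass the arguments as a list so that no shell ever splits the URL at the ampersand. This is only a sketch: build_command is a hypothetical helper, and shlex produces POSIX-style quoting (on cmd.exe, use double quotes around the URL instead).

```python
import shlex


def build_command(video_id):
    """Return an argv list; the URL stays a single argument, ampersand and all."""
    url = f"https://www.sbs.com.au/api/v3/video_smil?context=tv&id={video_id}"
    return ["yt-dlp", "-v", url]


cmd = build_command("2176228931740")
print(cmd)              # suitable for subprocess.run(cmd), which uses no shell
print(shlex.join(cmd))  # what to type in a POSIX shell, with the URL quoted
```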
just leave out "context=tv&" and it works fine
I'll look into it more later, but I don't think that's necessarily an improvement https://github.com/ytdl-org/youtube-dl/issues/31841#issuecomment-1474717030
Thanks. I thought from the earlier discussion on the context=tv parameter on the youtube-dl issue that it helped with login bypass
It's useful for bypassing login on the api, not on the actual url that the api gives us (which we've found the pattern for now)
and leaves out the ads
That's correct, although I don't know how necessary it is or whether ads are automatically skipped (see answer two above)
Feel free to delete it for yt-dlp.
Correct me if I'm wrong, but I don't think a PR porting a yt-dl commit would be helpful
Since leaving out context=tv works, I assume something similar may be the case here as well.
Interesting, I'll look into this. Thanks for the info!
Just another SBS quirk - on an SBS forum some have noted that using "https://www.sbs.com.au/api/v3/video_smil?id=xxxxxxxx" the downloaded subtitles are incomplete and/or the timings are noticeably off.
Can you please send a link to that forum, if it's public?
Here's the forum link. It's public to view, but you need to register to post. It's a forum devoted to downloading from SBS.
the downloaded subtitles are incomplete and/or the timings are noticeably off.
I wouldn't be surprised, given the SMIL contains the video in segments, with ads in between. I've never tried to get the subtitles; there's usually a message like:
WARNING: [SBS] Ignoring subtitle tracks found in the SMIL manifest;
and you don't get them either in the downloaded video file, or separately. That's still true with the latest patch in testing.
Although the 'new' API reports higher TBRs than the old, it's still downloading 1.6 Mb/s streams at highest quality for videos posted before the old API stopped working for newer videos. The 1.6 Mb/s was a fixed total bit-rate, whereas the new encodings appear to be VBR averaging from ~960 - ~1290 Kb/s, so perceived quality may be just as good, if adapted to content complexity. EDIT: a fair comparison could be made from looking at Ep 5 of The Walk-in (av. ~920 Kb/s) vs. earlier episodes, or similarly for S 2 Ep 6 of Bloodlands (~1025 Kb/s) vs. earlier eps at 1.6Mb/s as these both straddle the changeover to VBR.
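To put those bit-rates into file-size terms, here is a rough back-of-envelope sketch. It assumes "Kb/s" means kilobits per second, 1 Kb = 1000 bits and 1 MB = 10^6 bytes, and it ignores container overhead; the helper name is made up for illustration.

```python
def megabytes_per_hour(kbps):
    """Approximate stream size for one hour at an average bitrate in kilobits/s."""
    return kbps * 1000 * 3600 / 8 / 1_000_000


old = megabytes_per_hour(1600)  # the fixed 1.6 Mb/s encodings
for label, kbps in [("The Walk-in Ep 5", 920), ("Bloodlands S2 Ep 6", 1025)]:
    new = megabytes_per_hour(kbps)
    print(f"{label}: {new:.0f} MB/h vs {old:.0f} MB/h ({1 - new / old:.0%} smaller)")
```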
Correct me if I'm wrong, but I don't think a PR porting a yt-dl commit would be helpful
I wasn't sure on the process pukkandan referred to as "merging upstream", above. If yt-dlp takes the patch as-is from yt-dl, a further patch is needed to remove the deprecated call to _sort_formats.
FWIW if you run this regex on the links of type "https://www.sbs.com.au/ondemand/tv-series/the-abyss-rise-and-fall-of-the-nazis/season-1/the-abyss-rise-and-fall-of-the-nazis-s1-ep1/2166627395813" ...
^.*?//.*?/.*?/.*?/(.*?)/
...that first Group should be the Title. You can then do a Replace("-", " ") to tidy it up. Still new to Regex but this seems to work for me in PowerShell
$url -match "^.*?//.*?/.*?/.*?/(.*?)/" > $null #don't show True or False trick
$title = $Matches[1].Replace("-", " ")
# the abyss rise and fall of the nazis
Can then use this in building filename for use in -o in yt-dlp. Swap the -match regex with...
/.*/(.*?)/\d+$
...and it will give the abyss rise and fall of the nazis s1 ep1
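The same two patterns carry over to Python unchanged. A minimal sketch, with title_from_url as a made-up helper name; note that it returns None for the short /ondemand/watch/ID style of URL, which has no title segment to capture.

```python
import re

SERIES_RE = re.compile(r"^.*?//.*?/.*?/.*?/(.*?)/")  # series title segment
EPISODE_RE = re.compile(r"/.*/(.*?)/\d+$")            # episode slug before the ID


def title_from_url(url, pattern=SERIES_RE):
    """Return the captured URL segment with hyphens turned into spaces, or None."""
    match = pattern.search(url)
    return match.group(1).replace("-", " ") if match else None


url = ("https://www.sbs.com.au/ondemand/tv-series/the-abyss-rise-and-fall-of-the-nazis"
       "/season-1/the-abyss-rise-and-fall-of-the-nazis-s1-ep1/2166627395813")
print(title_from_url(url))
print(title_from_url(url, EPISODE_RE))
```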
Hey, just noted why some subtitles may not be being downloaded, or at least not the 'manual' way I do it. I normally find them under the json 'key' .subtitles.en but with The Abyss - Rise and Fall of the Nazis ep 5 that key is .subtitles.eng. So looking for .en - which is what it is for e01 thru 04 - gives nothing.
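A script that reads the `yt-dlp -j` output can tolerate both spellings by matching the language key loosely. A minimal sketch follows; english_subtitles is a hypothetical helper, the sample dictionaries are invented, and the set of accepted keys ("en", "eng", "en-...") is an assumption based only on the two variants seen here.

```python
def english_subtitles(info):
    """Return the subtitle entries whose language key looks like English."""
    subtitles = info.get("subtitles") or {}
    return {
        lang: tracks
        for lang, tracks in subtitles.items()
        if lang.lower() in ("en", "eng") or lang.lower().startswith("en-")
    }


# Invented stand-ins for the JSON of two different episodes.
ep4 = {"subtitles": {"en": [{"ext": "srt", "url": "https://example.invalid/ep4.srt"}]}}
ep5 = {"subtitles": {"eng": [{"ext": "srt", "url": "https://example.invalid/ep5.srt"}]}}
print(list(english_subtitles(ep4)), list(english_subtitles(ep5)))
```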
Just another SBS quirk - on an SBS forum some have noted that using "https://www.sbs.com.au/api/v3/video_smil?id=xxxxxxxx" the downloaded subtitles are incomplete and/or the timings are noticeably off.
No one mentioned anything about the timings as far as I can see. As for the subtitle issue, use --sub-format dfxp. I will check now whether that's required for Dirkf's pr, and if yes I will post a message about it.
WARNING: [SBS] Ignoring subtitle tracks found in the SMIL manifest;
and you don't get them either in the downloaded video file, or separately. That's still true with the latest patch in testing.
Even in the https://www.sbs.com.au/api/v3/video_smil?context=tv&id=xxxxxxxx workaround?
Although the 'new' API reports higher TBRs than the old, it's still downloading 1.6 Mb/s streams at highest quality for videos posted before the old API stopped working for newer videos. The 1.6 Mb/s was a fixed total bit-rate, whereas the new encodings appear to be VBR averaging from ~960 - ~1290 Kb/s, so perceived quality may be just as good, if adapted to content complexity. EDIT: a fair comparison could be made from looking at Ep 5 of The Walk-in (av. ~920 Kb/s) vs. earlier episodes, or similarly for S 2 Ep 6 of Bloodlands (~1025 Kb/s) vs. earlier eps at 1.6Mb/s as these both straddle the changeover to VBR.
Is the quality lower than on the website?
I wasn't sure on the process pukkandan referred to as "merging upstream", above. If yt-dlp takes the patch as-is from yt-dl, a further patch is needed to remove the deprecated call to _sort_formats.
Potentially. My reply wasn't directed at you though
Re the timings, Oblong wrote on the SBS forum: "I've noticed another issue: subtitles incomplete for "The Abyss". Ep 8 is ok. Ep 9 subtitles stop at 00:01:14. They are complete when viewing the episode with a browser. Ep 10 also stops short. For Ep 9, video_smil.ENG.srt is downloaded, 1.3 kB. When looking in the file the timings just stop, with no error message..."
I wouldn't be surprised, given the SMIL contains the video in segments, with ads in between. I've never tried to get the subtitles; there's usually a message like: WARNING: [SBS] Ignoring subtitle tracks found in the SMIL manifest; and you don't get them either in the downloaded video file, or separately. That's still true with the latest patch in testing.
What's happening here is that yt-dl doesn't (yet) have the methods that get subtitles while extracting manifests.
In yt-dlp, try changing this line
- formats = self._extract_smil_formats(smil_url, video_id, fatal=False) or []
+ formats, subtitles = self._extract_smil_formats_and_subtitles(smil_url, video_id, fatal=False) or ([], {})
and adding 'subtitles': subtitles, to the return value.
In yt-dlp, try changing this line...
Thanks, dirkf; can confirm that with these changes and --sub-format "srt" --write-subs added, the subs are downloaded and play back in sync OK in VLC for the whole program; tested with The Walk-In Ep 5. Trying bnw42's example, I only receive 1.31 KiB in the subs file; I suspect this is an issue at SBS's end, rather than an issue with this workaround via SMIL.
Even in the https://www.sbs.com.au/api/v3/video_smil?context=tv&id=xxxxxxxx workaround?
Haven't tried this workaround; having applied the upstream patch successfully, I've been testing that.
Is the quality lower than on the website?
Sorry, can't speak to that; I can't use the website, hence being a yt-dlp user.
Potentially. My reply wasn't directed at you though
Still not clear on what it all meant; not to worry, will leave this question on adapting(?) dirkf's PR to yt-dlp to the experts. Just flagging additional changes for yt-dlp downstream as dirkf advised to:
"I've noticed another issue: subtitles incomplete for "The Abyss". Trying bnw42's example, only receive 1.31 KiB in subs file
use --sub-format dfxp
Still not clear on what it all meant
Nothing, I was just telling people not to open a PR porting the extractor from youtube-dl
Just flagging additional changes for yt-dlp downstream as dirkf advised to:
yeah, based on that I guess the merge checklist is:
use --sub-format dfxp
OK, this appears to be an issue at SBS's end, where the presence in a subtitle of ¾ interrupts the SRT generation, but does not affect the dfxp. SBS don't have a great track record handling special characters and mojibake issues are frequent, so it doesn't surprise me.
Sorry, it was not clear earlier that using dfxp was being suggested as the work-around for this issue in particular, say as opposed to a known issue with yt-dl/yt-dlp's handling of sub formats. --convert-subs doesn't seem to be in the readme for either project.
I guess the merge checklist
Not to forget the metadata issues, at least to put back the previous behaviour for setting title.
SBS don't have a great track record handling special characters
Which subtitle format is more reliable? I assume dfxp, since it's used by the website (which proxies it through a subtitle conversion service that seems to be self-hosted by sbs)
Not to forget the metadata issues, at least to put back the previous behaviour for setting title.
Sorry, I should have been more clear. My checklist was for porting the pr to yt-dlp. Is the metadata issue going to be fixed in youtube-dl?
Which subtitle format is more reliable? I assume dfxp,
Based on the example so far, unclear. Looking again at the example cited by bnw42, the XML encoding is reported as 'UTF-16', which the SubtitlesConverter errors out on with the message:
xml.etree.ElementTree.ParseError: encoding specified in XML declaration is incorrect: line 1, column 30
For what it's worth, the downloaded file appears to be UTF-8, but not sure if that's unmolested by the downloader. As downloaded, VLC won't load the dfxp with "Add Subtitle...", whether or not the coding is corrected in the downloaded file, either by changing the encoding report in the file to UTF-8, or converting the file to UTF-16. EDIT: OK, you have to change the file extension to .ttml for VLC to recognise it, as well as doing the encoding correction or conversion. It then kind of works, although the line formatting in VLC is poor.
If yt-dl/dlp can handle the conversion, it might be nice to set dfxp as the default to work-around SBS's poor conversion service, if that is the issue. Not sure if the conversion to SRT can be handled if requested via --sub-format "srt" implicitly (i.e. behave as if --sub-format "dfxp" --convert-subs "srt" was used), particularly if there's an issue with the dfxp anyway, which is maybe what is tripping up SBS's own conversion service.
My checklist was for porting the pr to yt-dlp. Is the metadata issue going to be fixed in youtube-dl?
Sorry, yes. I hope this will be addressed in yt-dl.
there's an issue with the dfxp anyway
OK, verified that --sub-format "dfxp" --convert-subs "srt" will work without error, but only with a kludge introduced into line 4096 of utils.py,
dfxp_data = dfxp_data.replace(b'encoding=\'UTF-16\'', b'encoding=\'UTF-8\'')
Solving the original encoding problem for downloaded dfxp subs for SBS is probably a topic for another issue, so won't add more about it here.
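For experimenting outside utils.py, a slightly more cautious variant of the kludge might look like the sketch below. It is an illustration only, not proposed yt-dlp code: fix_declared_encoding is a made-up name, the sample document is invented, and it assumes the payload really is UTF-8 whenever no UTF-16 byte-order mark is present.

```python
import re
import xml.etree.ElementTree as ET

# Match only the encoding attribute inside the XML declaration at the very start.
_DECL = re.compile(rb"""^(<\?xml[^>]*?encoding=)(['"])UTF-16\2""", re.IGNORECASE)


def fix_declared_encoding(dfxp_data):
    """Rewrite a misleading UTF-16 declaration to UTF-8 unless a UTF-16 BOM is present."""
    if dfxp_data[:2] in (b"\xff\xfe", b"\xfe\xff"):
        return dfxp_data  # looks like genuine UTF-16, leave it alone
    return _DECL.sub(rb"\g<1>\g<2>UTF-8\g<2>", dfxp_data, count=1)


# Invented sample: UTF-8 bytes that claim to be UTF-16, including a special character.
sample = "<?xml version='1.0' encoding='UTF-16'?><tt><p>¾ cup</p></tt>".encode("utf-8")
root = ET.fromstring(fix_declared_encoding(sample))
print(root[0].text)
```

Unlike a blanket bytes replace, this touches only the declaration and leaves any later occurrence of the string in the subtitle text untouched.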
I am encountering the same issues with SBS On Demand today https://www.sbs.com.au/ondemand/watch/1837973059920
NicGeoLaw, Please read the above posts. The issue has been identified and a fix will hopefully be appearing in a future update. Until then use the recommended workaround, substituting the video ID into the following (replacing xxxxxxxx). The video ID is the number at the end of the webpage url for the show you want.
yt-dlp https://www.sbs.com.au/api/v3/video_smil?id=xxxxxxxx
As of Monday evening this was still working.
Just to add to what @bnw42 said above, this may help. I'm not a developer but do dabble in PowerShell etc:
# url = original one, eg $url = "https://www.sbs.com.au/ondemand/watch/2172633667829"
# grab the ID
$url -match "/(\d+)$" > $null
$id = $Matches[1]
# build new url and use that
$newurl = "https://www.sbs.com.au/api/v3/video_smil?id=" + $id
yt-dlp $newurl
Works in .ps1 script too
PR #6839 now in review, vastly improved due to @bashonly, particularly for geo bypass handling, and with some extra improvements to metadata handling for episode and is_live, as well as plenty of coding style tweaks. Has been tested here, but not as much as the earlier PR from @dirkf got. Big thanks to the reviewers for putting up with the python and yt-dlp noobity on my end!
Region
Australia
Provide a description that is worded well enough to be understood
All attempts to download videos from the SBS Ondemand website are now failing with similar errors. A bug report has been posted to the youtube-dl site and the conclusion there was that SBS has "made updates to their streaming platform video obfuscation."
Downloading with yt-dlp does work using the master manifest obtained using a browser addon video stream detector whilst playing the video in a browser.
Note SBS is geoblocked to Australia.