antdude closed this issue 3 years ago
I just used a Firefox add-on called The Stream Detector to grab the master m3u8 file and successfully downloaded this video: https://www.bbc.com/reel/video/p08yxrlb/why-our-dreams-could-be-the-key-to-time-travel
youtube-dl has been broken for bbc.com and bbc.co.uk videos since at least v2019.11.28, i.e. back in Nov 2019. (Yeah, I was taking notes for every version until 2020 Q1, when I gave up hoping it would be fixed.)
It was also broken for some audio at bbc.co.uk/sounds, but the latest v2020.11.21.1 now seems to work okay for that domain, although I haven't tested every URL.
For BBC Reel (but not non-Reel) videos, one could previously work around the "no suitable InfoExtractor" error by specifying the Programme ID (PID) instead, at least until sometime in early 2020 (it still worked in Jan/Feb 2020).
E.g. for https://www.bbc.com/reel/video/p08yxrlb/why-our-dreams-could-be-the-key-to-time-travel the PID is p08yxrlb, and
youtube-dl https://www.bbc.co.uk/programmes/p08yxrlb
would have been able to fetch the video (back in Jan/Feb 2020 and earlier). However, with v2020.11.21.1, it now shows "ERROR: No video formats found".
I also tried
youtube-dl https://www.bbc.com/programmes/p08yxrlb
but it shows "ERROR: no suitable InfoExtractor for URL https://www.bbc.co.uk/programmes/None".
As such, the latest youtube-dl is totally broken for all BBC videos, unless perhaps one resorts to using 3rd-party manual extraction methods.
@hairycactus :
As you say, for
https://www.bbc.com/reel/video/p08yxrlb/why-our-dreams-could-be-the-key-to-time-travel
you'd have to manipulate it to
https://www.bbc.co.uk/programmes/p08yxrlb
for the bbc.co.uk InfoExtractor (IE) to recognise it...
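For anyone who wants to script that manipulation, here is a minimal sketch. It assumes Reel URLs always have the shape https://www.bbc.com/reel/video/<pid>/<slug>, as in the example above; the function name and the regular expression are mine, not anything taken from youtube-dl.

```python
import re

# Assumption: the PID is the path segment right after /reel/video/
# and consists of lowercase letters and digits only.
_REEL_URL = re.compile(
    r"^https?://(?:www\.)?bbc\.com/reel/video/(?P<pid>[0-9a-z]+)(?:[/?#]|$)"
)


def reel_to_programmes_url(url):
    """Return the bbc.co.uk/programmes URL for a BBC Reel URL, or None."""
    match = _REEL_URL.match(url)
    if match is None:
        return None
    return "https://www.bbc.co.uk/programmes/" + match.group("pid")
```

Feeding it the Reel URL above gives back the bbc.co.uk/programmes/p08yxrlb form that the bbc.co.uk IE recognises; anything that is not a Reel URL yields None.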
For pid=p08yxrlb (included in the clip's URI), yt-dl correctly retrieves vpid=p08yxrld, as can be seen by
youtube-dl -F "https://www.bbc.co.uk/programmes/p08yxrlb" -v
=>
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-F', 'https://www.bbc.co.uk/programmes/p08yxrlb', '-v']
[debug] Encodings: locale cp1253, fs mbcs, out cp737, pref cp1253
[debug] youtube-dl version 2020.11.24
[debug] Python version 3.4.4 (CPython) - Windows-Vista-6.0.6003-SP2
[debug] exe versions: ffmpeg N-97309-g4e0cf81b49, ffprobe N-97309-g4e0cf81b49, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[bbc.co.uk] p08yxrlb: Downloading video page
[bbc.co.uk] p08yxrld: Downloading media selection XML
ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
<redacted>
However, as instructed by the code referenced above, that vpid string is only tried with the first mediaselector URI, the one with mediaset=iptv-all:
https://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/p08yxrld
which doesn't yield any media stream info (only subs/captions info) 😠. The yt-dl bug in this case is that the vpid string isn't then tried with the second mediaselector URI (mediaset=pc), which is actually the one that does return media stream info:
https://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/p08yxrld
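The behaviour I would expect instead can be sketched as below. This is not the code in bbc.py, only an illustration of "try each mediaset until one returns streams"; `fetch_formats` is a hypothetical callable (URL in, list of formats out) standing in for the download-and-parse step, and the function names are made up.

```python
# URL pattern taken from the two mediaselector URIs quoted above.
MEDIASELECTOR_URL = (
    "https://open.live.bbc.co.uk/mediaselector/5/select/version/2.0"
    "/mediaset/%s/vpid/%s"
)
MEDIA_SETS = ("iptv-all", "pc")


def mediaselector_urls(vpid):
    """Build one mediaselector URL per mediaset, in the order they are tried."""
    return [MEDIASELECTOR_URL % (media_set, vpid) for media_set in MEDIA_SETS]


def first_media_formats(vpid, fetch_formats):
    """Return (url, formats) for the first mediaset that yields any formats.

    fetch_formats is a hypothetical helper: it takes a URL and returns a
    list of media formats (an empty list when only captions come back).
    """
    for url in mediaselector_urls(vpid):
        formats = fetch_formats(url)
        if formats:
            return url, formats
    return None, []
```

With vpid p08yxrld, a fetcher that gets nothing useful from iptv-all would simply fall through to the pc URL rather than giving up after the first attempt.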
But BBC Reel video clips are edge cases for the bbcIE: they are (usually) globally available (non-geofenced) and served from the bbc.com domain, which the bbcIE does not officially support. The bbcIE focuses mainly on video content from BBC iPlayer (geofenced) and audio content from BBC Sounds (partly geofenced; overseas locations are served lower bitrates), not random bbc.co* clips...
Workaround: Unfortunately, I don't "speak" Python, so I cannot offer a PR to fix this... Should you wish to fetch the above BBC Reel video, you could comment out line 56 of the provided code snippet inside bbc.py
# 'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',
recompile yt-dl (or invoke directly from source) and issue:
youtube-dl "https://www.bbc.co.uk/programmes/p08yxrlb"
=>
[bbc.co.uk] p08yxrlb: Downloading video page
[bbc.co.uk] p08yxrld: Downloading media selection XML
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[dashsegments] Total fragments: 103
[download] Destination: BBC - Could your dreams predict the future-p08yxrld.fstream-nonuk-pc_streaming_concrete_combined_sd_mf_limelight_world_dash_https-video=5070000.mp4
[download] 11.1% of ~198.10MiB at 816.44KiB/s ETA 04:08
Another workaround would be to move away completely from the deprecated mediaselector/5 API to the current mediaselector/6 one. However, v6 produces JSON-formatted content by default, while the existing parser inside bbc.py expects XML; you can still request a compatible XML-formatted response by appending /format/xml:
- 'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',
- 'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s',
+ 'http://open.live.bbc.co.uk/mediaselector/6/select/version/2.0/mediaset/iptv-all/vpid/%s/format/xml',
+ 'http://open.live.bbc.co.uk/mediaselector/6/select/version/2.0/mediaset/pc/vpid/%s/format/xml',
[bbc.co.uk] p08yxrlb: Downloading video page
[bbc.co.uk] p08yxrld: Downloading media selection XML
[bbc.co.uk] p08yxrld: Downloading m3u8 information
[bbc.co.uk] p08yxrld: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 403: Forbidden
[bbc.co.uk] p08yxrld: Downloading m3u8 information
[bbc.co.uk] p08yxrld: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 403: Forbidden
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[bbc.co.uk] p08yxrld: Downloading m3u8 information
[bbc.co.uk] p08yxrld: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 403: Forbidden
[bbc.co.uk] p08yxrld: Downloading m3u8 information
[bbc.co.uk] p08yxrld: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 403: Forbidden
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[dashsegments] Total fragments: 103
[download] Destination: BBC - Could your dreams predict the future-p08yxrld.f_deprecated__mf_limelight-video=5070000-1.mp4
[download] 2.2% of ~171.52MiB at 960.82KiB/s ETA 04:04
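If you would rather not hand-edit each URL template, the v5 to v6 change can be expressed as a small rewrite. This is only a sketch: the helper name is hypothetical, and the /mediaselector/6/ path and /format/xml suffix are taken from the edit shown above, not from any BBC documentation.

```python
def to_v6_xml(url_template):
    """Rewrite a mediaselector/5 URL (or %s template) to mediaselector/6 with XML output."""
    # Swap only the API version segment; everything else is left untouched.
    url = url_template.replace("/mediaselector/5/", "/mediaselector/6/", 1)
    # v6 answers in JSON by default, so ask for XML explicitly (once).
    if not url.endswith("/format/xml"):
        url = url.rstrip("/") + "/format/xml"
    return url
```

Applied to the two v5 templates, this produces exactly the two "+" lines of the change above, and running it on an already converted URL leaves it unchanged.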
This has been fixed for AGES (almost a year now) by my pull request, which the youtube-dl maintenance team is refusing to merge: https://github.com/ytdl-org/youtube-dl/pull/23415
Question
Am I reading correctly that the updated youtube-dl still can't download videos from bbc.com's website? https://github.com/ytdl-org/youtube-dl/issues?q=is%3Aissue+is%3Aopen+bbc.com shows https://github.com/ytdl-org/youtube-dl/issues/23232, but my results seem to be different, as shown below:
$ youtube-dl https://www.bbc.com/reel/video/p08yxrlb/why-our-dreams-could-be-the-key-to-time-travel
[bbc] why-our-dreams-could-be-the-key-to-time-travel: Downloading webpage
ERROR: no suitable InfoExtractor for URL https://www.bbc.co.uk/programmes/None
Or is this a different issue that I need to report as a new bug?
Thank you for reading and hopefully answering soon. :)