ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

BBC weather #27780

Closed SKWDiesel1 closed 1 year ago

SKWDiesel1 commented 3 years ago

Verbose log

youtube-dl --verbose https://www.bbc.co.uk/weather/features/55581056
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.bbc.co.uk/weather/features/55581056']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.01.08
[debug] Python version 3.9.1 (CPython) - Linux-5.4.88-1-lts-x86_64-with-glibc2.32
[debug] exe versions: ffmpeg 4.3.1, ffprobe 4.3.1, rtmpdump 2.4
[debug] Proxy map: {}
[bbc] 55581056: Downloading webpage
ERROR: Unable to extract playlist data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 803, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 824, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/lib/python3.9/site-packages/youtube_dl/extractor/common.py", line 532, in extract
    ie_result = self._real_extract(url)
  File "/usr/lib/python3.9/site-packages/youtube_dl/extractor/bbc.py", line 1174, in _real_extract
    self._search_regex(
  File "/usr/lib/python3.9/site-packages/youtube_dl/extractor/common.py", line 1010, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract playlist data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Description

This error is reported when trying to extract the video report from the page: https://www.bbc.co.uk/weather/features/5558105

georgeahill commented 3 years ago

Hi! Thanks for creating an issue. I've just checked, and I get this issue too. I think this might be a duplicate of #14168 which hasn't had a response as of yet - it looks like either a bug in the BBC extractor, a lack of proper support for BBC Weather, or some change in BBC's delivery system. This may be fixed by #23415 which hasn't been merged yet.

Vangelis66 commented 3 years ago

Manual workaround for fetching to disk the "Weather for the Week Ahead" clip found on:

https://www.bbc.com/weather/features/55581056

  1. Use your browser to inspect Page Source
  2. Search for data-parent-pid string
  3. Note down its value, which, in this case, is p093xhx6
  4. Manually reformulate the original URI to https://www.bbc.co.uk/programmes/p093xhx6
  5. Feed yt-dl the above URI
  6. Profit:

youtube-dl -F "https://www.bbc.co.uk/programmes/p093xhx6" =>

[bbc.co.uk] p093xhx6: Downloading video page
[bbc.co.uk] p093xhxl: Downloading media selection JSON
[bbc.co.uk] p093xhxl: Downloading m3u8 information
[bbc.co.uk] p093xhxl: Downloading m3u8 information
[bbc.co.uk] p093xhxl: Downloading MPD manifest
[bbc.co.uk] p093xhxl: Downloading MPD manifest
[bbc.co.uk] p093xhxl: Downloading m3u8 information
[bbc.co.uk] p093xhxl: Downloading m3u8 information
[bbc.co.uk] p093xhxl: Downloading MPD manifest
[bbc.co.uk] p093xhxl: Downloading MPD manifest
[info] Available formats for p093xhxl:
format code                        extension  resolution note
mf_akamai-audio_eng_1=128000-0     m4a        audio only [en] DASH audio  128k , m4a_dash container, mp4a.40.2 (48000Hz)
mf_akamai-audio_eng_1=128000-1     m4a        audio only [en] DASH audio  128k , m4a_dash container, mp4a.40.2 (48000Hz)
mf_limelight-audio_eng_1=128000-0  m4a        audio only [en] DASH audio  128k , m4a_dash container, mp4a.40.2 (48000Hz)
mf_limelight-audio_eng_1=128000-1  m4a        audio only [en] DASH audio  128k , m4a_dash container, mp4a.40.2 (48000Hz)
mf_akamai-video=827000-0           mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_akamai-video=827000-1           mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_limelight-video=827000-0        mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_limelight-video=827000-1        mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_akamai-video=1570000-0          mp4        704x396    DASH video 1570k , mp4_dash container, avc3.64001F, 50fps, video only
mf_akamai-video=1570000-1          mp4        704x396    DASH video 1570k , mp4_dash container, avc3.64001F, 50fps, video only
mf_limelight-video=1570000-0       mp4        704x396    DASH video 1570k , mp4_dash container, avc3.64001F, 50fps, video only
mf_limelight-video=1570000-1       mp4        704x396    DASH video 1570k , mp4_dash container, avc3.64001F, 50fps, video only
mf_akamai-video=2812000-0          mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_akamai-video=2812000-1          mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_limelight-video=2812000-0       mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_limelight-video=2812000-1       mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_akamai-video=5070000-0          mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_akamai-video=5070000-1          mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_limelight-video=5070000-0       mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_limelight-video=5070000-1       mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_akamai-1013-0                   mp4        704x396    1013k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.2@128k
mf_akamai-1013-1                   mp4        704x396    1013k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.2@128k
mf_limelight-1013-0                mp4        704x396    1013k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.2@128k
mf_limelight-1013-1                mp4        704x396    1013k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.2@128k
mf_akamai-1800-0                   mp4        704x396    1800k , avc1.64001F@1570k, 50.0fps, mp4a.40.2@128k
mf_akamai-1800-1                   mp4        704x396    1800k , avc1.64001F@1570k, 50.0fps, mp4a.40.2@128k
mf_limelight-1800-0                mp4        704x396    1800k , avc1.64001F@1570k, 50.0fps, mp4a.40.2@128k
mf_limelight-1800-1                mp4        704x396    1800k , avc1.64001F@1570k, 50.0fps, mp4a.40.2@128k
mf_akamai-3117-0                   mp4        960x540    3117k , avc1.64001F@2812k, 50.0fps, mp4a.40.2@128k
mf_akamai-3117-1                   mp4        960x540    3117k , avc1.64001F@2812k, 50.0fps, mp4a.40.2@128k
mf_limelight-3117-0                mp4        960x540    3117k , avc1.64001F@2812k, 50.0fps, mp4a.40.2@128k
mf_limelight-3117-1                mp4        960x540    3117k , avc1.64001F@2812k, 50.0fps, mp4a.40.2@128k
mf_akamai-5510-0                   mp4        1280x720   5510k , avc1.640020@5070k, 50.0fps, mp4a.40.2@128k
mf_akamai-5510-1                   mp4        1280x720   5510k , avc1.640020@5070k, 50.0fps, mp4a.40.2@128k
mf_limelight-5510-0                mp4        1280x720   5510k , avc1.640020@5070k, 50.0fps, mp4a.40.2@128k
mf_limelight-5510-1                mp4        1280x720   5510k , avc1.640020@5070k, 50.0fps, mp4a.40.2@128k (best)
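
Steps 1 to 4 of the workaround above can be sketched in Python. This is only an illustration, not part of youtube-dl: the function name and the sample page fragment are hypothetical, and the regex assumes the attribute is written exactly as data-parent-pid="..." with double quotes.

```python
import re

# Assumption: the attribute appears as data-parent-pid="<pid>" (double quotes),
# which is how it is quoted in this thread.
PARENT_PID_RE = re.compile(r'data-parent-pid="([0-9a-z]+)"')


def programmes_url_from_page(html):
    """Return the /programmes/ URL for the first data-parent-pid found, or None."""
    match = PARENT_PID_RE.search(html)
    if match is None:
        return None
    return "https://www.bbc.co.uk/programmes/" + match.group(1)


# Hypothetical page fragment, for illustration only.
sample = '<div class="player" data-parent-pid="p093xhx6" data-vpid="p093xhxl"></div>'
print(programmes_url_from_page(sample))
```

The resulting URL is what would be passed to youtube-dl in step 5.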

SKWDiesel1 commented 3 years ago

I have found that the data-parent-pid is shown if you right click on the video. This saves having to inspect the page source.

Thanks for the feedback and work around.

Vangelis66 commented 3 years ago

@SKWDiesel1 last wrote:

I have found that the data-parent-pid is shown if you right click on the video. This saves having to inspect the page source.

Strictly speaking (and being pedantic), what you write is NOT exact... :-1: The clip's PID is only recoverable by inspecting Page Source and, as already posted, is:

data-parent-pid="p093xhx6"

The embedded player's HTML5 context menu (which involves first starting video playback, something not always wanted...) displays the VersionPID (aka vpid - not easily copied from there) of the clip, also found inside Page Source in two instances as:

data-vpid="p093xhxl"
(redacted)
"versionPid":"p093xhxl"

But, pid(=p093xhx6) != vpid(=p093xhxl)

yt-dl must be fed the PID string (this advice applies BBC-wide), but, luckily for you, bbc.co.uk silently auto-redirects from a vpid URI to a pid URI:

https://www.bbc.co.uk/programmes/p093xhxl =>

https://www.bbc.co.uk/programmes/p093xhx6

(you can check/verify the redirection in your browser...) Thus, youtube-dl "https://www.bbc.co.uk/programmes/p093xhxl" simply just works, too... :stuck_out_tongue_winking_eye:
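
To make the pid/vpid distinction concrete, here is a small Python sketch that collects both kinds of identifier from a page fragment. The function name and the sample markup are hypothetical; the three patterns simply mirror the attribute and JSON forms quoted above and may not match every BBC page.

```python
import re


def collect_ids(html):
    """Collect the pid and every distinct vpid mentioned in a page fragment."""
    pid = re.search(r'data-parent-pid="([0-9a-z]+)"', html)
    # The vpid shows up in two forms: an HTML attribute and a JSON property.
    vpids = re.findall(r'data-vpid="([0-9a-z]+)"', html)
    vpids += re.findall(r'"versionPid"\s*:\s*"([0-9a-z]+)"', html)
    return {
        "pid": pid.group(1) if pid else None,
        "vpids": sorted(set(vpids)),
    }


# Hypothetical fragment combining the forms quoted in this thread.
sample = (
    '<div data-parent-pid="p093xhx6" data-vpid="p093xhxl"></div>'
    '<script>var cfg = {"versionPid":"p093xhxl"};</script>'
)
ids = collect_ids(sample)
print(ids)
```

For the sample, the pid and the single vpid differ only in their last character, which is easy to miss when copying by hand.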

SKWDiesel1 commented 3 years ago

I like pedantic as it usually means correct! It is fortunate for me/us that the redirection works!

Thanks again.

dirkf commented 3 years ago

This weather page stashes the page details as JSON in a JS call to Morph.setPayload(), as seems to be typical in non-iPlayer pages. While this pattern is found by the extractor, the current logic may not find the correct instance of the pattern and doesn't capture the required data from the JSON as currently served.

See PR #28577
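
As a rough illustration of the problem (not the code from the PR), the sketch below walks every Morph.setPayload() call in a page rather than stopping at the first one, decodes the JSON argument, and searches it for a key. The helper names, the sample markup, the payload paths, and the assumption that the wanted payload contains a "versionPid" key are all hypothetical.

```python
import json
import re


def iter_morph_payloads(html):
    """Yield the JSON object passed to each Morph.setPayload(...) call."""
    decoder = json.JSONDecoder()
    for call in re.finditer(r'Morph\.setPayload\(', html):
        # Assumption: the first "{" after the call opens the JSON argument.
        start = html.find('{', call.end())
        if start == -1:
            continue
        try:
            payload, _ = decoder.raw_decode(html, start)
        except ValueError:
            continue  # not valid JSON at this position; skip this call
        yield payload


def find_key(node, key):
    """Depth-first search for the first value stored under `key`."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


# Hypothetical page with two payloads; only the second holds media data.
sample = """
<script>Morph.setPayload('/data/example-nav', {"body": {"links": []}});</script>
<script>Morph.setPayload('/data/example-media', {"body": {"media": {"versionPid": "p093xhxl"}}});</script>
"""
vpids = [find_key(p, "versionPid") for p in iter_morph_payloads(sample)]
print([v for v in vpids if v])
```

Using json.JSONDecoder.raw_decode avoids having to guess where the JSON object ends with a regex, which is one way a pattern can latch onto the wrong instance.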

Vangelis66 commented 1 year ago

... FWIW, the BBC weather clip present in the log of the OP can now be fetched with the git-master version of youtube-dl (or, even, with the "overhauled" bbcIE found here 😜 ):

yt-dl -vF "https://www.bbc.com/weather/features/55581056" => 

[bbc] 55581056: Downloading webpage
[bbc] 55581056: Extracting from __INITIAL_DATA__
[bbc] p093xhxl: Downloading media selection JSON
[bbc] p093xhxl: Downloading m3u8 information
[bbc] p093xhxl: Downloading m3u8 information
[bbc] p093xhxl: Downloading m3u8 information
[bbc] p093xhxl: Downloading m3u8 information
[bbc] p093xhxl: Downloading MPD manifest
[bbc] p093xhxl: Downloading MPD manifest
[bbc] p093xhxl: Downloading MPD manifest
[bbc] p093xhxl: Downloading MPD manifest
[bbc] p093xhxl: Downloading media selection JSON
[download] Downloading playlist: Weather for the Week Ahead
[bbc] playlist Weather for the Week Ahead: Collected 1 video ids (downloading 1 of them)
[download] Downloading video 1 of 1
[info] Available formats for p093xhxl:
format code                      extension  resolution note
mf_akamai-audio_eng=96000-0      m4a        audio only [en] DASH audio   96k , m4a_dash container, mp4a.40.5 (48000Hz)
mf_akamai-audio_eng=96000-1      m4a        audio only [en] DASH audio   96k , m4a_dash container, mp4a.40.5 (48000Hz)
mf_cloudfront-audio_eng=96000-0  m4a        audio only [en] DASH audio   96k , m4a_dash container, mp4a.40.5 (48000Hz)
mf_cloudfront-audio_eng=96000-1  m4a        audio only [en] DASH audio   96k , m4a_dash container, mp4a.40.5 (48000Hz)
mf_akamai-video=281000-0         mp4        384x216    DASH video  281k , mp4_dash container, avc3.42C015, 25fps, video only
mf_akamai-video=281000-1         mp4        384x216    DASH video  281k , mp4_dash container, avc3.42C015, 25fps, video only
mf_cloudfront-video=281000-0     mp4        384x216    DASH video  281k , mp4_dash container, avc3.42C015, 25fps, video only
mf_cloudfront-video=281000-1     mp4        384x216    DASH video  281k , mp4_dash container, avc3.42C015, 25fps, video only
mf_akamai-video=437000-0         mp4        512x288    DASH video  437k , mp4_dash container, avc3.4D4015, 25fps, video only
mf_akamai-video=437000-1         mp4        512x288    DASH video  437k , mp4_dash container, avc3.4D4015, 25fps, video only
mf_cloudfront-video=437000-0     mp4        512x288    DASH video  437k , mp4_dash container, avc3.4D4015, 25fps, video only
mf_cloudfront-video=437000-1     mp4        512x288    DASH video  437k , mp4_dash container, avc3.4D4015, 25fps, video only
mf_akamai-video=827000-0         mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_akamai-video=827000-1         mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_cloudfront-video=827000-0     mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_cloudfront-video=827000-1     mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_akamai-video=1604000-0        mp4        960x540    DASH video 1604k , mp4_dash container, avc3.64001F, 25fps, video only
mf_akamai-video=1604000-1        mp4        960x540    DASH video 1604k , mp4_dash container, avc3.64001F, 25fps, video only
mf_cloudfront-video=1604000-0    mp4        960x540    DASH video 1604k , mp4_dash container, avc3.64001F, 25fps, video only
mf_cloudfront-video=1604000-1    mp4        960x540    DASH video 1604k , mp4_dash container, avc3.64001F, 25fps, video only
mf_akamai-video=2812000-0        mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_akamai-video=2812000-1        mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_cloudfront-video=2812000-0    mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_cloudfront-video=2812000-1    mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_akamai-video=5070000-0        mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_akamai-video=5070000-1        mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_cloudfront-video=5070000-0    mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_cloudfront-video=5070000-1    mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_akamai-400-0                  mp4        384x216     400k , avc1.42C015@ 281k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-400-1                  mp4        384x216     400k , avc1.42C015@ 281k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-400-0              mp4        384x216     400k , avc1.42C015@ 281k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-400-1              mp4        384x216     400k , avc1.42C015@ 281k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-565-0                  mp4        512x288     565k , avc1.4D4015@ 437k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-565-1                  mp4        512x288     565k , avc1.4D4015@ 437k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-565-0              mp4        512x288     565k , avc1.4D4015@ 437k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-565-1              mp4        512x288     565k , avc1.4D4015@ 437k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-979-0                  mp4        704x396     979k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-979-1                  mp4        704x396     979k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-979-0              mp4        704x396     979k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-979-1              mp4        704x396     979k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-1802-0                 mp4        960x540    1802k , avc1.64001F@1604k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-1802-1                 mp4        960x540    1802k , avc1.64001F@1604k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-1802-0             mp4        960x540    1802k , avc1.64001F@1604k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-1802-1             mp4        960x540    1802k , avc1.64001F@1604k, 25.0fps, mp4a.40.5@ 96k (best)
[download] Finished downloading playlist: Weather for the Week Ahead

NB: For some peculiar (?) reason, the (720|540)p50 encodes are being offered solely via DASH; HLS only goes as high as 540p25 (tested from an overseas location, BTW)...

EDIT: (720|540)p50 encodes over HLS will appear when issuing

yt-dl -vF "https://www.bbc.co.uk/programmes/p093xhx6"

instead of

yt-dl -vF "https://www.bbc.com/weather/features/55581056"

Also, if you compare today's log with the one from more than two years ago, it seems the Beeb have ditched the Limelight CDNs for the Cloudfront ones...
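
The CDN change can be checked mechanically by pulling the CDN name out of the format codes in the two listings. A minimal Python sketch, assuming every code follows the mf_<cdn>-... shape seen above; the function name is made up and the lists hold only a few codes copied from each log.

```python
import re

# Assumption: format codes look like "mf_<cdn>-...", as in the listings above.
CDN_RE = re.compile(r'^mf_([a-z]+)-')


def cdns(format_codes):
    """Return the set of CDN names embedded in the given format codes."""
    found = set()
    for code in format_codes:
        match = CDN_RE.match(code)
        if match:
            found.add(match.group(1))
    return found


# A few codes from the older and the newer listing in this thread.
old_codes = ["mf_akamai-video=827000-0", "mf_limelight-video=827000-0", "mf_limelight-5510-1"]
new_codes = ["mf_akamai-video=827000-0", "mf_cloudfront-video=827000-0", "mf_cloudfront-1802-1"]

print(sorted(cdns(old_codes) - cdns(new_codes)))  # CDNs no longer offered
print(sorted(cdns(new_codes) - cdns(old_codes)))  # CDNs newly offered
```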