ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

[rtve.es:alacarta] Base URL changed while renaming service #29522

Open fcaneva-arg opened 3 years ago

fcaneva-arg commented 3 years ago

Verbose log

$ youtube-dl -v 'https://www.rtve.es/play/videos/saber-y-ganar/edicion-fin-semana-11-07-21/5977613/'
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', 'https://www.rtve.es/play/videos/saber-y-ganar/edicion-fin-semana-11-07-21/5977613/']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.06.06
[debug] Python version 3.8.10 (CPython) - Linux-5.4.0-66-generic-x86_64-with-glibc2.29
[debug] exe versions: ffmpeg 4.2.4, ffprobe 4.2.4, rtmpdump 2.4
[debug] Proxy map: {}
[generic] 5977613: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 5977613: Downloading webpage
[generic] 5977613: Extracting information
ERROR: Unsupported URL: https://www.rtve.es/play/videos/saber-y-ganar/edicion-fin-semana-11-07-21/5977613/
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/youtube_dl/YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/youtube_dl/YoutubeDL.py", line 836, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/local/lib/python3.8/dist-packages/youtube_dl/extractor/common.py", line 534, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/lib/python3.8/dist-packages/youtube_dl/extractor/generic.py", line 3520, in _real_extract
    raise UnsupportedError(url)
youtube_dl.utils.UnsupportedError: Unsupported URL: https://www.rtve.es/play/videos/saber-y-ganar/edicion-fin-semana-11-07-21/5977613/

Description

Radio Televisión Española (RTVE, the Spanish radio and television corporation) is renaming its on-demand site from "A la carta" (On demand) to "RTVE Play" and changing the URLs used to access content. The change is small: the path segment alacarta becomes play. As a result, the current extractor no longer matches the page, and youtube-dl falls back to the generic extractor, which also fails.

Old URLs had the following structure (for example, the latest episode of "Saber y Ganar" as of today, July 11, 2021):
https://www.rtve.es/alacarta/videos/saber-y-ganar/edicion-fin-semana-11-07-21/5977613/

New URLs have the following structure:
https://www.rtve.es/play/videos/saber-y-ganar/edicion-fin-semana-11-07-21/5977613/

Manually modifying the URL before running the command (i.e. replacing play with alacarta) results in a successful run:

$ youtube-dl -v https://www.rtve.es/alacarta/videos/saber-y-ganar/edicion-fin-semana-11-07-21/5977613/
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', 'https://www.rtve.es/alacarta/videos/saber-y-ganar/edicion-fin-semana-11-07-21/5977613/']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.06.06
[debug] Python version 3.8.10 (CPython) - Linux-5.4.0-66-generic-x86_64-with-glibc2.29
[debug] exe versions: ffmpeg 4.2.4, ffprobe 4.2.4, rtmpdump 2.4
[debug] Proxy map: {}
[rtve.es:alacarta] Fetching manager info
[rtve.es:alacarta] 5977613: Downloading JSON metadata
[rtve.es:alacarta] 5977613: Downloading url information
[rtve.es:alacarta] 5977613: Downloading m3u8 information
[rtve.es:alacarta] 5977613: Downloading m3u8 information
[rtve.es:alacarta] 5977613: Downloading MPD manifest
[rtve.es:alacarta] 5977613: Downloading MPD manifest
[rtve.es:alacarta] 5977613: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 503: Service Unavailable
[rtve.es:alacarta] 5977613: Downloading m3u8 information
[rtve.es:alacarta] 5977613: Downloading m3u8 information
[rtve.es:alacarta] 5977613: Downloading MPD manifest
[rtve.es:alacarta] 5977613: Downloading MPD manifest
[rtve.es:alacarta] 5977613: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 503: Service Unavailable
[rtve.es:alacarta] 5977613: Downloading m3u8 information
[rtve.es:alacarta] 5977613: Downloading m3u8 information
[rtve.es:alacarta] 5977613: Downloading MPD manifest
[rtve.es:alacarta] 5977613: Downloading MPD manifest
[rtve.es:alacarta] 5977613: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 503: Service Unavailable
[rtve.es:alacarta] 5977613: Downloading m3u8 information
[rtve.es:alacarta] 5977613: Downloading m3u8 information
[rtve.es:alacarta] 5977613: Downloading MPD manifest
[rtve.es:alacarta] 5977613: Downloading MPD manifest
[rtve.es:alacarta] 5977613: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 503: Service Unavailable
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'http://rtve-hlsvod.secure.footprint.net/resources/TE_NGVA/mp4/4/6/1625644120564.mp4/video.mpd?idasset=5977613'
[dashsegments] Total fragments: 550
[download] Destination: Saber y ganar. Edición de fin de semana - 11_07_21-5977613.fdash-video=4000000-1.mp4

... (Log was cut)
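
Until the extractor is updated, the manual workaround above could be scripted. The following is only a sketch: the helper name play_to_alacarta is hypothetical, and it assumes RTVE keeps serving the old alacarta URLs, which may stop being true at any time.

```python
from urllib.parse import urlsplit, urlunsplit


def play_to_alacarta(url):
    """Rewrite an rtve.es '/play/...' URL to the old '/alacarta/...' form.

    Hypothetical helper for illustration; URLs that do not match are
    returned unchanged.
    """
    parts = urlsplit(url)
    # A path such as '/play/videos/...' splits into ['', 'play', 'videos', ...]
    segments = parts.path.split('/')
    if (parts.netloc.endswith('rtve.es')
            and len(segments) > 1 and segments[1] == 'play'):
        segments[1] = 'alacarta'
    return urlunsplit(parts._replace(path='/'.join(segments)))


if __name__ == '__main__':
    print(play_to_alacarta(
        'https://www.rtve.es/play/videos/saber-y-ganar/'
        'edicion-fin-semana-11-07-21/5977613/'))
```

The rewritten URL can then be passed to youtube-dl as in the second command above.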

Ideas for fix:

dirkf commented 3 years ago

Just changing the URL pattern appears to fix this, as expected from the problem description: e.g., in extractor/rtve.py, change the fragment (alacarta/videos|filmoteca) in the definition of _VALID_URL to ((alacarta|play)/videos|filmoteca).
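
The effect of that change can be checked in isolation. The patterns below are simplified stand-ins built around the two fragments quoted above; the rest of each pattern is an assumption, not copied from extractor/rtve.py, and the helper video_id is hypothetical.

```python
import re

# Simplified stand-ins for _VALID_URL (assumption: the real pattern in
# extractor/rtve.py may contain more pieces). Only the alternation in the
# middle comes from the change described above.
OLD_VALID_URL = (r'https?://(?:www\.)?rtve\.es/'
                 r'(alacarta/videos|filmoteca)/[^/]+/[^/]+/(?P<id>\d+)')
NEW_VALID_URL = (r'https?://(?:www\.)?rtve\.es/'
                 r'((alacarta|play)/videos|filmoteca)/[^/]+/[^/]+/(?P<id>\d+)')

ALACARTA_URL = ('https://www.rtve.es/alacarta/videos/saber-y-ganar/'
                'edicion-fin-semana-11-07-21/5977613/')
PLAY_URL = ('https://www.rtve.es/play/videos/saber-y-ganar/'
            'edicion-fin-semana-11-07-21/5977613/')


def video_id(pattern, url):
    """Return the captured video id, or None if the URL does not match."""
    m = re.match(pattern, url)
    return m.group('id') if m else None


if __name__ == '__main__':
    for name, pattern in (('old', OLD_VALID_URL), ('new', NEW_VALID_URL)):
        for url in (ALACARTA_URL, PLAY_URL):
            print(name, url, video_id(pattern, url))
```

Under these assumptions the old pattern rejects the play URL, while the new one accepts both forms and extracts the same id.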

This is actually a third option, supporting play and alacarta together. The name of the extractor could be changed to reflect the new service name; then:

# youtube-dl -v -F --ignore-config 'https://www.rtve.es/play/videos/saber-y-ganar/edicion-fin-semana-11-07-21/5977613/' 
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'-F', u'--ignore-config', u'https://www.rtve.es/play/videos/saber-y-ganar/edicion-fin-semana-11-07-21/5977613/']
[debug] Encodings: locale ASCII, fs ASCII, out ASCII, pref ASCII
[debug] youtube-dl version 2021.06.06.1
[debug] Python version 2.7.1 (CPython) - Linux-2.6.18-7.1-7405b0-smp-with-libc0
[debug] exe versions: ffmpeg 4.1, ffprobe 4.1
[debug] Proxy map: {}
[rtve.es:play] Fetching manager info
[rtve.es:play] 5977613: Downloading JSON metadata
[rtve.es:play] 5977613: Downloading url information
[rtve.es:play] 5977613: Downloading m3u8 information
[rtve.es:play] 5977613: Downloading m3u8 information
[rtve.es:play] 5977613: Downloading MPD manifest
[rtve.es:play] 5977613: Downloading MPD manifest
[rtve.es:play] 5977613: Downloading m3u8 information
[rtve.es:play] 5977613: Downloading m3u8 information
[rtve.es:play] 5977613: Downloading m3u8 information
[rtve.es:play] 5977613: Downloading MPD manifest
[rtve.es:play] 5977613: Downloading MPD manifest
[rtve.es:play] 5977613: Downloading m3u8 information
[rtve.es:play] 5977613: Downloading m3u8 information
[rtve.es:play] 5977613: Downloading m3u8 information
[rtve.es:play] 5977613: Downloading MPD manifest
[rtve.es:play] 5977613: Downloading MPD manifest
[rtve.es:play] 5977613: Downloading m3u8 information
[rtve.es:play] 5977613: Downloading m3u8 information
[rtve.es:play] 5977613: Downloading m3u8 information
[rtve.es:play] 5977613: Downloading MPD manifest
[rtve.es:play] 5977613: Downloading MPD manifest
[rtve.es:play] 5977613: Downloading m3u8 information
[info] Available formats for 5977613:
format code           extension  resolution note
dash-audio=128473-0   m4a        audio only DASH audio  128k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-audio=128473-1   m4a        audio only DASH audio  128k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-audio=128473-2   m4a        audio only DASH audio  128k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-audio=128473-3   m4a        audio only DASH audio  128k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-audio=192642-0   m4a        audio only DASH audio  192k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-audio=192642-1   m4a        audio only DASH audio  192k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-audio=192642-2   m4a        audio only DASH audio  192k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-audio=192642-3   m4a        audio only DASH audio  192k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-video=1000000-0  mp4        640x360    DASH video 1000k , mp4_dash container, avc1.640029, 25fps, video only
dash-video=1000000-1  mp4        640x360    DASH video 1000k , mp4_dash container, avc1.640029, 25fps, video only
dash-video=1850000-0  mp4        1024x576   DASH video 1850k , mp4_dash container, avc1.640029, 25fps, video only
dash-video=1850000-1  mp4        1024x576   DASH video 1850k , mp4_dash container, avc1.640029, 25fps, video only
dash-video=2750000-0  mp4        1280x720   DASH video 2750k , mp4_dash container, avc1.640029, 25fps, video only
dash-video=2750000-1  mp4        1280x720   DASH video 2750k , mp4_dash container, avc1.640029, 25fps, video only
dash-video=4000000-0  mp4        1920x1080  DASH video 4000k , mp4_dash container, avc1.640029, 25fps, video only
dash-video=4000000-1  mp4        1920x1080  DASH video 4000k , mp4_dash container, avc1.640029, 25fps, video only
hls-1197-0            mp4        640x360    1197k , avc1.640029@1000k, 25.0fps, mp4a.40.2@128k
hls-1197-1            mp4        640x360    1197k , avc1.640029@1000k, 25.0fps, mp4a.40.2@128k
hls-1197-2            mp4        640x360    1197k , avc1.640029@1000k, 25.0fps, mp4a.40.2@128k
hls-2098-0            mp4        1024x576   2098k , avc1.640029@1850k, 25.0fps, mp4a.40.2@128k
hls-2098-1            mp4        1024x576   2098k , avc1.640029@1850k, 25.0fps, mp4a.40.2@128k
hls-3120-0            mp4        1280x720   3120k , avc1.640029@2750k, 25.0fps, mp4a.40.2@192k
hls-3120-1            mp4        1280x720   3120k , avc1.640029@2750k, 25.0fps, mp4a.40.2@192k
hls-3120-2            mp4        1280x720   3120k , avc1.640029@2750k, 25.0fps, mp4a.40.2@192k
hls-4445-0            mp4        1920x1080  4445k , avc1.640029@4000k, 25.0fps, mp4a.40.2@192k
hls-4445-1            mp4        1920x1080  4445k , avc1.640029@4000k, 25.0fps, mp4a.40.2@192k
hls-4445-2            mp4        1920x1080  4445k , avc1.640029@4000k, 25.0fps, mp4a.40.2@192k
Alta                  mp4        unknown    
HQ                    mp4        unknown    
HD_READY              mp4        unknown    
HD_FULL               mp4        unknown    (best)
#

arturoherrero commented 2 years ago

Possible fix: https://github.com/ytdl-org/youtube-dl/pull/29816