yt-dlp / yt-dlp

A feature-rich command-line audio/video downloader

Please support welt.de (formerly N24 Doku) Mediathek #7513

Open z1atk0 opened 1 year ago

z1atk0 commented 1 year ago

Region

At least DACH (.de, .at, .ch), but probably worldwide, I guess?

Example URLs

Working, but without proper format info:

https://www.welt.de/mediathek/serie/sendung218509518/Strip-the-Cosmos-Geheimnisvoller-Jupiter.html
https://www.welt.de/mediathek/dokumentation/technik-und-wissen/sendung155731159/Zielscheibe-Erde-Angriff-aus-dem-All-1.html
https://www.welt.de/mediathek/dokumentation/space/strip-the-cosmos/sendung190852337/Strip-the-Cosmos-Raetselhafte-Schwarze-Loecher.html

Not working at all:

https://www.welt.de/mediathek/dokumentation/space/spacetime/sendung241530977/Spacetime-Der-geheime-Orbit-Wettruesten-im-All.html
https://www.welt.de/mediathek/dokumentation/space/spacetime/sendung241589211/Spacetime-Es-werde-Licht-Religion-und-Astronomie.html
https://www.welt.de/mediathek/dokumentation/space/spacetime/sendung242094261/Spacetime-Teleskope-die-Entdeckungsmaschinen.html

Provide a description that is worded well enough to be understood

Downloading from the WELT Mediathek (https://www.welt.de/mediathek/) only partially works. The generic extractor handles some videos, but without proper format info, and other videos cannot be downloaded at all.

Provide verbose output that clearly demonstrates the problem

Complete Verbose Output

Working, but without format info:

[zlatko@disclosure:~/mnt/nas/_/Downloads]$ yt-dlp -vU --proxy "" -F https://www.welt.de/mediathek/dokumentation/space/strip-the-cosmos/sendung190852337/Strip-the-Cosmos-Raetselhafte-Schwarze-Loecher.html
[debug] Command-line config: ['-vU', '--proxy', '', '-F', 'https://www.welt.de/mediathek/dokumentation/space/strip-the-cosmos/sendung190852337/Strip-the-Cosmos-Raetselhafte-Schwarze-Loecher.html']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.06.22 [812cdfa06]
[debug] Lazy loading extractors is disabled
[debug] Python 3.9.17 (CPython x86_64 64bit) - Linux-5.15.118-x86_64-Intel-R-_Core-TM-_i7-2600K_CPU_@_3.40GHz-with-glibc2.33 (OpenSSL 1.1.1u  30 May 2023, glibc 2.33)
[debug] exe versions: ffmpeg 6.0 (setts), ffprobe 6.0, rtmpdump 2.4
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2021.10.08, mutagen-1.45.1, secretstorage-3.3.1, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1851 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Available version: stable@2023.06.22, Current version: stable@2023.06.22
yt-dlp is up to date (stable@2023.06.22)
[generic] Extracting URL: https://www.welt.de/mediathek/dokumentation/space/strip-the-cosmos/sendung190852337/Strip-the-Cosmos-Raetselhafte-Schwarze-Loecher.html
[generic] Strip-the-Cosmos-Raetselhafte-Schwarze-Loecher: Downloading webpage
WARNING: [generic] Falling back on generic information extractor
[generic] Strip-the-Cosmos-Raetselhafte-Schwarze-Loecher: Extracting information
[debug] Looking for embeds
[debug] Identified a html5 embed
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, size, br, asr, proto, vext, aext, hasaud, source, id
[info] Available formats for Strip-the-Cosmos-Raetselhafte-Schwarze-Loecher-1:
ID EXT RESOLUTION │ PROTO │ VCODEC  ACODEC
───────────────────────────────────────────
0  mp4 unknown    │ https │ unknown unknown
1  mp4 unknown    │ https │ unknown unknown
2  mp4 unknown    │ https │ unknown unknown
3  mp4 unknown    │ https │ unknown unknown

Not working at all:

[zlatko@disclosure:~/mnt/nas/_/Downloads]$ yt-dlp -vU --proxy "" -F https://www.welt.de/mediathek/dokumentation/space/spacetime/sendung241530977/Spacetime-Der-geheime-Orbit-Wettruesten-im-All.html
[debug] Command-line config: ['-vU', '--proxy', '', '-F', 'https://www.welt.de/mediathek/dokumentation/space/spacetime/sendung241530977/Spacetime-Der-geheime-Orbit-Wettruesten-im-All.html']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.06.22 [812cdfa06]
[debug] Lazy loading extractors is disabled
[debug] Python 3.9.17 (CPython x86_64 64bit) - Linux-5.15.118-x86_64-Intel-R-_Core-TM-_i7-2600K_CPU_@_3.40GHz-with-glibc2.33 (OpenSSL 1.1.1u  30 May 2023, glibc 2.33)
[debug] exe versions: ffmpeg 6.0 (setts), ffprobe 6.0, rtmpdump 2.4
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2021.10.08, mutagen-1.45.1, secretstorage-3.3.1, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1851 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Available version: stable@2023.06.22, Current version: stable@2023.06.22
yt-dlp is up to date (stable@2023.06.22)
[generic] Extracting URL: https://www.welt.de/mediathek/dokumentation/space/spacetime/sendung241530977/Spacetime-Der-geheime-Orbit-Wettruesten-im-All.html
[generic] Spacetime-Der-geheime-Orbit-Wettruesten-im-All: Downloading webpage
WARNING: [generic] Falling back on generic information extractor
[generic] Spacetime-Der-geheime-Orbit-Wettruesten-im-All: Extracting information
[debug] Looking for embeds
ERROR: Unsupported URL: https://www.welt.de/mediathek/dokumentation/space/spacetime/sendung241530977/Spacetime-Der-geheime-Orbit-Wettruesten-im-All.html
Traceback (most recent call last):
  File "/usr/local/lib64/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 1555, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib64/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 1631, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/local/lib64/python3.9/site-packages/yt_dlp/extractor/common.py", line 708, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/lib64/python3.9/site-packages/yt_dlp/extractor/generic.py", line 2568, in _real_extract
    raise UnsupportedError(url)
yt_dlp.utils.UnsupportedError: Unsupported URL: https://www.welt.de/mediathek/dokumentation/space/spacetime/sendung241530977/Spacetime-Der-geheime-Orbit-Wettruesten-im-All.html

gamer191 commented 1 year ago

Not working at all:

https://www.welt.de/mediathek/dokumentation/space/spacetime/sendung241530977/Spacetime-Der-geheime-Orbit-Wettruesten-im-All.html
https://www.welt.de/mediathek/dokumentation/space/spacetime/sendung241589211/Spacetime-Es-werde-Licht-Religion-und-Astronomie.html
https://www.welt.de/mediathek/dokumentation/space/spacetime/sendung242094261/Spacetime-Teleskope-die-Entdeckungsmaschinen.html

It doesn't look like any of those videos are playable in a browser either. Or are they geo-blocked?

z1atk0 commented 1 year ago

Sorry, seems you're right, my bad! There's a note saying "Diese Sendung ist zur Zeit aus lizenzrechtlichen Gründen nicht verfügbar.", which roughly translates to "This program is currently unavailable due to licensing reasons." I also tried from a server hosted in Germany (I live in Austria), but the result is the same both with yt-dlp and in a browser, so it's probably not geo-blocked.

Still, it would be nice to have at least the available ones working with proper format info. :slightly_smiling_face:

dirkf commented 1 year ago

At least the first test page has an HTML5 video and metadata in an ld+json block. Since the generic extractor finds the HTML5 video, it probably ignores the ld+json.
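
To illustrate what reading the ld+json block could look like, here is a minimal standalone sketch using only the Python standard library. It is not yt-dlp's code and not how the generic extractor works. The sample HTML, the `example.invalid` URL, and the assumption that the page carries a schema.org `VideoObject` with `name`, `duration` and `contentUrl` are all made up for the example and have not been checked against welt.de.

```python
import json
from html.parser import HTMLParser


class LdJsonCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_ld_json = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_ld_json = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_ld_json:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_ld_json:
            self.blocks.append("".join(self._buffer))
            self._in_ld_json = False


def find_video_objects(html):
    """Return all ld+json items whose @type is VideoObject (hypothetical helper)."""
    collector = LdJsonCollector()
    collector.feed(html)
    videos = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and item.get("@type") == "VideoObject":
                videos.append(item)
    return videos


# Made-up sample page: one ld+json block next to a plain HTML5 <video> tag.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "VideoObject",
 "name": "Example title", "duration": "PT43M20S",
 "contentUrl": "https://example.invalid/video.mp4"}
</script>
</head><body><video src="https://example.invalid/video.mp4"></video></body></html>
"""

videos = find_video_objects(SAMPLE_HTML)
for video in videos:
    print(video.get("name"), video.get("duration"), video.get("contentUrl"))
```

A dedicated extractor could merge metadata like this with the formats found in the HTML5 video tag. Whether the real pages expose enough in ld+json to fill in resolution and codec is something that would still need checking.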