yt-dlp / yt-dlp

A feature-rich command-line audio/video downloader

[TV5MondePlus] Extracting information Error #7044

Open Luvmon opened 1 year ago

Luvmon commented 1 year ago

Region

France, Taiwan, Hong Kong, Japan, Singapore, Australia, etc. (viewers in the US need an account)

Provide a description that is worded well enough to be understood

The error occurred when I tried to get the info for a video on TV5MondePlus.

Provide verbose output that clearly demonstrates the problem

Complete Verbose Output

[debug] Command-line config: ['-vU', '-F', 'https://www.tv5mondeplus.com/en/details/vod/107087204_74079A']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version nightly@2023.05.11.094900 [21b9413cf] (zip)
[debug] Python 3.9.2 (CPython x86_64 64bit) - Linux-5.10.0-21-cloud-amd64-x86_64-with-glibc2.31 (OpenSSL 1.1.1n  15 Mar 2022, glibc 2.31)
[debug] exe versions: none
[debug] Optional libraries: Cryptodome-3.17, brotli-1.0.9, certifi-2023.05.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-11.0.3
[debug] Proxy map: {}
[debug] Loaded 1813 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp-nightly-builds/releases/latest
Available version: nightly@2023.05.11.094900, Current version: nightly@2023.05.11.094900
Current Build Hash: 68cfb85d91da7da1a1c6f0d01b0f488b89c3275ca89fb893d6e573ba07a30d9d
yt-dlp is up to date (nightly@2023.05.11.094900)
[generic] Extracting URL: https://www.tv5mondeplus.com/en/details/vod/107087204_74079A
[generic] 107087204_74079A: Downloading webpage
WARNING: [generic] Falling back on generic information extractor
[generic] 107087204_74079A: Extracting information
[debug] Looking for embeds
ERROR: expected string or bytes-like object
Traceback (most recent call last):
  File "/usr/local/bin/yt-dlp/yt_dlp/YoutubeDL.py", line 1532, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/bin/yt-dlp/yt_dlp/YoutubeDL.py", line 1608, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/common.py", line 694, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/generic.py", line 2563, in _real_extract
    embeds = list(self._extract_embeds(original_url, webpage, urlh=full_response, info_dict=info_dict))
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/generic.py", line 2695, in _extract_embeds
    json_ld = self._search_json_ld(webpage, video_id, default={})
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/common.py", line 1473, in _search_json_ld
    info = self._json_ld(
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/common.py", line 1653, in _json_ld
    traverse_json_ld(json_ld)
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/common.py", line 1627, in traverse_json_ld
    'timestamp': unified_timestamp(e.get('dateCreated')),
  File "/usr/local/bin/yt-dlp/yt_dlp/utils.py", line 1850, in unified_timestamp
    date_str = re.sub(r'\s+', ' ', re.sub(
  File "/usr/lib/python3.9/re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
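
The last frames of the traceback show `re.sub()` inside `unified_timestamp` receiving the JSON-LD `dateCreated` value and rejecting it because it is not a string. The log does not show what the value actually was. The following is a minimal sketch of that failure mode and of one possible guard; `normalize_date_str`, the sample entry, and the integer value are hypothetical and are not taken from yt-dlp's code or from the page.

```python
import re


def normalize_date_str(date_str):
    """Hypothetical helper: collapse whitespace, tolerating non-string input."""
    if not isinstance(date_str, str):
        # A JSON-LD field may hold a number, an object, or nothing at all
        return None
    return re.sub(r'\s+', ' ', date_str).strip()


# Illustrative JSON-LD entry; the integer dateCreated is an assumption
entry = {'@type': 'VideoObject', 'dateCreated': 1683800000}

# Passing a non-string straight to re.sub() raises the TypeError seen above
try:
    re.sub(r'\s+', ' ', entry.get('dateCreated'))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The guarded helper returns None instead of raising
guarded = normalize_date_str(entry.get('dateCreated'))
cleaned = normalize_date_str('2023-05-11   09:49:00 ')
```

With a guard of this kind the generic extractor would skip the unusable date rather than crash, though that alone would not make the page extractable.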

dirkf commented 1 year ago

Pages like https://www.tv5mondeplus.com/en/details/vod/107087204_74079A have never been supported by the extractor. Such a page eventually leads to https://www.tv5mondeplus.com/en/player/107087204_74079A, which is geo-restricted (the permitted countries vary by content "channel"). This URL pattern was rejected because of DRM in https://github.com/ytdl-org/youtube-dl/issues/26559.
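
To illustrate how the two URLs quoted above relate, here is a rough sketch that rewrites a details URL into the corresponding player URL. The pattern is inferred only from this one pair of URLs, and `DETAILS_RE` and `details_to_player_url` are hypothetical names that are not part of yt-dlp. It only maps strings; the player page remains geo-restricted and DRM-protected.

```python
import re

# Assumed shape of a details URL, based solely on the example in this thread
DETAILS_RE = re.compile(
    r'https?://www\.tv5mondeplus\.com/(?P<lang>[a-z]{2})/details/vod/(?P<id>[^/?#]+)')


def details_to_player_url(url):
    """Return the player URL for a details URL, or None if it does not match."""
    m = DETAILS_RE.match(url)
    if not m:
        return None
    return f'https://www.tv5mondeplus.com/{m["lang"]}/player/{m["id"]}'


player_url = details_to_player_url(
    'https://www.tv5mondeplus.com/en/details/vod/107087204_74079A')
```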

dirkf commented 1 year ago

See also https://github.com/yt-dlp/yt-dlp/issues/4205#issuecomment-1225245017.