yt-dlp / yt-dlp

A feature-rich command-line audio/video downloader

[nitter] video extraction doesn't work #5396

Open l29ah opened 2 years ago

l29ah commented 2 years ago

Region

No response

Provide a description that is worded well enough to be understood

https://nitter.lacontrevoie.fr/small10space/status/1586391130429882368 doesn't get fetched, while its twitter.com counterpart does.

Provide verbose output that clearly demonstrates the problem

Complete Verbose Output

[debug] Command-line config: ['-vU', 'https://nitter.lacontrevoie.fr/small10space/status/1586391130429882368']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.10.04 [4e0511f] (pip) API
[debug] Python 3.10.8 (CPython 64bit) - Linux-6.0.2+-x86_64-with-glibc2.36 (glibc 2.36)
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
[debug] exe versions: ffmpeg 4.4.3 (fdk,setts), ffprobe 4.4.3, rtmpdump 2.4
[debug] Optional libraries: Crypto-3.15.0, brotli-1.0.9, certifi-3021.03.16, mutagen-1.46.0, sqlite3-2.6.0
[debug] Proxy map: {}
[debug] Loaded 1690 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.10.04, Current version: 2022.10.04
yt-dlp is up to date (2022.10.04)
[debug] [generic] Extracting URL: https://nitter.lacontrevoie.fr/small10space/status/1586391130429882368
[generic] 1586391130429882368: Downloading webpage
WARNING: [generic] Falling back on generic information extractor
[generic] 1586391130429882368: Extracting information
[debug] Looking for Brightcove embeds
[debug] Looking for embeds
ERROR: Unsupported URL: https://nitter.lacontrevoie.fr/small10space/status/1586391130429882368
Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/yt_dlp/YoutubeDL.py", line 1477, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/lib/python3.10/site-packages/yt_dlp/YoutubeDL.py", line 1553, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 672, in extract
    ie_result = self._real_extract(url)
  File "/usr/lib/python3.10/site-packages/yt_dlp/extractor/generic.py", line 3062, in _real_extract
    raise UnsupportedError(url)
yt_dlp.utils.UnsupportedError: Unsupported URL: https://nitter.lacontrevoie.fr/small10space/status/1586391130429882368

weisenhan commented 2 years ago

The video is NSFW. With the twitter.com link, extraction works, but the download fails with a "File name too long" error:

yt-dlp -Uv https://twitter.com/small10space/status/1586391130429882368
[debug] Command-line config: ['-Uv', 'https://twitter.com/small10space/status/1586391130429882368']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.10.04 [4e0511f] (zip)
[debug] Python 3.8.10 (CPython 64bit) - Linux-5.4.0-131-generic-x86_64-with-glibc2.29 (glibc 2.31)
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
[debug] exe versions: ffmpeg 4.2.7, ffprobe 4.2.7, phantomjs 5
[debug] Optional libraries: Cryptodome-3.6.1, brotli-1.0.9, certifi-2019.11.28, mutagen-1.45.1, secretstorage-2.3.1, sqlite3-2.6.0, websockets-10.2
[debug] Proxy map: {}
[debug] Loaded 1690 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.10.04, Current version: 2022.10.04
yt-dlp is up to date (2022.10.04)
[debug] [twitter] Extracting URL: https://twitter.com/small10space/status/1586391130429882368
[twitter] 1586391130429882368: Downloading guest token
[twitter] 1586391130429882368: Downloading JSON metadata
[twitter] 1586391130429882368: Downloading m3u8 information
[debug] Sort order given by extractor: res, br, size, proto
[debug] Formats sorted by: hasvid, ie_pref, res, tbr, vbr, abr, filesize, fs_approx, proto, lang, quality, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, asr, vext, aext, hasaud, source, id
[debug] Default format spec: bestvideo*+bestaudio/best
[info] 1586391130429882368: Downloading 1 format(s): http-832
[debug] Invoking http downloader on "https://video.twimg.com/ext_tw_video/1586391051320922112/pu/vid/800x320/Lq7JaVQSDHYpEkGv.mp4?tag=12"
ERROR: unable to open for writing: [Errno 36] File name too long: 'Мисливець за зорями - Окр вирішив схитрувати,та кинути гранату коли здавався у полон,але нахитрував собі повну тушу куль.Усім іншім,які поводили себе нормально ЗСУ зберегли життя як обіцяли [1586391130429882368].mp4.part'
Traceback (most recent call last):
  File "/usr/local/bin/yt-dlp/yt_dlp/utils.py", line 632, in sanitize_open
    stream = locked_file(filename, open_mode, block=False).__enter__()
  File "/usr/local/bin/yt-dlp/yt_dlp/utils.py", line 2162, in __init__
    self.f = os.fdopen(os.open(filename, flags, 0o666), mode, encoding=encoding)
OSError: [Errno 36] File name too long: 'Мисливець за зорями - Окр вирішив схитрувати,та кинути гранату коли здавався у полон,але нахитрував собі повну тушу куль.Усім іншім,які поводили себе нормально ЗСУ зберегли життя як обіцяли [1586391130429882368].mp4.part'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/yt-dlp/yt_dlp/downloader/http.py", line 267, in download
    ctx.stream, ctx.tmpfilename = self.sanitize_open(
  File "/usr/local/bin/yt-dlp/yt_dlp/downloader/common.py", line 237, in wrapper
    retry.error_callback(err, 1, 0)
  File "/usr/local/bin/yt-dlp/yt_dlp/downloader/common.py", line 223, in error_callback
    return RetryManager.report_retry(
  File "/usr/local/bin/yt-dlp/yt_dlp/utils.py", line 5870, in report_retry
    raise e
  File "/usr/local/bin/yt-dlp/yt_dlp/downloader/common.py", line 232, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/bin/yt-dlp/yt_dlp/downloader/common.py", line 243, in sanitize_open
    f, filename = sanitize_open(filename, open_mode)
  File "/usr/local/bin/yt-dlp/yt_dlp/utils.py", line 634, in sanitize_open
    stream = open(filename, open_mode)
OSError: [Errno 36] File name too long: 'Мисливець за зорями - Окр вирішив схитрувати,та кинути гранату коли здавався у полон,але нахитрував собі повну тушу куль.Усім іншім,які поводили себе нормально ЗСУ зберегли життя як обіцяли [1586391130429882368].mp4.part'

For now, try the direct twitter.com link and use the -o option to set a shorter output filename.
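The [Errno 36] failure above is consistent with the file name exceeding the per-name byte limit, which is 255 bytes on common Linux filesystems. Cyrillic characters take two bytes each in UTF-8, so a title well under 255 characters can still be too long. The following is a minimal sketch of byte-aware truncation, not yt-dlp's own code; the helper names `truncate_utf8` and `short_filename` and the `NAME_MAX` constant are hypothetical and exist only for illustration.

```python
# Assumption: 255 bytes per file name, as on ext4 and most Linux filesystems.
NAME_MAX = 255


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a trailing, incomplete multi-byte sequence.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


def short_filename(title: str, video_id: str, ext: str = "mp4") -> str:
    """Build 'title [id].ext.part' so the whole name fits within NAME_MAX bytes."""
    suffix = f" [{video_id}].{ext}.part"
    budget = NAME_MAX - len(suffix.encode("utf-8"))
    return truncate_utf8(title, budget) + suffix


if __name__ == "__main__":
    long_title = "я" * 300  # 300 characters, 600 bytes in UTF-8
    name = short_filename(long_title, "1586391130429882368")
    print(len(name.encode("utf-8")))
```

On the command line, yt-dlp's output template documentation describes a byte-limited field that should have a similar effect, for example `-o "%(title).200B [%(id)s].%(ext)s"`; the 200-byte value is an arbitrary choice that leaves room for the ID and extensions.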

2011 commented 11 months ago

Found this old issue, and decided to add to it rather than opening a new one.

The recent Twitter changes have led to huge churn in working nitter instances (new sites coming online and previously working sites going offline), which has made yt-dlp less effective at recognizing nitter URLs.

For example:

$ /tmp/yt-dlp -v 'https://nitter.woodland.cafe/KimDotcom/status/1730230555000574003'
[debug] Command-line config: ['-v', 'https://nitter.woodland.cafe/KimDotcom/status/1730230555000574003']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version nightly@2023.11.29.232714 from yt-dlp/yt-dlp-nightly-builds [6a9c7a2b5] (zip)
[debug] Python 3.11.6 (CPython x86_64 64bit) - Linux-6.1.28-gentoo-x86_64-Intel-R-_Core-TM-_i5-3470_CPU_@_3.20GHz-with-glibc2.37 (OpenSSL 3.0.11 19 Sep 2023, glibc 2.37)
[debug] exe versions: ffmpeg 4.4.4 (setts), ffprobe 4.4.4
[debug] Optional libraries: certifi-3021.03.16, pycrypto-3.19.0, requests-2.31.0, urllib3-2.1.0
[debug] Proxy map: {}
[debug] Request Handlers: urllib, requests
[debug] Loaded 1792 extractors
[generic] Extracting URL: https://nitter.woodland.cafe/KimDotcom/status/1730230555000574003
[generic] 1730230555000574003: Downloading webpage
WARNING: [generic] Falling back on generic information extractor
[generic] 1730230555000574003: Extracting information
[debug] Looking for embeds
ERROR: Unsupported URL: https://nitter.woodland.cafe/KimDotcom/status/1730230555000574003
Traceback (most recent call last):
  File "/tmp/yt-dlp/yt_dlp/YoutubeDL.py", line 1570, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/yt-dlp/yt_dlp/YoutubeDL.py", line 1705, in __extract_info
    ie_result = ie.extract(url)
                ^^^^^^^^^^^^^^^
  File "/tmp/yt-dlp/yt_dlp/extractor/common.py", line 717, in extract
    ie_result = self._real_extract(url)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/yt-dlp/yt_dlp/extractor/generic.py", line 2531, in _real_extract
    raise UnsupportedError(url)
yt_dlp.utils.UnsupportedError: Unsupported URL: https://nitter.woodland.cafe/KimDotcom/status/1730230555000574003

I would like to propose that, if the argument URL contains "nitter" and also "status/" followed by numeric characters that look like a tweet identifier, yt-dlp attempt to use the twitter extractor rather than fall back to the generic extractor.
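A rough sketch of that heuristic, written as a standalone function rather than as a yt-dlp extractor. The function name `nitter_to_twitter` and the regular expression are hypothetical, and the check is applied to the hostname only, which is a narrower reading of "contains nitter" than the proposal states.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical pattern: /<user>/status/<numeric tweet id>
_STATUS_RE = re.compile(r"^/(?P<user>[^/]+)/status/(?P<id>\d+)")


def nitter_to_twitter(url: str) -> Optional[str]:
    """Return the twitter.com equivalent of a nitter status URL, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if "nitter" not in host:
        return None
    match = _STATUS_RE.match(parts.path)
    if match is None:
        return None
    return f"https://twitter.com/{match['user']}/status/{match['id']}"


if __name__ == "__main__":
    print(nitter_to_twitter(
        "https://nitter.woodland.cafe/KimDotcom/status/1730230555000574003"))
```

The rewritten URL could then be passed to yt-dlp in place of the original. A hostname check like this misses instances whose domain does not include "nitter", so it would complement a list of known instances rather than replace one.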