yt-dlp / yt-dlp

A feature-rich command-line audio/video downloader
https://discord.gg/H5MNcFW63r
The Unlicense

[Windows] [Twitter] "[Errno 2] No such file or directory" #1999

Closed a-raccoon closed 2 years ago

a-raccoon commented 2 years ago

Region

No response

Description

yt-dlp repeatedly fails to download videos from Twitter on Windows because of filesystem errors caused by illegal characters in the generated filename or by its total length. Filename sanitization isn't doing its job.

ref https://github.com/ytdl-org/youtube-dl/issues/30361

Verbose log

F:\Youtube-DL>yt-dlp.exe --ignore-config --verbose https://twitter.com/cassandrawebbtv/status/1470124083547418624
[debug] Command-line config: ['--ignore-config', '--verbose', 'https://twitter.com/cassandrawebbtv/status/1470124083547418624']
[debug] Encodings: locale cp1252, fs utf-8, out utf-8 (No ANSI), err utf-8 (No ANSI), pref cp1252
[debug] yt-dlp version 2021.12.01 [91f071a] (win_exe)
[debug] Python version 3.8.10 (CPython 64bit) - Windows-7-6.1.7601-SP1
[debug] exe versions: ffmpeg n4.4-6-g7e9b9f24df (setts), ffprobe n4.4-6-g7e9b9f24df
[debug] Optional libraries: Cryptodome, mutagen, sqlite, websockets
[debug] Proxy map: {}
[debug] [twitter] Extracting URL: https://twitter.com/cassandrawebbtv/status/1470124083547418624
[twitter] 1470124083547418624: Downloading guest token
[twitter] 1470124083547418624: Downloading JSON metadata
[twitter] 1470124083547418624: Downloading m3u8 information
[debug] Sort order given by extractor: res, br, size, proto
[debug] Formats sorted by: hasvid, ie_pref, res, tbr, vbr, abr, filesize, fs_approx, proto, lang, quality, fps, hdr:12(7), vcodec:vp9.2(10), acodec, asr, vext, aext, hasaud, source, id
[debug] Default format spec: bestvideo*+bestaudio/best
[info] 1470124083547418624: Downloading 1 format(s): http-2176
[debug] Invoking downloader on "https://video.twimg.com/ext_tw_video/1470124003201273856/pu/vid/720x1060/KKdVKnZxI47Oihej.mp4?tag=12"
ERROR: unable to open for writing: [Errno 2] No such file or directory: 'Cassandra Webb - MUST WATCH - A man from #Kentucky lost his home after the #tornado Yet, here he sits at his piano playing the @Gaithermusic tune, "There's Something About That Name." The peace that passes understanding. #ARwx @FOX16News @KARK4News @NWS @HaydenNix [1470124083547418624].mp4.part'
Traceback (most recent call last):
  File "yt_dlp\downloader\http.py", line 266, in download
  File "yt_dlp\utils.py", line 2094, in sanitize_open
FileNotFoundError: [Errno 2] No such file or directory: 'Cassandra Webb - MUST WATCH - A man from #Kentucky lost his home after the #tornado Yet, here he sits at his piano playing the @Gaithermusic tune, "There's Something About That Name." The peace that passes understanding. #ARwx @FOX16News @KARK4News @NWS @HaydenNix [1470124083547418624].mp4.part'

pukkandan commented 2 years ago

Duplicate of https://github.com/yt-dlp/yt-dlp/issues/1280 and related issues

a-raccoon commented 2 years ago

To be certain, it's a completely different error message. Likely caused by the inclusion of literal quotation marks in the filename, which is illegal. File-too-long is a separate issue, and often dismissed as a "you should be using Windows 10 to fix that" bug.

I don't know if any modern filesystem supports literal quotation marks in object names.
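For reference, Windows rejects a small set of characters in a file name (`< > : " / \ | ? *`, plus control characters), while typographic quotes such as U+201C are allowed. The sketch below only illustrates that distinction; `find_reserved_chars` is a hypothetical helper, not part of yt-dlp, and it says nothing about what yt-dlp's own sanitizer does.

```python
# Characters Windows does not allow inside a file name component.
# Control characters (code points 0-31) are rejected as well.
WINDOWS_RESERVED = set('<>:"/\\|?*')


def find_reserved_chars(name):
    """Return the distinct reserved or control characters in `name`,
    in order of first appearance."""
    found = []
    for ch in name:
        if (ch in WINDOWS_RESERVED or ord(ch) < 32) and ch not in found:
            found.append(ch)
    return found


# ASCII quotation marks are flagged ...
print(find_reserved_chars('tune, "There\'s Something About That Name."'))
# ... but the curly look-alikes (U+201C, U+2019, U+201D) are not.
print(find_reserved_chars('tune, \u201cThere\u2019s Something About That Name.\u201d'))
```

A non-empty result means the name cannot be created on Windows as-is, independently of its length.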

pukkandan commented 2 years ago

Those are not quotes, but unicode chars that look like quotes
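One way to settle which characters a string really contains is to print their code points rather than trust how a console renders them. A minimal sketch using only the standard library; `describe_quotes` is a hypothetical helper name, and the sample strings are illustrative:

```python
import unicodedata


def describe_quotes(text):
    """Return (char, code point, Unicode name) for every quote-like character.

    Covers the ASCII quote and apostrophe plus anything in the Unicode
    categories Pi/Pf (initial/final quotation punctuation).
    """
    rows = []
    for ch in text:
        if ch in '"\'' or unicodedata.category(ch) in ('Pi', 'Pf'):
            rows.append((ch, f'U+{ord(ch):04X}', unicodedata.name(ch)))
    return rows


for row in describe_quotes('\u201cThere\u2019s Something About That Name.\u201d'):
    print(row)
for row in describe_quotes('"There\'s Something About That Name."'):
    print(row)
```

The first loop reports U+201C, U+2019 and U+201D; the second reports U+0022 and U+0027. Running this on the actual filename string, not on text copied from a terminal, avoids confusion caused by consoles that substitute look-alike characters.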

pukkandan commented 2 years ago

often dismissed as a "you should be using Windows 10 to fix that" bug.

No, even if you use Windows 10 or UNIX, this filename is too long
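The length claim can be checked directly. The sketch below assumes the common limit of 255 units per file name component (UTF-16 code units on NTFS, bytes on most Linux filesystems) and rebuilds the name from the log above with the curly quotes of the original tweet; the variable names are illustrative.

```python
# File name from the error message, with the tweet's typographic quotes.
NAME = (
    'Cassandra Webb - MUST WATCH - A man from #Kentucky lost his home after the '
    '#tornado Yet, here he sits at his piano playing the @Gaithermusic tune, '
    '\u201cThere\u2019s Something About That Name.\u201d The peace that passes understanding. '
    '#ARwx @FOX16News @KARK4News @NWS @HaydenNix [1470124083547418624].mp4.part'
)

LIMIT = 255  # assumed per-component limit, see the note above

utf16_units = len(NAME.encode('utf-16-le')) // 2  # how NTFS counts the name
utf8_bytes = len(NAME.encode('utf-8'))            # how most Linux filesystems count it

print('UTF-16 code units:', utf16_units, 'too long:', utf16_units > LIMIT)
print('UTF-8 bytes:', utf8_bytes, 'too long:', utf8_bytes > LIMIT)
```

Both checks exceed the limit. Enabling long paths on Windows 10 lifts the limit on the full path, not on a single component, so it would not help with a name like this one.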

a-raccoon commented 2 years ago

Those are not quotes, but unicode chars that look like quotes

ERROR: unable to open for writing: [Errno 2] No such file or directory: 'Cassandra Webb - MUST WATCH - A man from #Kentucky lost his home after the #tornado Yet, here he sits at his piano playing the @Gaithermusic tune, "There's Something About That Name." The peace that passes understanding. #ARwx @FOX16News @KARK4News @NWS @HaydenNix [1470124083547418624].mp4.part'

Those are ASCII quotes of the U+0022 variety (decimal 34) after they have been sanitized. The original tweet had fancy quotes, but yt-dlp decided to convert them to ASCII quotes in the filename.

pukkandan commented 2 years ago

➤ yt-dlp -v https://twitter.com/cassandrawebbtv/status/1470124083547418624 -o "%(title).200s.%(ext)s"
[debug] Command-line config: ['-v', 'https://twitter.com/cassandrawebbtv/status/1470124083547418624', '-o', '%(title).200s.%(ext)s']
[debug] Encodings: locale cp1252, fs utf-8, out utf-8, err utf-8, pref cp1252
[debug] yt-dlp version 2021.12.01 [91f071af6] (source)
[debug] Lazy loading extractors is disabled
[debug] Plugins: ['SamplePluginIE', 'SamplePluginPP']
[debug] Git HEAD: 4eb5531d1
[debug] Python version 3.10.0 (CPython 64bit) - Windows-10-10.0.19043-SP0
[debug] exe versions: ffmpeg N-103892-g71f2a9a2e5-20210927 (setts,fdk), ffprobe N-103892-g71f2a9a2e5-20210927, phantomjs 2.1.1
[debug] Optional libraries: Cryptodome, mutagen, sqlite, websockets
[debug] Proxy map: {}
[debug] [twitter] Extracting URL: https://twitter.com/cassandrawebbtv/status/1470124083547418624
[twitter] 1470124083547418624: Downloading guest token
[twitter] 1470124083547418624: Downloading JSON metadata
[twitter] 1470124083547418624: Downloading m3u8 information
[debug] Sort order given by extractor: res, br, size, proto
[debug] Formats sorted by: hasvid, ie_pref, res, tbr, vbr, abr, filesize, fs_approx, proto, lang, quality, fps, hdr:12(7), vcodec:vp9.2(10), acodec, asr, vext, aext, hasaud, source, id
[debug] Default format spec: bestvideo*+bestaudio/best
[info] 1470124083547418624: Downloading 1 format(s): http-2176
[debug] Invoking downloader on "https://video.twimg.com/ext_tw_video/1470124003201273856/pu/vid/720x1060/KKdVKnZxI47Oihej.mp4?tag=12"
[download] Destination: Cassandra Webb - MUST WATCH - A man from #Kentucky lost his home after the #tornado Yet, here he sits at his piano playing the @Gaithermusic tune, “There’s Something About That Name.” The peace that p.mp4
[download] 100% of 5.97MiB in 00:01
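The `.200s` in the output template above behaves like the precision field in Python's printf-style formatting, which truncates a string to at most that many characters. The effect can be reproduced with the plain `%` operator. This is only an illustration of the formatting rule, not yt-dlp's implementation, and the title here is a stand-in value:

```python
# Stand-in for a long tweet title (assumption: any string over 200 characters).
title = 'x' * 300

template = '%(title).200s.%(ext)s'
filename = template % {'title': title, 'ext': 'mp4'}

# The title is cut to 200 characters, then '.mp4' is appended.
print(len(filename))
```

Note that the precision counts characters, not bytes, so a title with many non-ASCII characters may need a smaller value to stay under a byte-based limit.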