blackjack4494 / yt-dlc

media downloader and library for various sites.
The Unlicense

[Broken] YouTube: a very small number of videos consistently fail to download #264

Open jbruchon opened 3 years ago

jbruchon commented 3 years ago

Verbose log

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-f', '137+251', '--cookies', 'Y:\\cookies.txt', '--add-metadata', '--write-description', '--write-info-json', '--write-thumbnail', '--download-archive', 'Y:\\ytdl_archive.txt', '--all-subs', '-ciw', '-o', '%(title)s.%(ext)s', '-v', 'https://www.youtube.com/watch?v=LIdZ2oPyB1Y']
[debug] Loading archive file 'Y:\\ytdl_archive.txt'
[debug] Encodings: locale cp1252, fs utf-8, out utf-8, pref cp1252
[debug] youtube-dlc version 2020.11.11-2
[debug] Python version 3.9.0 (CPython) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg git-2020-04-15-51db0a4, ffprobe 4.3.1-2020-10-01-full_build-www.gyan.dev
[debug] Proxy map: {}
[youtube] LIdZ2oPyB1Y: Downloading webpage
[info] Video description is already present
[info] Video subtitle en.vtt is already present
[info] Video subtitle tr.vtt is already present
[info] Video description metadata is already present
[youtube] LIdZ2oPyB1Y: Thumbnail is already present
WARNING: Requested formats are incompatible for merge and will be merged into mkv.
[debug] Invoking downloader on 'https://r5---sn-hp57kn7e.googlevideo.com/videoplayback?expire=1606634316&ei=6_bCX8bLNuuDzLUP5pi8oAM&ip=77.81.142.124&id=o-ABxbQ79-SQUNxEEobMptDUyFmzaAcyduZOh0portJKj-&itag=137&aitags=133%2C134%2C135%2C136%2C137%2C160%2C242%2C243%2C244%2C247%2C248%2C278&source=youtube&requiressl=yes&mh=wM&mm=31%2C29&mn=sn-hp57kn7e%2Csn-hp57yne7&ms=au%2Crdu&mv=m&mvi=5&pl=24&initcwndbps=633750&vprv=1&mime=video%2Fmp4&ns=KsUGVusZhmzwLjUH87Cn8yEF&gir=yes&clen=249849935&dur=1342.720&lmt=1572754520057451&mt=1606612362&fvip=5&keepalive=yes&c=WEB&txp=1306222&n=eRfKKH3XA4B5i7zHf36&sparams=expire%2Cei%2Cip%2Cid%2Caitags%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Cgir%2Cclen%2Cdur%2Clmt&sig=AOq0QJ8wRQIgCkzbEg6HpwujHV6Zu9gFcCZ9rPEBjcJ3JK-iZWLrQVYCIQCZoM-u9Slf2GXAjgqhOhsTM0H-fcEx2C6NpT-Ng1sUxQ%3D%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRQIgZH75_E7jj_TWqgwOwjw5wGYRqZiq-9THO7z7BrBJHGsCIQCS3s_ppZN__VE--XDQFihwQ5A_saofyzFdOcPNBlEdBQ%3D%3D&ratebypass=yes'
ERROR: unable to download video data: <urlopen error [WinError 10051] A socket operation was attempted to an unreachable network>
Traceback (most recent call last):
  File "urllib\request.py", line 1342, in do_open
  File "http\client.py", line 1255, in request
  File "http\client.py", line 1301, in _send_request
  File "http\client.py", line 1250, in endheaders
  File "http\client.py", line 1010, in _send_output
  File "http\client.py", line 950, in send
  File "http\client.py", line 1417, in connect
  File "http\client.py", line 921, in connect
  File "socket.py", line 843, in create_connection
  File "socket.py", line 831, in create_connection
OSError: [WinError 10051] A socket operation was attempted to an unreachable network

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "youtube_dlc\YoutubeDL.py", line 2014, in process_info
  File "youtube_dlc\YoutubeDL.py", line 1875, in dl
  File "youtube_dlc\downloader\common.py", line 375, in download
  File "youtube_dlc\downloader\http.py", line 349, in real_download
  File "youtube_dlc\downloader\http.py", line 114, in establish_connection
  File "youtube_dlc\downloader\http.py", line 110, in establish_connection
  File "youtube_dlc\YoutubeDL.py", line 2325, in urlopen
  File "urllib\request.py", line 517, in open
  File "urllib\request.py", line 534, in _open
  File "urllib\request.py", line 494, in _call_chain
  File "youtube_dlc\utils.py", line 2736, in https_open
  File "urllib\request.py", line 1345, in do_open
urllib.error.URLError: <urlopen error [WinError 10051] A socket operation was attempted to an unreachable network>

Description

When I archive whole channels, some videos may fail to download due to copyright claims, but there are a few outliers like this one that work fine on YouTube and fail in yt-dlc. This might happen to one out of every 2000 videos I archive, but these same videos are failing the same way for multiple weeks across reboots and new program versions. I noticed that Waterfox seems to be using different servers than yt-dlc for some reason; it seems that the servers yt-dlc is picking up don't actually work. Here's a sample of the network traffic dev console panel for that video:

[image: screenshot of the browser dev console network panel for that video]

Also, there does appear to be one mention of the server yt-dlc is trying in the list; it shows 0 bytes transferred.

Vangelis66 commented 3 years ago

but there are a few outliers like this one that work fine on YouTube and fail in yt-dlc. This might happen to one out of every 2000 videos I archive, but these same videos are failing the same way for multiple weeks across reboots and new program versions.

I know this isn't actually helping you in any way, but linked video downloads fine here:

youtube-dlc -f 137+140 "LIdZ2oPyB1Y" -v > Log.txt 2>&1 =>

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--ffmpeg-location', '..\\ffmpeg.exe', '-f', '137+140', 'LIdZ2oPyB1Y', '-v']
[debug] Loading archive file None
[debug] Encodings: locale cp1253, fs utf-8, out cp1253, pref cp1253
[debug] youtube-dlc version 2020.11.27
[debug] Python version 3.7.9 (CPython) - Windows-Vista-6.0.6003-SP2
[debug] exe versions: ffmpeg N-97309-g4e0cf81b49, ffprobe N-97309-g4e0cf81b49
[debug] Proxy map: {}
[youtube] LIdZ2oPyB1Y: Downloading webpage
[debug] Invoking downloader on 'https://r1---sn-4vguioxu-n3bl.googlevideo.com/videoplayback?expire=1606636110&ei=7v3CX-jJB8Oj-gb43onwAw&ip=<snipped>&id=o-AAaI3ZAJCAG8EJ5cpRbKnsYlwh0FJ9ktlfybGDJ6UqcC&itag=137&aitags=133%2C134%2C135%2C136%2C137%2C160%2C242%2C243%2C244%2C247%2C248%2C278&source=youtube&requiressl=yes&mh=wM&mm=31%2C29&mn=sn-4vguioxu-n3bl%2Csn-nv47lnsk&ms=au%2Crdu&mv=m&mvi=1&pl=18&initcwndbps=568750&vprv=1&mime=video%2Fmp4&ns=3Rsi83wvUxNnapvDgnFjRYwF&gir=yes&clen=249849935&dur=1342.720&lmt=1572754520057451&mt=1606614043&fvip=5&keepalive=yes&beids=9466585&c=WEB&txp=1306222&n=L0PDZmJemglMqzD9Mnl&sparams=expire%2Cei%2Cip%2Cid%2Caitags%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Cgir%2Cclen%2Cdur%2Clmt&sig=AOq0QJ8wRAIgfDmeDn_iOq3KKLlFb_iDfNgo4nBpaKYZgEKu_QPdXB8CIBJrV3pvMl4R6AYgMX8lVMwu7KE149x8L2SM4nnk6C70&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRQIhAO7BpKKKEGzLOZcZ_yUqEsd89ECtTOL89VdxDwrWaH22AiA77MyQ5wDfqLibmjbHiwX_HmZho7FTZFp_pmVbS_2rlQ%3D%3D&ratebypass=yes'
[download] Destination: YouTube BANNING Hacking Videos - Hot Take-LIdZ2oPyB1Y.f137.mp4

[download]   0.0% of 238.28MiB at 124.99KiB/s ETA 32:32
[download]   0.0% of 238.28MiB at 374.96KiB/s ETA 10:50
[download]   0.0% of 238.28MiB at 777.69KiB/s ETA 05:13
[download]   0.0% of 238.28MiB at  1.63MiB/s ETA 02:26 
<redacted for brevity>
[download]  99.4% of 238.28MiB at  1.09MiB/s ETA 00:01  
[download] 100.0% of 238.28MiB at  1.15MiB/s ETA 00:00  
[download] 100.0% of 238.28MiB at  1.16MiB/s ETA 00:00  
[download] 100% of 238.28MiB in 02:51                   
[debug] Invoking downloader on 'https://r1---sn-4vguioxu-n3bl.googlevideo.com/videoplayback?expire=1606636110&ei=7v3CX-jJB8Oj-gb43onwAw&ip=<snipped>&id=o-AAaI3ZAJCAG8EJ5cpRbKnsYlwh0FJ9ktlfybGDJ6UqcC&itag=140&source=youtube&requiressl=yes&mh=wM&mm=31%2C29&mn=sn-4vguioxu-n3bl%2Csn-nv47lnsk&ms=au%2Crdu&mv=m&mvi=1&pl=18&initcwndbps=568750&vprv=1&mime=audio%2Fmp4&ns=3Rsi83wvUxNnapvDgnFjRYwF&gir=yes&clen=21732228&dur=1342.786&lmt=1572754359851374&mt=1606614043&fvip=5&keepalive=yes&beids=9466585&c=WEB&txp=1301222&n=L0PDZmJemglMqzD9Mnl&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Cgir%2Cclen%2Cdur%2Clmt&sig=AOq0QJ8wRQIgar_Jx47znDOhQKQKyGyq1oUYfbaG5UGfpAARG-u3a3ICIQDHK9WbHOzKT6qi9A50_fiaXzH9E2oWH1P3LedI_Vm6sQ%3D%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRQIhAO7BpKKKEGzLOZcZ_yUqEsd89ECtTOL89VdxDwrWaH22AiA77MyQ5wDfqLibmjbHiwX_HmZho7FTZFp_pmVbS_2rlQ%3D%3D&ratebypass=yes'
[download] Destination: YouTube BANNING Hacking Videos - Hot Take-LIdZ2oPyB1Y.f140.m4a

[download]   0.0% of 20.73MiB at 19.61KiB/s ETA 18:02
[download]   0.0% of 20.73MiB at 58.82KiB/s ETA 06:00
[download]   0.0% of 20.73MiB at 134.60KiB/s ETA 02:37
[download]   0.1% of 20.73MiB at 288.43KiB/s ETA 01:13
<redacted for brevity> 
[download]  98.3% of 20.73MiB at  1.35MiB/s ETA 00:00 
[download] 100.0% of 20.73MiB at  1.45MiB/s ETA 00:00 
[download] 100% of 20.73MiB in 00:14                  
[ffmpeg] Merging formats into "YouTube BANNING Hacking Videos - Hot Take-LIdZ2oPyB1Y.mp4"
[debug] ffmpeg command line: "D:\<snipped>\ffmpeg" -y -loglevel "repeat+info" -i "file:YouTube BANNING Hacking Videos - Hot Take-LIdZ2oPyB1Y.f137.mp4" -i "file:YouTube BANNING Hacking Videos - Hot Take-LIdZ2oPyB1Y.f140.m4a" -c copy -map "0:v:0" -map "1:a:0" "file:YouTube BANNING Hacking Videos - Hot Take-LIdZ2oPyB1Y.temp.mp4"
Deleting original file YouTube BANNING Hacking Videos - Hot Take-LIdZ2oPyB1Y.f140.m4a (pass -k to keep)
Deleting original file YouTube BANNING Hacking Videos - Hot Take-LIdZ2oPyB1Y.f137.mp4 (pass -k to keep)

Both elementary streams are being served from a r1---sn-4vguioxu-n3bl.googlevideo.com domain...

one mention of the server yt-dlc is trying in the list; it shows 0 bytes transferred.

And that's because your browser can't establish a secure connection to r5---sn-hp57kn7e.googlevideo.com. I suggest you try to find out why that is (modified hosts file? firewall rules? AV suite? TLS/cipher issues?). Can you ping that server?

Microsoft Windows [Version 6.0.6003]
Copyright (c) 2006 Microsoft Corporation.  All rights reserved.

C:\Windows>ping r5---sn-hp57kn7e.googlevideo.com

Pinging r5.sn-hp57kn7e.googlevideo.com [209.85.231.11] with 32 bytes of data:
Reply from 209.85.231.11: bytes=32 time=174ms TTL=119
Reply from 209.85.231.11: bytes=32 time=173ms TTL=119
Reply from 209.85.231.11: bytes=32 time=175ms TTL=119
Reply from 209.85.231.11: bytes=32 time=175ms TTL=119

Ping statistics for 209.85.231.11:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 173ms, Maximum = 175ms, Average = 174ms

C:\Windows>

If addresses in your IP range have been bulk-downloading from that node, it may have blocked them; have you checked whether a different IP (e.g. via a VPN or an SSH tunnel) works for that specific googlevideo domain?
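Note that ping only exercises ICMP, while the traceback in the report fails inside socket.create_connection, i.e. at TCP connect time. A minimal sketch for checking TCP reachability of each resolved address separately is shown below. The function name probe_addresses is hypothetical and not part of yt-dlc; port 443 is assumed because the failing URL is https.

```python
import socket


def probe_addresses(host, port=443, timeout=5.0):
    """Try a TCP connect to every address the resolver returns for host.

    Returns a list of (family_name, ip, error) tuples; error is None when
    the connection succeeded, otherwise the OS error message.
    Hypothetical diagnostic helper, not part of yt-dlc.
    """
    results = []
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in infos:
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            error = None
        except OSError as exc:
            # Covers timeouts, refused connections and unreachable networks.
            error = str(exc)
        finally:
            sock.close()
        family_name = socket.AddressFamily(family).name
        results.append((family_name, sockaddr[0], error))
    return results
```

Calling probe_addresses("r5---sn-hp57kn7e.googlevideo.com") on the affected machine would show whether any particular address (for example, one address family) is the one producing the "unreachable network" error.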

it seems that the servers yt-dlc is picking up don't actually work.

How do you propose this be fixed, if it is a yt-dlc issue at all? Is there logic inside the youtubecomIE to try a different CDN node if the first chosen one fails to connect? Since your browser also can't connect to the culprit YouTube CDN node, it would appear that a more general connection issue is affecting your particular environment...

jbruchon commented 3 years ago

If the browser switches servers on connection failure, the program should do so as well. I appreciate the effort you put into your response, but the specifics of my environment are largely irrelevant since the video DOES work in multiple browsers from the same machine.
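The fallback idea could be sketched roughly as follows. This is only an illustration under loud assumptions: that the comma-separated mn query parameter seen in the logged URL (mn=sn-hp57kn7e%2Csn-hp57yne7) names interchangeable CDN nodes, and that swapping the node name in the hostname yields a working URL. Neither is confirmed anywhere in this thread, and the function names are hypothetical, not yt-dlc APIs.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def candidate_urls(url):
    """Return url followed by variants for the other nodes listed in `mn`.

    ASSUMPTION: `mn` lists alternative nodes and the hostname embeds the
    current node name. Unverified; for illustration only.
    """
    parts = urlsplit(url)
    nodes = parse_qs(parts.query).get("mn", [""])[0].split(",")
    current = next((n for n in nodes if n and n in parts.netloc), None)
    urls = [url]
    if current is None:
        return urls
    for node in nodes:
        if node and node != current:
            host = parts.netloc.replace(current, node, 1)
            urls.append(urlunsplit(parts._replace(netloc=host)))
    return urls


def open_first_reachable(urls, opener):
    """Call opener(url) for each candidate and return the first success.

    OSError is caught because urllib.error.URLError derives from it; note
    that HTTPError does too, so HTTP errors also trigger the fallback here.
    """
    last_error = None
    for url in urls:
        try:
            return opener(url)
        except OSError as exc:
            last_error = exc
    raise last_error
```

Here opener stands in for whatever actually performs the request (for instance urllib.request.urlopen); it is passed in so the retry logic stays independent of the download code.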

pukkandan commented 3 years ago

Probably related issue: #258

jbruchon commented 3 years ago

@pukkandan I've had those come up before too.

@Vangelis66

C:\Users\Owner>ping r5---sn-hp57kn7e.googlevideo.com

Pinging r5.sn-hp57kn7e.googlevideo.com [209.85.231.11] with 32 bytes of data:
Reply from 209.85.231.11: bytes=32 time=58ms TTL=123
Reply from 209.85.231.11: bytes=32 time=55ms TTL=123
Reply from 209.85.231.11: bytes=32 time=55ms TTL=123
Reply from 209.85.231.11: bytes=32 time=59ms TTL=123

Ping statistics for 209.85.231.11:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 55ms, Maximum = 59ms, Average = 56ms

Vangelis66 commented 3 years ago

Probably related issue: https://github.com/blackjack4494/yt-dlc/issues/258

Nice catch! But the errors printed are somewhat different:

https://github.com/blackjack4494/yt-dlc/issues/264:

[debug] Invoking downloader on 'https://r5---sn-hp57kn7e.googlevideo.com/videoplayback?expire=1606634316&ei=6_bCX8bLNuuDzLUP5pi8oAM&ip=77.81.142.124&id=o-ABxbQ79-SQUNxEEobMptDUyFmzaAcyduZOh0portJKj-&itag=137&aitags=133%2C134%2C135%2C136%2C137%2C160%2C242%2C243%2C244%2C247%2C248%2C278&source=youtube&requiressl=yes&mh=wM&mm=31%2C29&mn=sn-hp57kn7e%2Csn-hp57yne7&ms=au%2Crdu&mv=m&mvi=5&pl=24&initcwndbps=633750&vprv=1&mime=video%2Fmp4&ns=KsUGVusZhmzwLjUH87Cn8yEF&gir=yes&clen=249849935&dur=1342.720&lmt=1572754520057451&mt=1606612362&fvip=5&keepalive=yes&c=WEB&txp=1306222&n=eRfKKH3XA4B5i7zHf36&sparams=expire%2Cei%2Cip%2Cid%2Caitags%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Cgir%2Cclen%2Cdur%2Clmt&sig=AOq0QJ8wRQIgCkzbEg6HpwujHV6Zu9gFcCZ9rPEBjcJ3JK-iZWLrQVYCIQCZoM-u9Slf2GXAjgqhOhsTM0H-fcEx2C6NpT-Ng1sUxQ%3D%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRQIgZH75_E7jj_TWqgwOwjw5wGYRqZiq-9THO7z7BrBJHGsCIQCS3s_ppZN__VE--XDQFihwQ5A_saofyzFdOcPNBlEdBQ%3D%3D&ratebypass=yes'
ERROR: unable to download video data: <urlopen error [WinError 10051] A socket operation was attempted to an unreachable network>

https://github.com/blackjack4494/yt-dlc/issues/258:

[debug] Invoking downloader on 'https://r4---sn-uxaxovg-vnae7.googlevideo.com/videoplayback?expire=1606380694&ei=Nhi_X9WQI5G8yQX85KmgDw&ip=82.164.193.144&id=o-AIuH8ge2YKQcxn5OSQfwBIeY8A_2agc2r5r1NCYeZV8g&itag=248&aitags=133%2C134%2C135%2C136%2C137%2C160%2C242%2C243%2C244%2C247%2C248%2C278&source=yt_otf&requiressl=yes&mh=8P&mm=31%2C29&mn=sn-uxaxovg-vnae7%2Csn-5goeen7d&ms=au%2Crdu&mv=m&mvi=4&pl=16&initcwndbps=1718750&vprv=1&mime=video%2Fwebm&ns=qwoo8gNZ6-NLdjn6Qtp16BoF&otf=1&otfp=1&dur=879.111&lmt=1489552536986074&mt=1606358914&fvip=4&keepalive=yes&c=WEB&n=FDlsYvWQOWr4SlRWYsV&sparams=expire%2Cei%2Cip%2Cid%2Caitags%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Cotf%2Cotfp%2Cdur%2Clmt&sig=AOq0QJ8wRgIhAPqHQMisZgQLvQsPcFhPJTWCnMs6fra3CwmXp3Bwv7miAiEAyFmdM5tGBbBfwO8aPF9QKk1xqVazEmLuGtLuKFNmZf4%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRQIgdELB3vcaPYaBonZW_z5iuCctO62Opd-ojBb8RUgWT0ACIQDIlTERjRsWIOjLIJEqhGcD5HO01ndPQpcM6fXjm17MUQ%3D%3D&ratebypass=yes'
ERROR: unable to download video data: HTTP Error 404: Not Found

i.e. <urlopen error [WinError 10051] A socket operation was attempted to an unreachable network> vs. HTTP Error 404: Not Found

which, to me at least, means in the latter case the connection is being established to the node, but that then returns a 404 response; in this issue, yt-dlc doesn't even manage to connect at all (???)
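In Python terms, the two cases surface as different exception types from urllib, so they can be told apart programmatically. A minimal sketch follows; classify_failure and its return labels are hypothetical names chosen for illustration.

```python
import urllib.error


def classify_failure(exc):
    """Label a urllib failure as an HTTP-level or a connection-level error.

    HTTPError is a subclass of URLError, so it has to be tested first.
    """
    if isinstance(exc, urllib.error.HTTPError):
        # The server was reached and answered with a status code (e.g. 404).
        return "http-%d" % exc.code
    if isinstance(exc, urllib.error.URLError):
        # No HTTP response at all: DNS, routing or TCP/TLS setup failed.
        return "connect-failed"
    return "other"
```

An HTTPError with code 404 would map to "http-404", while a URLError wrapping the WinError 10051 OSError would map to "connect-failed".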

blackjack4494 commented 3 years ago

This is odd. It works fine for me, but I have seen other people having similar problems.
Okay, I tried some URLs multiple times; the error is not consistent for me, but it sometimes pops up.

jbruchon commented 3 years ago

The thing that bugs me is that this happens on the same few videos across the hundreds of channel archives I now maintain; they simply never download successfully. Each time, the download jams up for about a minute (TCP connection establishment timeouts, I assume) before the error is thrown and the run moves on. I could use other tools to download these videos if I had the patience to collect their IDs and fetch them manually. The bigger problem is that, when this occurs, it prevents me from downloading every video on a channel, so I never get a truly complete archive, and it persists no matter what I do network-wise on my end.
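Collecting the IDs of the failing videos could be automated from a saved log. The sketch below assumes the log format shown earlier in this thread, where a "[youtube] <id>: ..." line precedes any "ERROR:" line for that video, and that IDs are 11 characters long; failed_video_ids is a hypothetical helper, not a yt-dlc feature.

```python
import re

# Matches lines such as "[youtube] LIdZ2oPyB1Y: Downloading webpage".
_ID_LINE = re.compile(r"^\[youtube\] ([0-9A-Za-z_-]{11}): ")


def failed_video_ids(log_lines):
    """Return the IDs whose log section contains an ERROR line, in order.

    ASSUMPTION: an ERROR line belongs to the most recent "[youtube] <id>:"
    line above it. Each ID is reported once.
    """
    failed = []
    current = None
    for line in log_lines:
        match = _ID_LINE.match(line)
        if match:
            current = match.group(1)
        elif line.startswith("ERROR:") and current and current not in failed:
            failed.append(current)
    return failed
```

Fed the lines of a log captured with -v and -i, this would yield a list of IDs that could then be retried separately or handed to another tool.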