ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

[YouTube] Implement optional YouTube server-imposed throttling bypass #28859

Closed (leveled-up closed 2 years ago)

leveled-up commented 3 years ago

Description

First of all, thanks for your amazing software 👍🏻

Recently the following issue has been occurring increasingly often: in some cases when downloading from YouTube (especially YT Music), the server throttles the download speed to 58-60 KiB/s. (My internet connection, however, is stable and has a constant 4-6 MiB/s of bandwidth.) I cannot describe the pattern more precisely right now. It seems mostly random, but tied to the download URL in the extracted info_dict: --http-chunk-size doesn't solve the issue, whereas interrupting the download completely and restarting immediately (so the video is extracted again) solves it most of the time.

If nobody else can come up with another solution, I would recommend implementing a new CLI flag that will cause a check of the download speed against a threshold and will interrupt download and restart the extraction completely if the download speed falls under the threshold (for some amount of time?).
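
The check proposed above could look roughly like the following. This is only a sketch of the idea, not an existing youtube-dl feature: the class name ThrottleDetector, its parameters, and the injectable clock are all hypothetical, and the caller is assumed to feed it periodic speed samples and to restart extraction itself when it returns True.

```python
import time


class ThrottleDetector:
    """Flags a download whose speed stays under a threshold for too long."""

    def __init__(self, min_bytes_per_s, grace_s, clock=time.monotonic):
        self.min_bytes_per_s = min_bytes_per_s  # speeds below this count as slow
        self.grace_s = grace_s                  # how long slowness is tolerated
        self.clock = clock                      # injectable so it can be tested
        self.slow_since = None                  # when the current slow spell began

    def update(self, bytes_per_s):
        """Record one speed sample; return True if a restart is warranted."""
        now = self.clock()
        if bytes_per_s >= self.min_bytes_per_s:
            # Fast enough again: forget any earlier slow spell.
            self.slow_since = None
            return False
        if self.slow_since is None:
            self.slow_since = now
        return now - self.slow_since >= self.grace_s
```

A caller would create one detector per download, for example ThrottleDetector(100 * 1024, 10), call update() with each measured speed, and abort and re-extract the video once it returns True. Requiring the speed to stay low for a grace period avoids restarting on a brief dip.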

I would be incredibly grateful for a workaround! :heart: I don't feel comfortable enough in Python to implement one myself and submit a pull request... 😉

brentil commented 3 years ago

I've been experiencing this issue as well when getting my Watch Later videos using youtube-dl. I noticed my scripts were taking an entire day to run instead of minutes. I did a bunch of test runs and found that if I downloaded items rapidly at full speed, I would start to get throttled by YouTube. I tried a variety of settings and finally found that randomizing the time between downloads and capping the speed kept me from being throttled for the last month: --sleep-interval 120 --max-sleep-interval 300 --limit-rate 4M

Until today... I started getting some videos throttled again, but at least, unlike last time, it was a couple of them and not all of them. I've started expanding my minimum sleep interval to 200 to see if that helps.
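
For anyone driving youtube-dl from Python rather than from the command line, the same settings can be sketched as below. The helper names polite_options and download are hypothetical; the dictionary keys are the YoutubeDL parameters that, as far as I know, correspond to the three flags above, with the 4M rate limit written out in bytes per second.

```python
def polite_options(min_sleep_s=120, max_sleep_s=300,
                   rate_limit_bytes=4 * 1024 * 1024):
    """Build YoutubeDL options mirroring the command-line flags above."""
    return {
        'sleep_interval': min_sleep_s,      # --sleep-interval
        'max_sleep_interval': max_sleep_s,  # --max-sleep-interval
        'ratelimit': rate_limit_bytes,      # --limit-rate, in bytes per second
    }


def download(urls, options=None):
    """Download every URL in urls with the throttle-avoiding options."""
    # Imported lazily so polite_options() is usable without youtube_dl installed.
    import youtube_dl
    with youtube_dl.YoutubeDL(options or polite_options()) as ydl:
        ydl.download(urls)
```

Raising the minimum sleep to 200, as described above, would be polite_options(min_sleep_s=200).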

wajhk commented 3 years ago

On Monday, June 14, 2021 at 5:48 PM, Leith Tussing < @.***> wrote:

ewtoombs commented 2 years ago

I used to see this only about once in 10 attempts, but now every video I try to download with youtube-dl is throttled to about 80 KB/s. Changing my IP address doesn't have any effect. It looks to me as though throttling was an experimental feature, and when the experiment was a success, YouTube deployed it to all of their content servers.

There is hope though. Streaming on mpv doesn't work any more, but the same video will still stream just fine in-browser, proving there is a way of bypassing the throttling! youtube-dl just needs to pretend a little harder to be a real web browser.

ewtoombs commented 2 years ago

https://www.youtube.com/watch?v=djcrJQ26tyg

ewtoombs commented 2 years ago

But literally any video I try this with, I see this throttling lol. I see the same throttling when I'm only extracting audio from a video, as well.

ewtoombs commented 2 years ago

Interesting.

I'm not going to download a two-hour video in HD to prove a point, but just for testing, I am able to download the 240p version pretty fast:

PS C:\> youtube.exe -a -1 -v 242 djcrJQ26tyg
7% bytes=10000000-19999999 10.0 MB/s
14% bytes=20000000-29999999 10.0 MB/s
22% bytes=30000000-39999999 10.0 MB/s
29% bytes=40000000-49999999 10.0 MB/s
36% bytes=50000000-59999999 8.3 MB/s
44% bytes=60000000-69999999 8.6 MB/s
51% bytes=70000000-79999999 8.8 MB/s
58% bytes=80000000-89999999 8.0 MB/s
66% bytes=90000000-99999999 8.2 MB/s
73% bytes=100000000-109999999 8.3 MB/s
80% bytes=110000000-119999999 8.5 MB/s
88% bytes=120000000-129999999 8.6 MB/s
95% bytes=130000000-139999999 8.1 MB/s

https://github.com/89z/mech

So, to be clear, you are doing this download with mech, and not youtube-dl?

ewtoombs commented 2 years ago

This suggests that youtube has managed to specifically fingerprint youtube-dl usage patterns. Let me try mech to verify.

ewtoombs commented 2 years ago

I just tried mech. (I just made an AUR package for it :D) I observe no throttling.

ewtoombs commented 2 years ago

That is a good idea, @89z. I made an issue at mech for further measures. See https://github.com/89z/mech/issues/8.

ewtoombs commented 2 years ago

After more experimentation, and some very helpful hints from mech, I have been able to reproduce the speeds that mech is capable of. I have found that two things are necessary to bypass throttling:

  1. Use media links with c set to ANDROID instead of WEB in the query parameters. This would be easy, except that media links are signed, so they can't be edited. The only way I know of to get Android media links is the same way mech does it: getting the streamingData from POST /youtubei/v1/player instead of from var ytInitialPlayerResponse in GET /watch, and setting clientName to ANDROID (and setting a plausible clientVersion).
  2. Download the data in chunks of around 10MB or less using a range=<start>-<end> query parameter or a Range: header. And yes, this is what mech does lol.

The user agent is unimportant.

I've written a short python demo of the throttling bypass:

#!/usr/bin/env python

import json
import re

from bs4 import BeautifulSoup
from requests import Session


def api_key(response):
    """Return the INNERTUBE_API_KEY embedded in the /watch page HTML."""
    soup = BeautifulSoup(response, 'html.parser')
    for script_tag in soup.find_all('script'):
        script = script_tag.string
        if script is not None:
            match = re.search(r'"INNERTUBE_API_KEY":"([^"]+)"', script)
            if match is not None:
                return match.group(1)
    raise ValueError('INNERTUBE_API_KEY not found in page')

id = 'yiw6_JakZFc'

session = Session()
session.headers = {
    # This is to demonstrate how little the user agent matters, and also for
    # fun.
    'User-Agent': 'Fuck you, Google!',
}

# Hit the /watch endpoint, but we actually only want an API key lol.
response = session.get(
    'https://www.youtube.com/watch',
    params={'v': id},
).content.decode()
key = api_key(response)

# OK, now use the API key to get the actual streaming data.
post_data = {
    'context': {
        'client': {
            'clientName': 'ANDROID',
            'clientVersion': '16.05',
        },
    },
    'videoId': id,
}
data = json.loads(session.post(
    'https://www.youtube.com/youtubei/v1/player',
    params={'key': key},
    data=json.dumps(post_data),
).content)

for f in data['streamingData']['adaptiveFormats']:
    if 'height' in f and f['height'] == 720:
        print(f['url']+'&range=0-10000000')
        break

The URL that this program prints will download the first 10 megs of a 720p video without throttling.
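
The second requirement, fetching the media in pieces of about 10 MB, can be sketched with the standard library alone. The names CHUNK_SIZE, byte_ranges and download_in_chunks are hypothetical, and total_size is assumed to be known in advance (for example from a content length reported alongside the format); this is an illustration of Range requests, not what mech or any youtube-dl fix actually does.

```python
import urllib.request

CHUNK_SIZE = 10_000_000  # bytes per request, the size seen in the mech log above


def byte_ranges(total_size, chunk_size=CHUNK_SIZE):
    """Yield inclusive (start, end) byte offsets covering total_size bytes."""
    for start in range(0, total_size, chunk_size):
        yield start, min(start + chunk_size, total_size) - 1


def download_in_chunks(url, total_size, out_path, chunk_size=CHUNK_SIZE):
    """Fetch url piece by piece with Range headers and write it to out_path."""
    with open(out_path, 'wb') as out:
        for start, end in byte_ranges(total_size, chunk_size):
            request = urllib.request.Request(
                url, headers={'Range': 'bytes=%d-%d' % (start, end)})
            with urllib.request.urlopen(request) as response:
                out.write(response.read())
```

Each request asks for one bounded range, so no single response exceeds the chunk size; the last range is simply shorter when the file size is not a multiple of it.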

ewtoombs commented 2 years ago

I implemented the fix in my youtube client.

porg commented 2 years ago

In the past I always got full download speed via youtube-dl. Today, when I used youtube-dl again after 1-2 years of not having used it, I was consistently throttled to ca. 50 KB/s, even after various stops and resumes. I use youtube-dl 2021.12.17 installed via brew on macOS 11.6.6.

I guess a good way to bypass throttling is to download not anonymously but as a logged-in Google/YouTube user. Making cookies as easy and user-friendly to use as possible would therefore very likely help most users overcome throttling. --cookies is rather for technical users and requires much manual labor and extra software. --cookies-from-browser <MyPreferredBrowser>, on the other hand, would be very convenient and practical. Hence pull request https://github.com/ytdl-org/youtube-dl/pull/29201 should perhaps be expedited in order to solve the throttling issue.

porg commented 2 years ago

I confirm that yt-dlp 2022.5.18, in comparison to youtube-dl 2021.12.17, successfully bypasses the YouTube throttle!

ewtoombs commented 2 years ago

Looks like this was about to get fixed here: https://github.com/dirkf/youtube-dl/pull/6 Then, nothing happened. Also, looks like this is a duplicate of https://github.com/ytdl-org/youtube-dl/issues/29326 .

Elmar1991 commented 2 years ago

Hello, did you solve the problem? And how did you do that?

dirkf commented 2 years ago

The following merged PRs fix this issue:

To benefit from these you have to install from the git master until the useless sob maintainer makes a new release.

gamer191 commented 2 years ago

To benefit from these you have to install from the git master until the useless sob maintainer makes a new release

I'm dying 🤣

gamer191 commented 2 years ago

the useless sob maintainer

Legend has it he's the worst GitHub maintainer since that Gamer191 guy, who tried to create a repository whose sole purpose was to host a useless yt-dlp git diff