yt-dlp / yt-dlp

A feature-rich command-line audio/video downloader
https://discord.gg/H5MNcFW63r
The Unlicense

[Vimeo] HTTP Error 429/403 When Using Impersonate Target #10422

Open levis-ineptias opened 4 months ago

levis-ineptias commented 4 months ago

Region

Australia

Provide a description that is worded well enough to be understood

I am downloading videos from a page that embeds Vimeo videos. I was able to download successfully until a few minutes ago. Now, when I use the same command that I have been using all along, I get the error:

Got HTTP Error 429 when using impersonate target "chrome-110:windows-10". If you are using a data center IP or VPN/proxy, your IP may be blocked; please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U

Has my IP been blocked by Vimeo? I have downloaded about 10-15 videos in the last 48 hours. I am able to normally access the webpage that embeds these videos on Chrome, and play these videos.

I would really appreciate any help or advice as I really want to download these videos. Thank you!

Provide verbose output that clearly demonstrates the problem

Complete Verbose Output

[debug] Command-line config: ['-vU', '-F', '--referer', 'https://www.jkyog.org/portal/spiritual-retreat-family-camp', 'https://player.vimeo.com/video/971555691']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2024.07.09 from yt-dlp/yt-dlp [7ead7332a] (pip)
[debug] Python 3.12.4 (CPython arm64 64bit) - macOS-14.5-arm64-arm-64bit (OpenSSL 3.3.1 4 Jun 2024)
[debug] exe versions: ffmpeg 7.0.1 (setts), ffprobe 7.0.1
[debug] Optional libraries: Cryptodome-3.20.0, brotli-1.1.0, certifi-2024.07.04, curl_cffi-0.5.10, mutagen-1.47.0, requests-2.32.3, sqlite3-3.46.0, urllib3-2.2.2, websockets-12.0
[debug] Proxy map: {}
[debug] Request Handlers: urllib, requests, websockets, curl_cffi
[debug] Loaded 1834 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: stable@2024.07.09 from yt-dlp/yt-dlp
yt-dlp is up to date (stable@2024.07.09 from yt-dlp/yt-dlp)
[vimeo] Extracting URL: https://player.vimeo.com/video/971555691
[vimeo] 971555691: Downloading webpage
ERROR: [vimeo] 971555691: Got HTTP Error 429 when using impersonate target "chrome-110:windows-10". If you are using a data center IP or VPN/proxy, your IP may be blocked; please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
  File "/Users/username-hidden/.local/pipx/venvs/yt-dlp/lib/python3.12/site-packages/yt_dlp/extractor/common.py", line 740, in extract
    ie_result = self._real_extract(url)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username-hidden/.local/pipx/venvs/yt-dlp/lib/python3.12/site-packages/yt_dlp/extractor/vimeo.py", line 859, in _real_extract
    raise ExtractorError(
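
For scripts that wrap yt-dlp, the status code and impersonate target can be pulled out of an error line like the one above. This is a minimal sketch using only the standard library. The function name `parse_impersonate_error` and the regular expression are illustrative assumptions based solely on the message text quoted in this issue; they are not part of any yt-dlp API, and other versions may word the message differently.

```python
import re

# Pattern derived only from the error text quoted in this issue (assumption).
_ERROR_RE = re.compile(
    r'Got HTTP Error (?P<status>\d{3}) when using impersonate target "(?P<target>[^"]+)"'
)


def parse_impersonate_error(line):
    """Return (status_code, impersonate_target), or None if the line does not match."""
    match = _ERROR_RE.search(line)
    if match is None:
        return None
    return int(match.group("status")), match.group("target")


sample = (
    'ERROR: [vimeo] 971555691: Got HTTP Error 429 when using impersonate '
    'target "chrome-110:windows-10". If you are using a data center IP or '
    'VPN/proxy, your IP may be blocked'
)
print(parse_impersonate_error(sample))
```
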
credo99 commented 3 months ago

The only way to use yt-dlp on BOTH embedded and non-embedded Vimeo videos (this is a recent blow to yt-dlp by Vimeo) is to get access to the m3u8 playlist on the API end of Vimeo and use it with yt-dlp. The current blockages implemented by Vimeo are:

1. Blocking API access to requests that do not come from a web-browser session (401/429...all the mayhem)

2. Implementing a heartbeat service that runs inside the browser session of the playing video (it may not be mandatory to keep the video playing in the browser until the yt-dlp download is done, but it may at least need to be started for 1 second before the browser window is closed)

3. Monitoring the number of requests and the intervals between requests per IP, and blocking the IP when the threshold is exceeded

So...a lot of tasks to the devs here, keeping the fingers crossed...:)
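
The third point suggests that pacing requests on the client side might matter. Below is a minimal sketch of a capped exponential backoff schedule that a wrapper script could sleep through after receiving a 429. The function name and the default values are hypothetical, and nothing in this thread confirms what thresholds Vimeo actually applies.

```python
def backoff_delays(retries, base=5.0, factor=2.0, cap=300.0):
    """Return the wait time in seconds before each retry, never exceeding `cap`."""
    return [min(cap, base * factor ** attempt) for attempt in range(retries)]


# Each retry waits twice as long as the previous one, up to the cap.
print(backoff_delays(5))
```

yt-dlp itself also has a `--sleep-requests SECONDS` option for spacing out requests during extraction; whether that is enough to avoid the block described here is untested.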

rafaelfndev commented 3 months ago

> The only way to use yt-dlp on BOTH (this a recent blow to yt-dlp by vimeo) embeded and non-embeded videos on vimeo is to get access to the m3u8 playlist on the api end of vimeo and use it with yt-dlp. The current blockages implemented by vimeo are:
>
> 1. Blocking api access to requests that do not come from a web-browser session (401/429...all the mayhem)
>
> 2. Implementing a heartbeat service that runs inside the browser session of the playing video (It may be possible that it is not mandatory to have the video continuously playing in browser until the yt-dlp download is done but at least to start it at least for 1 second and then close the browser window)
>
> 3. Monitoring number of requests and timeouts between requests per each IP and blocking them when the threshold is overpassed
>
> So...a lot of tasks to the devs here, keeping the fingers crossed...:)

When I access a page with an embedded video player, IDM (Internet Download Manager) allows me to download the video. Maybe this can help in understanding how to download videos or generate a valid link.

When the page loads, it shows this iframe:

<iframe src="https://player.vimeo.com/video/833019403?badge=0&amp;autopause=0&amp;player_id=0&amp;app_id=58479" frameborder="0" allow="autoplay; fullscreen; setPlaybackRate; picture-in-picture" allowfullscreen="" title="Aula 1_Adjetivos, Pronomes e o Verbo “to be”_VIDEO" name="fitvid0" __idm_id__="434177"></iframe>

IDM generates this link:

https://vod-adaptive-ak.vimeocdn.com/exp=1723917977~acl=%2Fe767af00-ed04-4d0a-ab3c-ab34f681dc8b%2F%2A~hmac=0adee31c895b8d5d1d696787a5e3995e4c07563b6a0f4f0f553c5e407f4e05ac/e767af00-ed04-4d0a-ab3c-ab34f681dc8b/v2/playlist/av/primary/playlist.json?pathsig=8c953e4f~bCMtcQ7W4aDN6d7_m3ZypGLhJoC6cStGlHhB75J3PeU&qsr=1&rh=1jLyAG
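
The link appears to carry a short-lived access token in its first path segment. Here is a minimal sketch that splits it apart, assuming (this is not confirmed anywhere in the thread) that `exp` is a Unix expiry timestamp and `acl` is a URL-encoded path pattern. `parse_cdn_token` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from urllib.parse import unquote, urlsplit


def parse_cdn_token(url):
    """Split the first path segment ("key=value~key=value~...") into a dict."""
    first_segment = urlsplit(url).path.lstrip("/").split("/", 1)[0]
    fields = dict(pair.split("=", 1) for pair in first_segment.split("~"))
    fields["acl"] = unquote(fields["acl"])  # decode %2F and %2A
    return fields


# The link from this comment, with the query string left off for brevity.
url = (
    "https://vod-adaptive-ak.vimeocdn.com/"
    "exp=1723917977~acl=%2Fe767af00-ed04-4d0a-ab3c-ab34f681dc8b%2F%2A"
    "~hmac=0adee31c895b8d5d1d696787a5e3995e4c07563b6a0f4f0f553c5e407f4e05ac"
    "/e767af00-ed04-4d0a-ab3c-ab34f681dc8b/v2/playlist/av/primary/playlist.json"
)

token = parse_cdn_token(url)
# Assumption: exp is seconds since the Unix epoch, interpreted as UTC.
expires = datetime.fromtimestamp(int(token["exp"]), tz=timezone.utc)
print(token["acl"], expires.isoformat())
```

If that reading of `exp` is right, a link like this stops working after the stated time, which would explain why it cannot simply be saved and reused later.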

And it shows this as the reference:

https://player.vimeo.com/video/833019403?badge=0&autopause=0&player_id=0&app_id=58479

And the download works normally in IDM.

The video file is OK.

This is an example of a restricted video; if I access the URL directly, it doesn't work.

frifix commented 3 months ago

@rafaelfndev It sounds like you're not using the --referer flag in yt-dlp when downloading a private video. Personally I cannot download a private video without this flag.

bashonly commented 3 months ago

IDM extracts the direct download link from the website data that your browser has already loaded, so it is not really relevant to solving any yt-dlp problems

credo99 commented 3 months ago

> IDM extracts the direct download link from the website data that your browser has already loaded, so it is not really relevant to solving any yt-dlp problems

Absolutely correct. You can achieve the same with any browser plugin that grabs Vimeo videos: it extracts the playlist information from the browser session, and then one can pass the playlist to yt-dlp. For a while I thought the only solution might be to have yt-dlp start a headless browser in the same folder as the yt-dlp executable, grab the playlist, and then use it for the download... a big programming hassle anyway...

rafaelfndev commented 3 months ago

> @rafaelfndev It sounds like you're not using the --referer flag in yt-dlp when downloading a private video. Personally I cannot download a private video without this flag.

It worked!

yt-dlp --referer https://example.com/allowed-website-referer https://player.vimeo.com/video/833019403

On closer inspection, it was a noob mistake 🤦‍♀️. I'm using the node-website-scraper library to clone an entire website, and I do several validations and pass several parameters to node-website-scraper before calling yt-dlp. After saving the entire website, I forgot that the parameters passed to node-website-scraper are not passed on to yt-dlp (my big mistake), so it didn't work.

Thanks for shedding some light on this.
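
For wrapper scripts like the one described in this comment, the referer has to be forwarded to yt-dlp explicitly. Below is a minimal sketch that assembles the same command line from Python. `build_ytdlp_command` is a hypothetical helper, and it only builds the argument list rather than running yt-dlp.

```python
import shlex


def build_ytdlp_command(video_url, referer):
    """Build the yt-dlp argument list, forwarding the embedding page as the referer."""
    return ["yt-dlp", "--referer", referer, video_url]


cmd = build_ytdlp_command(
    "https://player.vimeo.com/video/833019403",
    "https://example.com/allowed-website-referer",
)
print(shlex.join(cmd))
# To actually run it (requires yt-dlp on PATH): subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when the referer URL contains characters such as `&` or `?`.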

ela738362 commented 3 months ago

This guy managed to fix the issue: https://github.com/limontec/vm-video-downloader

Any chance of getting it implemented in yt-dlp?

seproDev commented 3 months ago

@ela738362 Their script just forces the user to do the extraction manually and then calls youtube-dl (the software yt-dlp was forked from). You can already do the same thing with yt-dlp. The problem is the automatic extraction, due to Vimeo implementing aggressive anti-bot measures.

asiaminor2k commented 2 months ago

Hi, new user here. Apologies if I posted incorrectly. Just wanted to double-check... I used this command on Windows to check for the source format on a video known to have it available:

yt-dlp https://vimeo.com/931836736 --list-formats --impersonate="Chrome-110" --cookies

I did not see a source format available. I used a Windows nightly build from within the past 5 days. Just wanted to ask whether the command above was written correctly. Let me know if any further details are required. Thanks

bashonly commented 2 months ago

@asiaminor2k You shouldn't need to pass --impersonate at all. If your issue is about the source/original format, then that is not directly related to this issue and you should open a new one with complete verbose output

asiaminor2k commented 2 months ago

@bashonly Understood. Yes, the issue is about the source/original format. I thought maybe the impersonation command was incorrect so I asked here. I will open a new issue for my inquiry. Thanks.