JuanBindez / pytubefix

Python3 library for downloading YouTube Videos.
https://pytubefix.readthedocs.io
MIT License
722 stars, 100 forks

Is it never downloaded in virtual machine? #272

Closed jjikkobro closed 2 weeks ago

jjikkobro commented 1 month ago

Describe the bug

I’ve tried using proxies and tokens, but I still couldn’t get the video to download on the virtual machine. If anyone has successfully managed to do this in a VM, please let me know. I don’t want to waste any more time on this.


code that was used that resulted in the bug

from pytubefix import YouTube

proxies = {
    'http': f'socks5://{username}:{password}@{proxy_ip}:{proxy_port}',
    'https': f'socks5://{username}:{password}@{proxy_ip}:{proxy_port}',
}

visitor_data, po_token = await self.get_tokens()
yt = YouTube(url, proxies=proxies, use_po_token=True, po_token_verifier=(visitor_data, po_token))

Expected behavior

JUST DOWNLOAD


VM(please complete the following information):

jhanley-com commented 1 month ago

Since you are using a proxy, you will need to do the debugging and create something that clearly indicates a problem with PyTubeFix. You do not even describe the behavior, error messages, etc. We do not have magic crystal balls.

jjikkobro commented 1 month ago

Got it. First, I had to follow the instructions from this link to enable YouTube downloads on my cloud machine: https://github.com/JuanBindez/pytubefix/pull/209.

I used the po_token_generator, written in Python, that you're familiar with to generate and apply the token, but I'm still getting a "detected as bot" error. When I was using youtubeTranscriptApi, I was able to solve the issue by using a socks5 proxy. I tried using both a paid proxy and the po_token in a similar way this time, but I'm still getting the "detected as bot" error.

What I'm wondering is: is it simply not possible to use YouTube download on a cloud machine at all? Or has no one been successful with any method so far?

NannoSilver commented 1 month ago

It seems the PoToken is IP-dependent. If you are using a proxy, the PoToken must be generated for the proxy IP rather than for your VPS cloud machine's IP.
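A minimal sketch of how the proxy and the token pair could be wired together, assuming the token pair was generated through the same proxy. The helper names `build_socks5_proxies` and `make_verifier` are hypothetical, and the sketch assumes `po_token_verifier` takes a zero-argument callable that returns `(visitor_data, po_token)`, which is how the pytubefix documentation appears to describe it; check that against the version you have installed.

```python
from typing import Callable, Dict, Tuple
from urllib.parse import quote


def build_socks5_proxies(username: str, password: str, host: str, port: int) -> Dict[str, str]:
    """Build a proxies mapping in the same shape as the one in the report above.

    Credentials are percent-encoded so characters such as '@' or ':' in a
    password do not break the proxy URL.
    """
    user = quote(username, safe="")
    secret = quote(password, safe="")
    proxy_url = f"socks5://{user}:{secret}@{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


def make_verifier(visitor_data: str, po_token: str) -> Callable[[], Tuple[str, str]]:
    """Wrap an already generated token pair in a zero-argument callable."""
    def verifier() -> Tuple[str, str]:
        return visitor_data, po_token
    return verifier


# Hypothetical usage (placeholder values, not real credentials or tokens):
# from pytubefix import YouTube
# proxies = build_socks5_proxies("user", "secret", "203.0.113.7", 1080)
# verifier = make_verifier(visitor_data, po_token)  # pair generated via the same proxy
# yt = YouTube(url, proxies=proxies, use_po_token=True, po_token_verifier=verifier)
```

The point of the wrapper is only that the token pair and the proxy travel together, so a token generated for one IP is not sent from another.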

Another way is to use a proxy without a PoToken. For that, you have to find a proxy that is not detected as a bot.

So-called "residential" proxies are more likely to avoid bot detection, but they are very expensive.

You can also check the proxy at https://ipinfo.io/. If the proxy IP shows false for all of the purple indicators (vpn, tor, proxy, relay, hosting), there is a good chance it will not be detected as a bot.
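The same check can be expressed in code. This is a rough sketch that assumes you already have the five indicators as a dictionary of booleans keyed by the names listed above; how you obtain that dictionary (web page, API plan, another provider) is not covered here, and the function names are hypothetical.

```python
from typing import List, Mapping

# The five indicators named in the comment above.
PRIVACY_FLAGS = ("vpn", "tor", "proxy", "relay", "hosting")


def flagged_indicators(privacy: Mapping[str, bool]) -> List[str]:
    """Return the indicators that are set.

    A missing indicator is treated as set, because "unknown" is not
    the same as "clean".
    """
    return [name for name in PRIVACY_FLAGS if privacy.get(name, True)]


def looks_clean(privacy: Mapping[str, bool]) -> bool:
    """True only when every indicator is present and false."""
    return not flagged_indicators(privacy)


if __name__ == "__main__":
    sample = {"vpn": False, "tor": False, "proxy": True, "relay": False, "hosting": True}
    print(flagged_indicators(sample))
    print(looks_clean(sample))
```

A clean result only improves the odds; it does not guarantee that the IP will pass bot detection.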

But the rule is that there is no rule. You have to keep replacing the proxy until you find one that works, and a proxy that works today may be detected as a bot tomorrow.
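The "keep replacing the proxy" loop could look roughly like this. It is a generic sketch: `first_working_proxy` is a hypothetical helper, and `attempt` stands for whatever download function you use (for example, one that builds a `YouTube` object with the given proxy and downloads a stream). Any exception is treated as "try the next proxy".

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def first_working_proxy(
    candidates: Iterable[str],
    attempt: Callable[[str], T],
) -> Optional[Tuple[str, T]]:
    """Try each proxy URL in order.

    Returns (proxy_url, result) for the first attempt that does not raise,
    or None if every candidate fails.
    """
    for proxy_url in candidates:
        try:
            result = attempt(proxy_url)
        except Exception as exc:  # broad on purpose: any failure means "next proxy"
            print(f"{proxy_url}: failed ({exc})")
            continue
        return proxy_url, result
    return None
```

Because a proxy that works today may fail tomorrow, it is probably safer to run this selection on every job than to cache the winner for long.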

felipeucelli commented 1 month ago

The main purpose of passing a PO Token is to help prevent getting blocked (if you are on a non-datacenter IP) and to allow formats from web clients to work.

If you are being detected as a bot even after passing the PoToken, there are only two alternatives:

  1. Log in with use_oauth (and risk getting your account banned)

  2. Use a proxy to change your IP address.

If even after these attempts you continue to be detected, unfortunately there is nothing we can do.
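For the first alternative, a minimal sketch might look like the following. `download_with_oauth` is a hypothetical wrapper; `use_oauth=True`, `streams.get_highest_resolution()` and `download(output_path=...)` are the pytubefix calls it relies on. The `youtube_cls` parameter exists only so the function can be exercised with a stand-in class. The first run is expected to prompt for an interactive device login, and the account-ban risk mentioned above still applies.

```python
from typing import Any, Optional


def download_with_oauth(
    url: str,
    output_path: str = ".",
    youtube_cls: Optional[Any] = None,
) -> Optional[str]:
    """Download the highest-resolution progressive stream using OAuth login.

    Returns the path of the downloaded file, or None if no suitable
    stream was found.
    """
    if youtube_cls is None:
        # Imported lazily so the sketch can be loaded without pytubefix installed.
        from pytubefix import YouTube
        youtube_cls = YouTube

    yt = youtube_cls(url, use_oauth=True)  # no proxy and no PoToken
    stream = yt.streams.get_highest_resolution()
    if stream is None:
        return None
    return stream.download(output_path=output_path)
```

Using a throwaway account for the login limits the damage if the account does get banned.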

See some references about PoToken and blocking in cloud applications:

https://github.com/LuanRT/BgUtils/issues/7
https://github.com/iv-org/invidious/issues/4734#issuecomment-2365205990
https://github.com/yt-dlp/yt-dlp/issues/11053

NannoSilver commented 1 month ago

Here is some code to obtain a poToken programmatically in Python:

https://github.com/iv-org/youtube-trusted-session-generator

I have not tested it.

jjikkobro commented 1 month ago

I was able to download using the use_oauth parameter alone, without any other parameters. I think there is currently no other way to download on a cloud machine.

felipeucelli commented 1 month ago

I may have discovered something. I compared the programmatically generated PoTokens from several tools, and the one in this repository, https://github.com/LuanRT/BgUtils, seems to generate the PoToken differently: its tokens are larger and more similar to those generated by the official YouTube client.

During my testing, the youtube-trusted-session-generator tool (https://github.com/iv-org/youtube-trusted-session-generator) returned fake PoTokens. It seems that if the page doesn't load in time, it doesn't return the correct token.