JuanBindez / pytubefix

Python3 library for downloading YouTube Videos.
http://pytubefix.rtfd.io/
MIT License

Added PoToken support #209

felipeucelli closed 1 week ago

felipeucelli commented 1 week ago

Added PoToken support

The proof of origin (PO) token is a parameter that YouTube requires to be sent with video playback requests from some clients. Without it, format URL requests from affected clients may return HTTP error 403, trigger bot-detection errors, or result in your account or IP address being blocked.

This token is generated by BotGuard (Web) / DroidGuard (Android) to attest that the requests are coming from a genuine client.

Generating the PoToken with BotGuard involves several dynamically created, obfuscated functions, which can contain misleading code paths, so it is almost impossible to replicate in Python.

Using PoToken

Because generating the PoToken in Python is not an easy task, we have to resort to external resources.

Manually acquiring a PO Token from a browser for use when logged out

This process involves manually obtaining a PO token generated by YouTube in a web browser and then supplying it to pytubefix, which is enabled with the use_po_token=True argument. Steps:

  1. Open a browser and go to any video on YouTube Music or YouTube Embedded (e.g. https://www.youtube.com/embed/aqz-KE-bpKQ). Make sure you are not logged in to any account!

  2. Open the developer console (F12), then go to the "Network" tab and filter by v1/player

  3. Click the video to play and a player request will appear in the network tab

  4. In the request payload JSON, find the PO Token at serviceIntegrityDimensions.poToken and save that value

  5. In the request payload JSON, find the visitorData at context.client.visitorData and save that value

  6. In the pytubefix code, pass the parameter use_po_token=True so that the visitorData and PoToken are sent with the requests.
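Steps 4 and 5 can also be done programmatically if the request payload is copied out of the Network tab. This is a minimal sketch, assuming the payload is valid JSON and contains the two paths named in the steps above. The function name extract_tokens and the sample values are hypothetical placeholders, not part of pytubefix.

```python
import json
from typing import Tuple


def extract_tokens(payload_json: str) -> Tuple[str, str]:
    """Return (visitorData, poToken) from a copied v1/player request payload."""
    payload = json.loads(payload_json)
    # Paths as described in steps 4 and 5.
    visitor_data = payload["context"]["client"]["visitorData"]
    po_token = payload["serviceIntegrityDimensions"]["poToken"]
    return visitor_data, po_token


# Made-up payload, trimmed to the two relevant keys (a real payload has more).
sample = json.dumps({
    "context": {"client": {"visitorData": "VISITOR_DATA_PLACEHOLDER"}},
    "serviceIntegrityDimensions": {"poToken": "PO_TOKEN_PLACEHOLDER"},
})

visitor_data, po_token = extract_tokens(sample)
print(visitor_data, po_token)
```

A missing key raises KeyError, which is usually what you want here: it signals that the copied request is not a player request or was made without a PO token.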

Some scripts can speed up this process, for example https://github.com/YunzheZJU/youtube-po-token-generator, or any other script of your choice.

After obtaining the PoToken and the visitorData linked to it, you can send them to pytubefix in much the same way as with the OAuth method:

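from pytubefix import YouTube

url = input("url >")

# use_po_token=True makes pytubefix ask for the visitorData and PoToken
yt = YouTube(url, use_po_token=True)
print(yt.title)

# Pick the highest-resolution stream and download it to the current directory
ys = yt.streams.get_highest_resolution()
ys.download()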

This will prompt you in the terminal to enter the visitorData and the PoToken, in that order.

You can also replace the function that supplies the visitorData and PoToken with your own by passing it as po_token_verifier, similar to #190.
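As a sketch of such a custom function, the example below reads both values from environment variables instead of prompting in the terminal. It assumes that po_token_verifier accepts a callable that takes no arguments and returns the visitorData and PoToken as a pair, in that order; check the signature in your installed pytubefix version. The environment variable names and both function names are hypothetical.

```python
import os
from typing import Tuple


def env_po_token_verifier() -> Tuple[str, str]:
    """Return (visitorData, PoToken) read from environment variables."""
    visitor_data = os.environ.get("YT_VISITOR_DATA", "")
    po_token = os.environ.get("YT_PO_TOKEN", "")
    if not visitor_data or not po_token:
        raise RuntimeError("Set YT_VISITOR_DATA and YT_PO_TOKEN first")
    return visitor_data, po_token


def download_highest_resolution(url: str) -> None:
    # Imported lazily so the verifier above can be used without pytubefix installed.
    from pytubefix import YouTube

    # Assumption: the callable is invoked by pytubefix when it needs the tokens.
    yt = YouTube(url, use_po_token=True, po_token_verifier=env_po_token_verifier)
    yt.streams.get_highest_resolution().download()
```

Keeping the tokens out of the source code this way also makes it easier to swap them when they expire.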

If allow_oauth_cache is True, the visitorData and PoToken will be cached and reused in subsequent requests (they may expire). The parameter should arguably be renamed to allow_cache, but the current name is kept for compatibility.

Clients affected with PoToken

Currently, the ANDROID and WEB clients need a PoToken to obtain functional streams; without it, only 360p progressive streams will work.

As the PoToken generated by BotGuard "is easier to get", pytubefix will default to the WEB client when using use_po_token=True.

To use the affected clients, we have to send the visitorData and PoToken via the API and then add the PoToken to each stream URL as the pot query parameter.
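The second half of that step, adding the token to a stream URL, can be illustrated with the standard library alone. This is only a sketch of the idea, not the code pytubefix uses internally; the helper name add_pot and the example URL are hypothetical, and it assumes the stream URL has no fragment part.

```python
from urllib.parse import quote, urlsplit


def add_pot(stream_url: str, po_token: str) -> str:
    """Append the PoToken to a stream URL as the `pot` query parameter."""
    # Use "&" if the URL already has a query string, otherwise start one with "?".
    separator = "&" if urlsplit(stream_url).query else "?"
    # Percent-encode the token so characters such as "=" stay intact in transit.
    return f"{stream_url}{separator}pot={quote(po_token, safe='')}"


print(add_pot("https://example.invalid/videoplayback?itag=18", "PO_TOKEN_PLACEHOLDER"))
```

Appending to the existing URL, rather than parsing and rebuilding the whole query string, leaves the other parameters exactly as they were.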

Observation

Although use_po_token can be combined with use_oauth, doing so is not a good idea, as it makes it easier for YouTube to track you.

JuanBindez commented 1 week ago

It is in the PoToken branch and is available on PyPI as pytubefix==6.15a1.