jdepoix / youtube-transcript-api

This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions!
MIT License
2.87k stars, 326 forks

Tips on avoiding blockage from YouTube #157

Closed by timy-16 2 years ago

timy-16 commented 2 years ago

Hello!

I was wondering if there are any tips you can give me to avoid getting blocked by YouTube's servers when sending a lot of requests. I know that using proxies and a sleep function would be of some help, but maybe there are other methods that I'm not aware of.

Could you also please answer these questions:

1) For how many videos can I realistically get transcripts in a day? What is the maximum?
2) Does each request count against the daily quota of 10000 units that YouTube provides? (I know this question may come off as a bit silly, but just to make sure.)
3) When using proxies, does the function get_transcripts change proxies randomly or sequentially?
4) What would be the ideal sleep time to use?

jdepoix commented 2 years ago

Hi @timy-16, unfortunately, there are no other ways to work around the ban than what is described in the error message. I also don't have an answer to most of the questions you asked. In my experience (and from what others have reported), YouTube's blocking behaviour is very inconsistent, and I would guess that it depends on their current load. Therefore, there is no way to anticipate it or reliably work around it. Regarding question 3: there is currently no functionality implemented to use a pool of proxies (although that's something I have on the TODO list).
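
Since the library has no built-in proxy pool, rotation and pacing would have to live in the caller's code. The following is a minimal sketch of that idea, not part of youtube-transcript-api: `fetch_with_rotation`, its parameters, and the delay values are all hypothetical, and `fetch` is a callable you supply yourself. It cycles through the proxies sequentially and waits with exponential backoff plus random jitter between failed attempts. Given how inconsistent the blocking is, no particular delay is known to be "ideal".

```python
import itertools
import random
import time
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def fetch_with_rotation(
    video_id: str,
    fetch: Callable[[str, Optional[dict]], T],
    proxies: Iterable[Optional[dict]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(video_id, proxy), rotating proxies round-robin on failure.

    All names and defaults here are illustrative assumptions.
    """
    proxy_list = list(proxies)
    if not proxy_list:
        raise ValueError("at least one proxy entry (or None) is required")
    pool = itertools.cycle(proxy_list)  # sequential (round-robin) rotation

    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        proxy = next(pool)
        try:
            return fetch(video_id, proxy)
        except Exception as error:  # broad on purpose: this is only a sketch
            last_error = error
            if attempt < max_attempts - 1:
                # Exponential backoff plus jitter so retries are not evenly spaced.
                delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
                sleep(delay)

    raise RuntimeError(
        f"all {max_attempts} attempts failed for {video_id}"
    ) from last_error
```

To use it with this library, `fetch` could be a small wrapper such as `lambda vid, proxy: YouTubeTranscriptApi.get_transcript(vid, proxies=proxy)`, assuming the installed version accepts a `proxies` argument. Passing `None` as one of the pool entries would include a direct, unproxied connection in the rotation.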