jdepoix / youtube-transcript-api

This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and requires neither an API key nor a headless browser, unlike Selenium-based solutions.
MIT License

Error 'Subtitles are disabled for this video' on youtube_transcript_api cli #174

Closed (eddieoz closed 1 year ago)

eddieoz commented 1 year ago

To Reproduce

Run the CLI: $ youtube_transcript_api https://www.youtube.com/watch?v=--t7xdzBiQ4

Which Python version are you using?

python 3.10.6, 3.9.0 and 3.8.8

Which version of youtube-transcript-api are you using?

0.5.0

What code / cli command are you executing?

$ youtube_transcript_api https://www.youtube.com/watch?v=--t7xdzBiQ4

Expected behavior

Show transcript

Actual behaviour

Could not retrieve a transcript for the video https://www.youtube.com/watch?v=https://www.youtube.com/watch?v=--t7xdzBiQ4! This is most likely caused by:

Subtitles are disabled for this video

If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error. Also make sure that there are no open issues which already describe your problem!

I tried on different machines and different IPs, but I get the same error using the CLI. This video worked before, and I use it as a reference because I had already downloaded its transcript.

When calling the API from a script, I get a different message, but I believe it is related. I tried this on different machines and different IPs as well:

from youtube_transcript_api import YouTubeTranscriptApi

transcript = YouTubeTranscriptApi.get_transcript('--t7xdzBiQ4', languages=['pt'])
print(transcript)

Error

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/youtube_transcript_api/_transcripts.py", line 33, in _raise_http_errors
    response.raise_for_status()
  File "/usr/lib/python3/dist-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://www.google.com/sorry/index?continue=https://www.youtube.com/watch%3Fv%3D--t7xdzBiQ4&q=EhAqAX4BAAAAAPA8k__-DygrGOmNk5wGIjAa21GTvhu3UKzoT46i6CEC7Fz0sJ1Gl6wYXLfYlCAOSKUlP1_IMKDreZewrRBSpX4yAXI

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/Projects/youtube-transcripts/test.py", line 20, in <module>
    transcript = YouTubeTranscriptApi.get_transcript('--t7xdzBiQ4', languages=['pt'])
  File "/usr/local/lib/python3.10/dist-packages/youtube_transcript_api/_api.py", line 132, in get_transcript
    return cls.list_transcripts(video_id, proxies, cookies).find_transcript(languages).fetch()
  File "/usr/local/lib/python3.10/dist-packages/youtube_transcript_api/_api.py", line 71, in list_transcripts
    return TranscriptListFetcher(http_client).fetch(video_id)
  File "/usr/local/lib/python3.10/dist-packages/youtube_transcript_api/_transcripts.py", line 47, in fetch
    self._extract_captions_json(self._fetch_video_html(video_id), video_id)
  File "/usr/local/lib/python3.10/dist-packages/youtube_transcript_api/_transcripts.py", line 79, in _fetch_video_html
    html = self._fetch_html(video_id)
  File "/usr/local/lib/python3.10/dist-packages/youtube_transcript_api/_transcripts.py", line 89, in _fetch_html
    return unescape(_raise_http_errors(response, video_id).text)
  File "/usr/local/lib/python3.10/dist-packages/youtube_transcript_api/_transcripts.py", line 36, in _raise_http_errors
    raise YouTubeRequestFailed(error, video_id)
youtube_transcript_api._errors.YouTubeRequestFailed: 
Could not retrieve a transcript for the video https://www.youtube.com/watch?v=429 Client Error: Too Many Requests for url: https://www.google.com/sorry/index?continue=https://www.youtube.com/watch%3Fv%3D--t7xdzBiQ4&q=EhAqAX4BAAAAAPA8k__-DygrGOmNk5wGIjAa21GTvhu3UKzoT46i6CEC7Fz0sJ1Gl6wYXLfYlCAOSKUlP1_IMKDreZewrRBSpX4yAXI! This is most likely caused by:

Request to YouTube failed: --t7xdzBiQ4

If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error. Also make sure that there are no open issues which already describe your problem!

jdepoix commented 1 year ago

Hi @eddieoz, your CLI command is failing because you have to provide the ID of the video, not the URL. Also, your video's ID starts with -, so you'll have to escape it. Additionally, the video doesn't have an English transcript (English is the default if no language is provided), so you'll have to specify the language you want the transcript in. Try this:

youtube_transcript_api "\--t7xdzBiQ4" --languages pt
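For anyone scripting this, a small helper can strip a pasted URL down to the bare ID before it reaches the library. This is only a sketch: `extract_video_id` is a hypothetical name and not part of youtube-transcript-api, and it assumes the ID sits either in the `v` query parameter or in the path of a youtu.be short link.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url_or_id: str) -> str:
    """Return the bare video ID from a watch URL, a youtu.be link, or an ID."""
    parsed = urlparse(url_or_id)
    if parsed.scheme not in ("http", "https"):
        # No URL scheme: assume the caller already passed a bare video ID.
        return url_or_id
    if parsed.hostname == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/")
    # Regular watch URLs carry the ID in the `v` query parameter.
    ids = parse_qs(parsed.query).get("v")
    if not ids:
        raise ValueError(f"no video ID found in {url_or_id!r}")
    return ids[0]


video_id = extract_video_id("https://www.youtube.com/watch?v=--t7xdzBiQ4")
print(video_id)  # --t7xdzBiQ4
```

The result can then be passed to YouTubeTranscriptApi.get_transcript(video_id, languages=['pt']), as in the script from the report. No escaping is needed there, because the leading - is only a problem when the shell hands the ID to the CLI's argument parser.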

The script you posted is failing because you have exceeded YouTube's rate limit, as the 429 Too Many Requests error in the traceback shows. Unfortunately, there is not much you can do here except wait until they lift the ban on your IP.
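If the script fetches many transcripts in a loop, spacing out retries at least avoids making the situation worse. The following is a generic sketch, not something the library provides: `call_with_backoff` and its parameters are hypothetical names, the delays are arbitrary, and waiting this way will not necessarily lift a block that is already in place.

```python
import time


def call_with_backoff(fetch, retry_on=(Exception,), attempts=4,
                      base_delay=60.0, sleep=time.sleep):
    """Call fetch(); on a retryable error, wait base_delay * 2**n and retry."""
    for attempt in range(attempts):
        try:
            return fetch()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Offline demonstration with a stand-in that fails twice, then succeeds.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return [{"text": "ok"}]


waits = []  # record the requested delays instead of actually sleeping
result = call_with_backoff(flaky_fetch, retry_on=(RuntimeError,),
                           base_delay=1.0, sleep=waits.append)
print(result, waits)
```

In the real script, `fetch` would be something like lambda: YouTubeTranscriptApi.get_transcript('--t7xdzBiQ4', languages=['pt']), and `retry_on` would name the YouTubeRequestFailed exception shown in the traceback (how best to import it is an assumption to check against the installed version).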

eddieoz commented 1 year ago

Thanks for your reply.

I will review it and run the bot over Tor or a VPN to update my channel's transcripts DB. Unfortunately, it seems YouTube has blocked all 3 of my IPs.

Best regards,