jdepoix / youtube-transcript-api

This is a python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and it does not require an API key nor a headless browser, like other selenium based solutions do!
MIT License

Error: Could not Retrieve a Transcript | Although they are enabled #296

Closed by SDA-Service 1 week ago

SDA-Service commented 1 week ago

DO NOT DELETE THIS! Please take the time to fill this out properly. I am not able to help you if I do not know what you are executing and what error messages you are getting. If you are having problems with a specific video make sure to include the video id.

To Reproduce

Steps to reproduce the behavior: I called the get_transcript function from a custom action defined in an OpenAI custom GPT action schema. For testing, I gave it a list of 9 different YouTube URLs to process. 7 of them returned the error: "Could not retrieve a transcript for the video https://www.youtube.com/watch?v={VIDEO_ID}! This is most likely caused by:

Subtitles are disabled for this video..." This was the first time in my testing that I had seen this response, so I checked each video ID that was reported as having captions disabled and found that all of them have auto-generated captions available.

Here are the IDs that gave this error:

[S9GLJiakHxE] [K_sOBVPKRK0] [CD0rWlYdofU] [MgAIrGxnN-8] [fmQi-Nc4_eA] [N_D_BPUICpQ] [Rv8QnxgR-ww]

What code / cli command are you executing?

For example: I am running

YouTubeTranscriptApi.get_transcript ...

Which Python version are you using?

Python 3.9

Which version of youtube-transcript-api are you using?

youtube-transcript-api==0.4.4

Expected behavior

I have used this method successfully many times and wanted to try a batch test as well, which is when I ran into this error. I expected the AI to be able to read each transcript as it normally would, but because of the error it never retrieved the transcript text for the affected videos.

Actual behaviour

When executing the fetch action, it retrieved 2 of the 9 transcripts and returned responses for those 2, but reported that the other 7 encountered the error stated above.

Instead, I received the following error message:

Error: Could not retrieve a transcript for the video https://www.youtube.com/watch?v={VIDEO_ID}! This is most likely caused by:

Subtitles are disabled for this video

If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error. Also make sure that there are no open issues which already describe your problem!
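When a batch run goes through an intermediary that may reword errors, it can help to record the original exception per video ID before anything else touches it. The following is only a minimal sketch: `fetch_many` is a hypothetical helper, not part of youtube-transcript-api, and the `fetch` argument is assumed to be any callable that takes a video ID and either returns a transcript or raises.

```python
from typing import Any, Callable, Dict, Iterable, Tuple


def fetch_many(
    video_ids: Iterable[str],
    fetch: Callable[[str], Any],
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Fetch each transcript independently and keep the raw error text.

    `fetch_many` is a hypothetical helper for illustration only.
    Returns (transcripts by video ID, error descriptions by video ID).
    """
    transcripts: Dict[str, Any] = {}
    errors: Dict[str, str] = {}
    for video_id in video_ids:
        try:
            transcripts[video_id] = fetch(video_id)
        except Exception as exc:  # keep going so one failure does not stop the batch
            # Store the exception class and message together with the real ID.
            errors[video_id] = f"{type(exc).__name__}: {exc}"
    return transcripts, errors


# Usage with the call named in this issue (assumes the library is installed
# and that the installed release still exposes this call):
#
#   from youtube_transcript_api import YouTubeTranscriptApi
#   ok, failed = fetch_many(["S9GLJiakHxE", "K_sOBVPKRK0"],
#                           YouTubeTranscriptApi.get_transcript)
#   for video_id, reason in failed.items():
#       print(video_id, reason)
```

Logging `failed` on the server side, before the result is handed to the GPT, would show the actual video ID and exception for each of the failing requests.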

jdepoix commented 1 week ago

Hi @SDA-Service, did you adjust the error message? Because it says

Could not retrieve a transcript for the video https://www.youtube.com/watch?v={VIDEO_ID}

Since there is {VIDEO_ID} in there instead of the video ID, it seems to me like the video IDs are not correctly passed to the module. I would assume that there is a bug somewhere in the code integrating this module and not in the module itself.
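One way to rule this out on the integration side is to validate whatever the caller sends before it reaches the module. The sketch below is an illustration under stated assumptions: `extract_video_id` is a hypothetical helper, not part of youtube-transcript-api, and the 11-character pattern is an observed convention that matches the IDs in this issue rather than a documented guarantee.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from [A-Za-z0-9_-].
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID from a watch URL, a youtu.be URL, or a bare ID.

    Returns None for anything else, including an unfilled placeholder
    such as "{VIDEO_ID}".
    """
    value = value.strip()
    if _VIDEO_ID.fullmatch(value):
        return value  # already a bare ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/")
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    else:
        return None
    return candidate if _VIDEO_ID.fullmatch(candidate) else None


print(extract_video_id("https://www.youtube.com/watch?v=S9GLJiakHxE"))
print(extract_video_id("https://www.youtube.com/watch?v={VIDEO_ID}"))
```

Rejecting a `None` result early gives a clear "invalid video ID" error instead of a misleading "subtitles are disabled" message further down the line.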

SDA-Service commented 1 week ago

Hey @jdepoix, yes, that code snippet was automatically adjusted by the GPT when it returned the error message, so a video ID was passed with the call. I have since made some changes on my side: I switched to the newest release of the API and changed the runtime from Python 3.9 to Python 3.8. I then retested the IDs that previously gave the error, and the problem seems to be fixed. Thank you.