jdepoix / youtube-transcript-api

This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions!
MIT License

Transcript fails to load, no 'en', but 'en-GB' (selectable would be nice) #200

Closed mcint closed 1 year ago

mcint commented 1 year ago

Transcript fails to load, no 'en', but 'en-GB' (selectable would be nice).

To Reproduce

Steps to reproduce the behavior:

https://youtubetranscript.com/?v=9-jIplX6Wjw

Expected behavior

I expected to receive the English transcript.

Actual behaviour

Transcript won't load.

Instead I received the following error message:

:( Unknown error: Could not retrieve a transcript for the video http://www.youtube.com/watch?v=9-jIplX6Wjw! This is most likely caused by:

No transcripts were found for any of the requested language codes: ('en',)

For this video (9-jIplX6Wjw) transcripts are available in the following languages:

(MANUALLY CREATED)
 - en-GB ("English (United Kingdom)")[TRANSLATABLE]
 - fr ("French")[TRANSLATABLE]
 - de ("German")[TRANSLATABLE]
 - it ("Italian")

jdepoix commented 1 year ago

Hi @mcint, could you please elaborate on what you think the error is here and what you mean by "selectable would be nice"?

I am not sure what you are executing, since you didn't provide any code, but I would assume you are trying to retrieve the transcript for 9-jIplX6Wjw without specifying which language you want. This will default to the language code en. This means that you requested a transcript for the language code en, but there is none. Therefore, the error seems appropriate to me. This will work fine if you request the transcript for en-GB.

Since a single video can have multiple transcripts in different English dialects, we cannot simply fall back to one of them when there is no en transcript, as this would require this module to implicitly prefer some over others. As a user of this module, you can make that choice yourself by doing something like YouTubeTranscriptApi.get_transcript('9-jIplX6Wjw', languages=['en', 'en-GB']).
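If you do not know in advance which English variants a video offers, one option is to build the priority list on your side. The following is only a rough sketch, not part of the module: prefer_language and fetch_english_transcript are hypothetical helper names, and the second function assumes the static list_transcripts / find_transcript interface of the library version current at the time of this thread (newer releases may expose this differently).

```python
def prefer_language(available_codes, base="en"):
    """Order codes for a `languages=` argument: exact `base` first,
    then regional variants such as 'en-GB'. Hypothetical helper."""
    base = base.lower()
    exact = [c for c in available_codes if c.lower() == base]
    variants = [c for c in available_codes if c.lower().startswith(base + "-")]
    return exact + variants


def fetch_english_transcript(video_id):
    """Fetch an English transcript, accepting any regional variant.

    Assumes the static `list_transcripts` API of youtube-transcript-api;
    requires network access and the third-party package.
    """
    from youtube_transcript_api import YouTubeTranscriptApi

    transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
    codes = [t.language_code for t in transcript_list]
    # find_transcript picks the first requested code that is available
    # and raises if none of them match.
    return transcript_list.find_transcript(prefer_language(codes)).fetch()


# Language codes reported for 9-jIplX6Wjw in the error message above.
print(prefer_language(["en-GB", "fr", "de", "it"]))  # ['en-GB']
```

This keeps the choice of dialect explicit in your own code: the module still only returns what you asked for, in the order you asked for it.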

jdepoix commented 1 year ago

Hi @mcint, did this solve your problem? Can I close this issue?

jdepoix commented 1 year ago

Closed due to inactivity.