jdepoix / youtube-transcript-api

This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions.
MIT License

Can't obtain the subtitle for this video #286

Closed: nightmare233 closed this issue 1 month ago

nightmare233 commented 1 month ago

DO NOT DELETE THIS! Please take the time to fill this out properly. I am not able to help you if I do not know what you are executing and what error messages you are getting. If you are having problems with a specific video make sure to include the video id.

To Reproduce

Video link: Upgrading Your Sprinkler Timer (video ID: RAvsQFY388U)

What code / cli command are you executing?

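I am running:

import sys

from youtube_transcript_api import YouTubeTranscriptApi


def get_single_youtube_subtitle(video_id):
    """Return the transcript as one string of '[start] text' entries, or '' on failure."""
    try:
        subtitle_content = ''
        # Only the exact language code 'en' is requested here.
        transcript = YouTubeTranscriptApi.get_transcript(video_id, languages=['en'])

        for item in transcript:
            text = item['text'].strip()
            if not text:
                continue  # skip empty caption entries
            start_time = item['start']
            subtitle_content += f"[{start_time:.3f}] {text} "
        subtitle_content = subtitle_content.strip()
        # Size of the string object in bytes, not its character count.
        print(sys.getsizeof(subtitle_content))
        return subtitle_content
    except Exception as e:
        print("Error occurred while getting subtitle:", e)
        return ''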

Which Python version are you using?

python 3.12.3 (tags/v3.12.3:f6650f9, Apr 9 2024, 14:05:25)

Which version of youtube-transcript-api are you using?

youtube_transcript_api==0.6.2

Expected behavior

I expected to receive the English transcript.

Actual behaviour

Instead, I received the following exception:

No transcripts were found for any of the requested language codes: ('en',)

For this video (RAvsQFY388U) transcripts are available in the following languages:

(MANUALLY CREATED)
 - en-US ("English (United States)")[TRANSLATABLE]

(GENERATED)
None

(TRANSLATION LANGUAGES)
 - af ("Afrikaans")
 - ak ("Akan")
 - sq ("Albanian")
 - am ("Amharic")
 - ar ("Arabic")
 - hy ("Armenian")
 - as ("Assamese")
 - ay ("Aymara")
 - az ("Azerbaijani")
 - bn ("Bangla")
 - eu ("Basque")
 - be ("Belarusian")
 - bho ("Bhojpuri")
 - bs ("Bosnian")
 - bg ("Bulgarian")
 - my ("Burmese")
 - ca ("Catalan")
 - ceb ("Cebuano")
 - zh-Hans ("Chinese (Simplified)")
 - zh-Hant ("Chinese (Traditional)")
 - co ("Corsican")
 - hr ("Croatian")
 - cs ("Czech")
 - da ("Danish")
 - dv ("Divehi")
 - nl ("Dutch")
 - en ("English")
 - eo ("Esperanto")
 - et ("Estonian")
 - ee ("Ewe")
 - fil ("Filipino")
 - fi ("Finnish")
 - fr ("French")
 - gl ("Galician")
 - lg ("Ganda")
 - ka ("Georgian")
 - de ("German")
 - el ("Greek")
 - gn ("Guarani")
 - gu ("Gujarati")
 - ht ("Haitian Creole")
 - ha ("Hausa")
 - haw ("Hawaiian")
 - iw ("Hebrew")
 - hi ("Hindi")
 - hmn ("Hmong")
 - hu ("Hungarian")
 - is ("Icelandic")
 - ig ("Igbo")
 - id ("Indonesian")
 - ga ("Irish")
 - it ("Italian")
 - ja ("Japanese")
 - jv ("Javanese")
 - kn ("Kannada")
 - kk ("Kazakh")
 - km ("Khmer")
 - rw ("Kinyarwanda")
 - ko ("Korean")
 - kri ("Krio")
 - ku ("Kurdish")
 - ky ("Kyrgyz")
 - lo ("Lao")
 - la ("Latin")
 - lv ("Latvian")
 - ln ("Lingala")
 - lt ("Lithuanian")
 - lb ("Luxembourgish")
 - mk ("Macedonian")
 - mg ("Malagasy")
 - ms ("Malay")
 - ml ("Malayalam")
 - mt ("Maltese")
 - mi ("Māori")
 - mr ("Marathi")
 - mn ("Mongolian")
 - ne ("Nepali")
 - nso ("Northern Sotho")
 - no ("Norwegian")
 - ny ("Nyanja")
 - or ("Odia")
 - om ("Oromo")
 - ps ("Pashto")
 - fa ("Persian")
 - pl ("Polish")
 - pt ("Portuguese")
 - pa ("Punjabi")
 - qu ("Quechua")
 - ro ("Romanian")
 - ru ("Russian")
 - sm ("Samoan")
 - sa ("Sanskrit")
 - gd ("Scottish Gaelic")
 - sr ("Serbian")
 - sn ("Shona")
 - sd ("Sindhi")
 - si ("Sinhala")
 - sk ("Slovak")
 - sl ("Slovenian")
 - so ("Somali")
 - st ("Southern Sotho")
 - es ("Spanish")
 - su ("Sundanese")
 - sw ("Swahili")
 - sv ("Swedish")
 - tg ("Tajik")
 - ta ("Tamil")
 - tt ("Tatar")
 - te ("Telugu")
 - th ("Thai")
 - ti ("Tigrinya")
 - ts ("Tsonga")
 - tr ("Turkish")
 - tk ("Turkmen")
 - uk ("Ukrainian")
 - ur ("Urdu")
 - ug ("Uyghur")
 - uz ("Uzbek")
 - vi ("Vietnamese")
 - cy ("Welsh")
 - fy ("Western Frisian")
 - xh ("Xhosa")
 - yi ("Yiddish")
 - yo ("Yoruba")
 - zu ("Zulu")

jdepoix commented 1 month ago

Hi @nightmare233, you're trying to retrieve an en transcript, but this video doesn't have one. It might seem misleading, but the transcript you're looking for is en-US, not en.
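
For this particular video, passing languages=['en-US', 'en'] to get_transcript should be enough, since the list is treated as a priority order. For videos where the regional variant is not known in advance, one possible approach is sketched below: list the available transcripts first, then request whichever codes share the base language. This is only a sketch, not part of the library. The names pick_language_codes and get_transcript_with_fallback are hypothetical, and the library calls assume the 0.6.2 interface from the report above (list_transcripts, get_transcript, and the language_code attribute).

```python
def pick_language_codes(requested, available):
    """Return the available codes matching the requested ones, best match first.

    An exact match is preferred; otherwise any code sharing the same base
    language is accepted ('en' -> 'en-US'). The base comparison is coarse:
    it also treats script variants such as 'zh-Hans' and 'zh-Hant' as matches.
    """
    picked = []
    for wanted in requested:
        base = wanted.split("-")[0].lower()
        exact = [code for code in available if code.lower() == wanted.lower()]
        related = [
            code
            for code in available
            if code.lower() != wanted.lower() and code.split("-")[0].lower() == base
        ]
        for code in exact + related:
            if code not in picked:
                picked.append(code)
    return picked


def get_transcript_with_fallback(video_id, requested=("en",)):
    """Hypothetical wrapper: fetch a transcript in the closest available language."""
    # Imported here so pick_language_codes stays usable without the library installed.
    from youtube_transcript_api import YouTubeTranscriptApi

    # Iterating the transcript list yields one object per available transcript.
    available = [t.language_code for t in YouTubeTranscriptApi.list_transcripts(video_id)]
    codes = pick_language_codes(requested, available)
    if not codes:
        raise LookupError(f"No transcript matching {requested!r}; available: {available}")
    return YouTubeTranscriptApi.get_transcript(video_id, languages=codes)
```

With the video from this issue, pick_language_codes(["en"], ["en-US"]) yields ["en-US"], so get_transcript_with_fallback("RAvsQFY388U") would request the en-US transcript instead of failing on en.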