jdepoix / youtube-transcript-api

This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions!
MIT License

Is there currently a way to return just the list of language_codes from manual and auto generated videos? #201

Closed (wes-kay closed this issue 1 year ago)

wes-kay commented 1 year ago

Looking through the API, I wasn't able to find anything that returns the language_codes of both auto-generated and manually created transcripts. This is the solution I've come up with:

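from youtube_transcript_api import YouTubeTranscriptApi

def get_languages(video_id):
    try:
        transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
        # Note: the underscore-prefixed attributes are private to TranscriptList.
        if len(transcript_list._generated_transcripts) == 0:
            # No auto-generated transcripts: fall back to the manual ones.
            values_list = list(transcript_list._manually_created_transcripts.values())
        else:
            # Otherwise only the auto-generated transcripts are used.
            values_list = list(transcript_list._generated_transcripts.values())

        return [item.language_code for item in values_list]

    except Exception:
        # Any failure (e.g. no transcripts available) yields None.
        return None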

Please let me know if there's a better way.

jdepoix commented 1 year ago

Hi @wes-kay, the TranscriptList class is iterable, so you can simply do [transcript.language_code for transcript in YouTubeTranscriptApi.list_transcripts(video_id)] to get the language codes of all transcripts, manually created and generated alike.
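
If the manual and auto-generated codes need to be kept apart, the same iteration can be sketched as below. This is only an illustration, not part of the library: the helper names split_language_codes and get_language_codes are hypothetical, and it assumes each transcript exposes language_code and is_generated, and that list_transcripts is callable as a static method as in the version discussed in this thread. The sample data is made up so the helper can be tried without a network call.

```python
from types import SimpleNamespace


def split_language_codes(transcripts):
    """Group language codes by whether the transcript was auto-generated."""
    codes = {"manual": [], "generated": []}
    for transcript in transcripts:
        key = "generated" if transcript.is_generated else "manual"
        codes[key].append(transcript.language_code)
    return codes


def get_language_codes(video_id):
    """Fetch the transcript list for a video and group its language codes."""
    # Imported lazily so the helper above works without the library installed.
    from youtube_transcript_api import YouTubeTranscriptApi

    # Assumption: list_transcripts is available as in the thread above.
    return split_language_codes(YouTubeTranscriptApi.list_transcripts(video_id))


# Stand-in objects (hypothetical data) mimicking the two attributes used above.
sample = [
    SimpleNamespace(language_code="en", is_generated=False),
    SimpleNamespace(language_code="de", is_generated=False),
    SimpleNamespace(language_code="en", is_generated=True),
]
print(split_language_codes(sample))
# prints {'manual': ['en', 'de'], 'generated': ['en']}
```

For a real video, get_language_codes(video_id) would return the same kind of dictionary; errors raised by the library are left to the caller here rather than swallowed.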