jdepoix / youtube-transcript-api

This is a Python API that lets you get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions.
MIT License

Parse error on video with auto-generated subtitles #259

Open PetrKubes97 opened 4 months ago

PetrKubes97 commented 4 months ago

To Reproduce

Steps to reproduce the behavior:

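I am running:

from youtube_transcript_api import YouTubeTranscriptApi

# Note: list_transcripts expects a video ID (e.g. 26Ac1VJgd64), not a full URL.
transcripts = YouTubeTranscriptApi.list_transcripts(video_url)
# Note: find_transcript raises NoTranscriptFound rather than returning None,
# so the None check below is never triggered.
transcript = transcripts.find_transcript(language_codes=[desired_language])

if transcript is None:
    return 'no transcript was found', 400

downloaded = transcript.fetch()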

Which Python version are you using?

Python 3.9.6

Which version of youtube-transcript-api are you using?

youtube-transcript-api 0.6.2

Expected behavior

The transcript of the video with ID 26Ac1VJgd64 is downloaded.

Actual behaviour

Instead, I received the following error message:

    downloaded = transcript.fetch()
  File "/Library/Python/3.9/site-packages/youtube_transcript_api/_transcripts.py", line 292, in fetch
    return _TranscriptParser(preserve_formatting=preserve_formatting).parse(
  File "/Library/Python/3.9/site-packages/youtube_transcript_api/_transcripts.py", line 358, in parse
    for xml_element in ElementTree.fromstring(plain_data)
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/xml/etree/ElementTree.py", line 1348, in XML
    return parser.close()
xml.etree.ElementTree.ParseError: no element found: line 1, column 0
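
For context, the message "no element found: line 1, column 0" is what ElementTree reports when it is handed an empty string. The sketch below reproduces that message using only the standard library. It does not call youtube-transcript-api or YouTube, and the helper name `parse_error_message` and the sample XML are made up for illustration. Whether the transcript response for this video was actually empty is an assumption inferred from the message, not something confirmed in this report.

```python
from xml.etree import ElementTree


def parse_error_message(payload: str) -> str:
    """Return the ParseError message for payload, or "" if it parses cleanly.

    Hypothetical helper for illustration; not part of youtube-transcript-api.
    """
    try:
        ElementTree.fromstring(payload)
    except ElementTree.ParseError as error:
        return str(error)
    return ""


# An empty body triggers the same kind of error as in the traceback above.
empty_message = parse_error_message("")

# A small, made-up transcript-like document parses without error.
valid_message = parse_error_message(
    '<transcript><text start="0" dur="1">hi</text></transcript>'
)

print(repr(empty_message))
print(repr(valid_message))
```

If the empty-response assumption holds, the failure happens before any subtitle content is parsed, so the same message would appear regardless of which language was requested.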