Open KarenPHS opened 3 weeks ago
Hi @KarenPHS, I cannot replicate that error. Does that happen for every video or only SeXZt5hqe6I?
No, I tried. But it happened at least once while I was downloading captions from videos with this script:
import urllib.request
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api._errors import NoTranscriptFound, TranscriptsDisabled
from xml.etree.ElementTree import ParseError
import json

base_video_url = 'https://www.youtube.com/watch?v='
base_search_url = 'https://www.googleapis.com/youtube/v3/search?'
API_KEY=''
channel_id = 'UCKSVUHI9rbbkXhvAXK-2uxA'
first_url = base_search_url + 'key={}&channelId={}&part=snippet,id&order=date&maxResults=25'.format(API_KEY, channel_id)
video_links = []
url = first_url

# download all video links from a channel
while True:
    inp = urllib.request.urlopen(url)
    resp = json.load(inp)
    for i in resp['items']:
        if i['id']['kind'] == "youtube#video":
            video_links.append(base_video_url + i['id']['videoId'])
    try:
        next_page_token = resp['nextPageToken']
        url = first_url + '&pageToken={}'.format(next_page_token)
    except KeyError:
        break

# download all captions from all videos
for url in video_links:
    url_id = url.split('watch?v=')[-1]
    while True:
        try:
            source = YouTubeTranscriptApi.list_transcripts(url_id)
            en_caption = source.find_transcript(['en']).fetch()
            break
        except (KeyError, NoTranscriptFound, TranscriptsDisabled):
            print('No captions there', url_id)
            break
        except ParseError:
            print('ParseError. there is a caption in', url, ', so, try again')
So it doesn't happen consistently for SeXZt5hqe6I, but just randomly happened once?
Hello, I have the same problem. For around 200 videos, I catch this error around 3-5 times every time, never the same ids.
Traceback (most recent call last):
  File "/home/araule/Documents/corpus/scripts/get_videos.py", line 375, in get_transcripts
    res = YouTubeTranscriptApi.get_transcript(video_id, languages=['fr'])
  File "/home/araule/miniconda3/envs/youtube/lib/python3.10/site-packages/youtube_transcript_api/_api.py", line 137, in get_transcript
    return cls.list_transcripts(video_id, proxies, cookies).find_transcript(languages).fetch(preserve_formatting=preserve_formatting)
  File "/home/araule/miniconda3/envs/youtube/lib/python3.10/site-packages/youtube_transcript_api/_transcripts.py", line 292, in fetch
    return _TranscriptParser(preserve_formatting=preserve_formatting).parse(
  File "/home/araule/miniconda3/envs/youtube/lib/python3.10/site-packages/youtube_transcript_api/_transcripts.py", line 358, in parse
    for xml_element in ElementTree.fromstring(plain_data)
  File "/home/araule/miniconda3/envs/youtube/lib/python3.10/xml/etree/ElementTree.py", line 1348, in XML
    return parser.close()
xml.etree.ElementTree.ParseError: no element found: line 1, column 0
I use Python 3.10.14 and youtube-transcript-api 0.6.2 (downloaded with pip).
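For what it's worth, that exact message is what ElementTree raises when it is handed an empty string. That suggests, but does not prove, that the transcript request occasionally came back with an empty body. A minimal sketch using only the standard library; `describe_parse_failure` is a hypothetical helper name for illustration, not part of youtube-transcript-api:

```python
from xml.etree import ElementTree


def describe_parse_failure(payload: str) -> str:
    """Return the ParseError message for `payload`, or "ok" if it parses."""
    try:
        ElementTree.fromstring(payload)
    except ElementTree.ParseError as exc:
        return str(exc)
    return "ok"


# An empty payload reproduces the message from the traceback above.
print(describe_parse_failure(""))
# A well-formed document parses without error.
print(describe_parse_failure("<transcript><text>hi</text></transcript>"))
```

If that reading is right, the failure is on the response side rather than in the XML parsing itself, which would fit the observation that the affected video ids change from run to run.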
So it doesn't happen consistently for SeXZt5hqe6I, but just randomly happened once?
Yes, it randomly happened, but more than once.
I got the same issue with Python 3.11.3 and youtube-transcript-api 0.6.2. Note that it also happens randomly for me; when I retried, it ended up working.
ERROR:root:no element found: line 1, column 0, 3q67v12M31M
ERROR:root:no element found: line 1, column 0, McRUxBHgFIo
ERROR:root:no element found: line 1, column 0, mokGJiXVw_4
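Since retrying seems to work for everyone in this thread, a bounded retry with a short backoff may be a reasonable workaround until the cause is known. This is only a sketch under that assumption: `fetch_with_retry` and its parameters are hypothetical names, not something the library provides, and it retries `ParseError` only.

```python
import time
from xml.etree.ElementTree import ParseError


def fetch_with_retry(fetch, attempts=3, delay=1.0, sleep=time.sleep):
    """Call `fetch()` until it succeeds or `attempts` calls have failed.

    Only ParseError is retried; any other exception propagates immediately.
    The wait doubles after each failed attempt (simple exponential backoff).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except ParseError:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            sleep(delay * 2 ** (attempt - 1))
```

With the call from the traceback above, this could be used as `fetch_with_retry(lambda: YouTubeTranscriptApi.get_transcript(video_id, languages=['fr']))`. Unlike a bare `while True` loop, it gives up after a fixed number of attempts, so a video that fails every time cannot hang the whole run.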
To Reproduce
Steps to reproduce the behavior:
What code / cli command are you executing?
For example: I am running
Which Python version are you using?
Python 3.6.4
Which version of youtube-transcript-api are you using?
youtube-transcript-api 0.6.2
Expected behavior
Describe what you expected to happen.
For example: I expected to receive the english transcript
Actual behaviour
Describe what is happening instead of the Expected behavior. Add error messages if there are any.
For example: Instead I received the following error message: