Hi @eyedu,
Please elaborate on what exactly you are executing and what the output is. This is a Python module, and it has no get_caption function.
Hi, thanks for getting back to me.
I was trying to run the package in R on videos with Chinese subtitles, and I always get the same error message.
I am sure that the videos come with captions, but I don't know why it didn't work. I tried to do the same thing in Python, but that didn't work either.
Please see my R code below.
url <- ("https://www.youtube.com/watch?v=4HLSBvlv0Ug&t=85s")
caption <- get_caption(url)
Error: youtube_transcript_api._errors.NoTranscriptFound: <... omitted ...>omali")
- st ("Southern Sotho")
- es ("Spanish")
- su ("Sundanese")
- sw ("Swahili")
- sv ("Swedish")
- tg ("Tajik")
- ta ("Tamil")
- tt ("Tatar")
- te ("Telugu")
- th ("Thai")
- ti ("Tigrinya")
- ts ("Tsonga")
- tr ("Turkish")
- tk ("Turkmen")
- uk ("Ukrainian")
- ur ("Urdu")
- ug ("Uyghur")
- uz ("Uzbek")
- vi ("Vietnamese")
- cy ("Welsh")
- fy ("Western Frisian")
- xh ("Xhosa")
- yi ("Yiddish")
- yo ("Yoruba")
- zu ("Zulu")
If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error. Also make sure that there are no open issues which already describe your problem!
See reticulate::py_last_error() for details
I am still not sure what get_caption does, but seeing that code I would assume that you are using the url as a video id. The video id for https://www.youtube.com/watch?v=4HLSBvlv0Ug&t=85s is 4HLSBvlv0Ug.
Does that solve your problem?
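For anyone passing full URLs around, here is a minimal sketch of pulling the video id out of a watch URL before handing it to the library. The helper name extract_video_id is hypothetical and not part of youtube_transcript_api; it only assumes the standard https://www.youtube.com/watch?v=... URL shape seen in this thread.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url_or_id):
    """Return the video id from a YouTube watch URL.

    Hypothetical helper for illustration only. If the input has no URL
    scheme, it is assumed to already be a bare video id and is returned
    unchanged.
    """
    parsed = urlparse(url_or_id)
    if not parsed.scheme:
        # No "https://" prefix: treat the input as a bare video id.
        return url_or_id
    # The id is the value of the "v" query parameter; other parameters
    # such as "t=85s" are ignored.
    query = parse_qs(parsed.query)
    if "v" not in query:
        raise ValueError("no 'v' parameter found in URL: " + url_or_id)
    return query["v"][0]
```

With this, the timestamp parameter (&t=85s) no longer ends up in the value passed on as the video id.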
Hi Jonas, thanks for getting back to me. I re-ran the code in Python and this is what I got.
from youtube_transcript_api import YouTubeTranscriptApi
YouTubeTranscriptApi.get_transcript("4HLSBvlv0Ug")
NoTranscriptFound Traceback (most recent call last)
<ipython-input-3-fcae0a96a7d8> in <module>
1 from youtube_transcript_api import YouTubeTranscriptApi
2
----> 3 YouTubeTranscriptApi.get_transcript("4HLSBvlv0Ug")
~/opt/anaconda3/lib/python3.8/site-packages/youtube_transcript_api/_api.py in get_transcript(cls, video_id, languages, proxies, cookies)
130 """
131 assert isinstance(video_id, str), "`video_id` must be a string"
--> 132 return cls.list_transcripts(video_id, proxies, cookies).find_transcript(languages).fetch()
133
134 @classmethod
~/opt/anaconda3/lib/python3.8/site-packages/youtube_transcript_api/_transcripts.py in find_transcript(self, language_codes)
177 :raises: NoTranscriptFound
178 """
--> 179 return self._find_transcript(language_codes, [self._manually_created_transcripts, self._generated_transcripts])
180
181 def find_generated_transcript(self, language_codes):
~/opt/anaconda3/lib/python3.8/site-packages/youtube_transcript_api/_transcripts.py in _find_transcript(self, language_codes, transcript_dicts)
213 return transcript_dict[language_code]
214
--> 215 raise NoTranscriptFound(
216 self.video_id,
217 language_codes,
NoTranscriptFound:
Could not retrieve a transcript for the video https://www.youtube.com/watch?v=4HLSBvlv0Ug! This is most likely caused by:
No transcripts were found for any of the requested language codes: ('en',)
For this video (4HLSBvlv0Ug) transcripts are available in the following languages:
(MANUALLY CREATED)
- zh-TW ("Chinese (Taiwan)")[TRANSLATABLE]
(GENERATED)
None
(TRANSLATION LANGUAGES)
- af ("Afrikaans")
- ak ("Akan")
- sq ("Albanian")
- am ("Amharic")
- ar ("Arabic")
- hy ("Armenian")
- as ("Assamese")
- ay ("Aymara")
- az ("Azerbaijani")
- bn ("Bangla")
- eu ("Basque")
- be ("Belarusian")
- bho ("Bhojpuri")
- bs ("Bosnian")
- bg ("Bulgarian")
- my ("Burmese")
- ca ("Catalan")
- ceb ("Cebuano")
- zh-Hans ("Chinese (Simplified)")
- zh-Hant ("Chinese (Traditional)")
- co ("Corsican")
- hr ("Croatian")
- cs ("Czech")
- da ("Danish")
- dv ("Divehi")
- nl ("Dutch")
- en ("English")
- eo ("Esperanto")
- et ("Estonian")
- ee ("Ewe")
- fil ("Filipino")
- fi ("Finnish")
- fr ("French")
- gl ("Galician")
- lg ("Ganda")
- ka ("Georgian")
- de ("German")
- el ("Greek")
- gn ("Guarani")
- gu ("Gujarati")
- ht ("Haitian Creole")
- ha ("Hausa")
- haw ("Hawaiian")
- iw ("Hebrew")
- hi ("Hindi")
- hmn ("Hmong")
- hu ("Hungarian")
- is ("Icelandic")
- ig ("Igbo")
- id ("Indonesian")
- ga ("Irish")
- it ("Italian")
- ja ("Japanese")
- jv ("Javanese")
- kn ("Kannada")
- kk ("Kazakh")
- km ("Khmer")
- rw ("Kinyarwanda")
- ko ("Korean")
- kri ("Krio")
- ku ("Kurdish")
- ky ("Kyrgyz")
- lo ("Lao")
- la ("Latin")
- lv ("Latvian")
- ln ("Lingala")
- lt ("Lithuanian")
- lb ("Luxembourgish")
- mk ("Macedonian")
- mg ("Malagasy")
- ms ("Malay")
- ml ("Malayalam")
- mt ("Maltese")
- mi ("Māori")
- mr ("Marathi")
- mn ("Mongolian")
- ne ("Nepali")
- nso ("Northern Sotho")
- no ("Norwegian")
- ny ("Nyanja")
- or ("Odia")
- om ("Oromo")
- ps ("Pashto")
- fa ("Persian")
- pl ("Polish")
- pt ("Portuguese")
- pa ("Punjabi")
- qu ("Quechua")
- ro ("Romanian")
- ru ("Russian")
- sm ("Samoan")
- sa ("Sanskrit")
- gd ("Scottish Gaelic")
- sr ("Serbian")
- sn ("Shona")
- sd ("Sindhi")
- si ("Sinhala")
- sk ("Slovak")
- sl ("Slovenian")
- so ("Somali")
- st ("Southern Sotho")
- es ("Spanish")
- su ("Sundanese")
- sw ("Swahili")
- sv ("Swedish")
- tg ("Tajik")
- ta ("Tamil")
- tt ("Tatar")
- te ("Telugu")
- th ("Thai")
- ti ("Tigrinya")
- ts ("Tsonga")
- tr ("Turkish")
- tk ("Turkmen")
- uk ("Ukrainian")
- ur ("Urdu")
- ug ("Uyghur")
- uz ("Uzbek")
- vi ("Vietnamese")
- cy ("Welsh")
- fy ("Western Frisian")
- xh ("Xhosa")
- yi ("Yiddish")
- yo ("Yoruba")
- zu ("Zulu")
If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error. Also make sure that there are no open issues which already describe your problem!
Hi @eyedu, I think the error message is fairly descriptive here: since you did not specify which language you want, the English transcript is requested, but there is no English transcript for this video. Just add the language you want: YouTubeTranscriptApi.get_transcript("4HLSBvlv0Ug", languages=['zh-TW'])
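As a rough sketch of the suggestion above, the wrapper below passes a list of language codes in priority order so that videos with Chinese subtitles do not fall back to the default 'en'. It relies on the get_transcript(video_id, languages=...) call shown in this thread; other releases of the library may expose a different interface. The function name fetch_transcript, the api parameter, and every fallback code after 'zh-TW' are assumptions for illustration, so adjust them to whatever the error message lists for your video.

```python
def fetch_transcript(video_id,
                     languages=("zh-TW", "zh-Hant", "zh-Hans", "en"),
                     api=None):
    """Fetch a transcript, trying the given language codes in order.

    Hypothetical wrapper for illustration. `api` exists only so the
    function can be exercised without network access; by default the
    real YouTubeTranscriptApi class is imported lazily.
    """
    if api is None:
        # Imported here so that merely defining this function does not
        # require the package to be installed.
        from youtube_transcript_api import YouTubeTranscriptApi as api
    # The first code in `languages` that has a transcript is used;
    # NoTranscriptFound is raised if none of them match.
    return api.get_transcript(video_id, languages=list(languages))
```

Calling fetch_transcript("4HLSBvlv0Ug") should then request the zh-TW transcript first, since that is the only manually created transcript listed for this video.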
I am using R to call the function get_caption.
I tried the code on numerous URLs, but it does not work on any video with Chinese subtitles. I am sure that subtitles are available for all of these videos.
https://www.youtube.com/watch?v=4HLSBvlv0Ug
https://www.youtube.com/watch?v=oE0yPwT-c3Q&t=1s
https://www.youtube.com/watch?v=uIqegdIwtW0&t=72s