This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it does not require an API key or a headless browser, as other Selenium-based solutions do!
MIT License
Some YouTube videos would still not work with transcript api despite executing a simple fix #253
First, I would like to say thank you for this amazing project, but it seems somewhat flawed, or maybe I'm the one who is confused; that's what I would honestly love to find out.
Now, on to the issue:
To Reproduce
Steps to reproduce the behavior:
Call get_transcript with languages=['en'] (or an equivalent parameter)
Try different videos (below are examples I've tried)
import youtube_transcript_api
# Assumption: ytapi is an alias for the YouTubeTranscriptApi class
from youtube_transcript_api import YouTubeTranscriptApi as ytapi

try:
    # Request the English transcript (a list of dicts with a 'text' key)
    initial_transcript = ytapi.get_transcript(youtube_video_id, languages=['en'])
    # Concatenate the text of every entry, one entry per line
    transcript = ""
    for entry in initial_transcript:
        transcript += entry['text'] + "\n"
except youtube_transcript_api.TranscriptsDisabled:
    print("Seems like transcripts/subtitles are disabled via API or the video itself or they haven't been generated, you can try manually inserting the transcript if that's the case")
    transcript = input("> ")
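Since the report mentions two different errors, here is a minimal sketch (not the project's recommended fix) of how the snippet above could handle both in one place. The names join_transcript and fetch_transcript_text and the fetch parameter are hypothetical; get_transcript, TranscriptsDisabled and NoTranscriptFound are assumed to be exported as in youtube-transcript-api 0.6.x.

```python
def join_transcript(entries):
    """Join the 'text' field of each transcript entry, one entry per line."""
    return "".join(entry["text"] + "\n" for entry in entries)


def fetch_transcript_text(video_id, languages=("en",), fetch=None):
    """Return the transcript text, or None if it cannot be retrieved.

    `fetch` is a hypothetical injection point (handy for testing); by default
    it is YouTubeTranscriptApi.get_transcript.
    """
    # Imported lazily so join_transcript works without the library installed.
    from youtube_transcript_api import (
        NoTranscriptFound,
        TranscriptsDisabled,
        YouTubeTranscriptApi,
    )

    fetch = fetch or YouTubeTranscriptApi.get_transcript
    try:
        return join_transcript(fetch(video_id, languages=list(languages)))
    except (TranscriptsDisabled, NoTranscriptFound) as error:
        # Report which of the two errors occurred for this video.
        print(f"No transcript for {video_id}: {type(error).__name__}")
        return None
```

Printing the exception type per video may make it easier to see which of the two errors a given video triggers. Any other exception is left to propagate.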
Which Python version are you using?
Python 3.9 (3.11 for WSL)
Which version of youtube-transcript-api are you using?
youtube-transcript-api 0.6.2
Expected behavior
As always, I expect both videos to return the "en" language transcript (so English or English auto-generated)
Actual behaviour
Instead, one video retrieves the transcript while the other doesn't. The error message alternates between TranscriptsDisabled and NoTranscriptFound.
As always, I did check that those subtitles exist; sometimes even the "not found" error says 'en' is available, although that's the exact parameter I've set.
Sorry, I forgot to mention this up front, but these videos are not age-restricted or anything else that I guess could cause it to stop...
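One way to narrow this down might be to ask the library which transcripts it sees for each video and compare the working video with the failing one. This is only a diagnostic sketch: describe_transcripts and list_available are hypothetical helper names, and list_transcripts, language_code and is_generated are assumed to behave as in youtube-transcript-api 0.6.x.

```python
def describe_transcripts(transcript_list):
    """Summarise each available transcript as (language_code, is_generated)."""
    return [(t.language_code, t.is_generated) for t in transcript_list]


def list_available(video_id):
    """Return the transcripts the library reports for a video."""
    # Lazy import; assumes youtube-transcript-api 0.6.x is installed.
    from youtube_transcript_api import YouTubeTranscriptApi

    return describe_transcripts(YouTubeTranscriptApi.list_transcripts(video_id))
```

If list_available itself raises TranscriptsDisabled for a video that does show subtitles in the browser, the problem probably lies in how the transcript list is fetched rather than in the languages parameter.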
What code / cli command are you executing?
Per my brand new project