This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, as other Selenium-based solutions do!
MIT License
Transcript "Start" and "Duration" values incorrect #290
DO NOT DELETE THIS! Please take the time to fill this out properly. I am not able to help you if I do not know what you are executing and what error messages you are getting. If you are having problems with a specific video make sure to include the video id.
To Reproduce
Steps to reproduce the behavior: Pull the transcript for this video ID: H_I19q7YKIs (this issue also occurs on all the other videos I have tested)
What code / cli command are you executing?
YouTubeTranscriptApi.get_transcript
### Which Python version are you using?
Python 3.12.3
### Which version of youtube-transcript-api are you using?
youtube-transcript-api 1.1.2
# Expected behavior
I expected the "start" and "duration" values to line up correctly, meaning that each entry's start plus its duration does not run past the next entry's start time, as in the example. This isn't the case for any of the videos I am testing, which makes it impossible to create timestamps for these videos from this data.
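The expectation above can be written as a small check. This is only a sketch: it assumes each transcript entry is a dict with "text", "start" and "duration" keys, sorted by start time, and it treats an entry that ends exactly at the next start as fine. `find_overlaps` and the sample values are made up for illustration and are not part of the library.

```python
def find_overlaps(entries):
    """Return (index, overlap_seconds) for each entry that runs past the next entry's start."""
    overlaps = []
    for i in range(len(entries) - 1):
        end = entries[i]["start"] + entries[i]["duration"]
        next_start = entries[i + 1]["start"]
        if end > next_start:
            overlaps.append((i, round(end - next_start, 3)))
    return overlaps


# Hypothetical sample values, not taken from the video above.
sample = [
    {"text": "first line", "start": 0.0, "duration": 4.0},
    {"text": "second line", "start": 2.5, "duration": 3.0},
    {"text": "third line", "start": 5.5, "duration": 2.0},
]
print(find_overlaps(sample))
```

An empty list would mean the entries line up the way I expected; any tuple in the result marks an entry whose start plus duration overlaps the following one.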
# Actual behaviour
When I retrieve the transcript for any YouTube video, the "start" and "duration" values seem to be incorrect: the entries overlap, so they do not give a proper representation of the timestamps for the video.
... error message ...
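For the timestamp use case, one possible workaround is to cap each entry's end time at the next entry's start. This is a sketch under the assumption that the "start" values are reliable and the entries are sorted by start time, which this issue does not confirm; `to_intervals` is a hypothetical helper, not a library function, and the sample values are invented.

```python
def to_intervals(entries):
    """Return (start, end, text) tuples with each end capped at the next entry's start."""
    intervals = []
    for i, entry in enumerate(entries):
        end = entry["start"] + entry["duration"]
        if i + 1 < len(entries):
            # Assumption: the next start marks where this entry should stop.
            end = min(end, entries[i + 1]["start"])
        intervals.append((entry["start"], end, entry["text"]))
    return intervals


# Hypothetical sample values, not taken from the video above.
entries = [
    {"text": "first line", "start": 0.0, "duration": 4.0},
    {"text": "second line", "start": 2.5, "duration": 3.0},
    {"text": "third line", "start": 5.5, "duration": 2.0},
]
for start, end, text in to_intervals(entries):
    print(f"{start:.1f} -> {end:.1f}  {text}")
```

The last entry keeps its own duration because there is no following start to cap it against.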