jdepoix / youtube-transcript-api

This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, as other Selenium-based solutions do!
MIT License
2.87k stars, 326 forks

Transcriptions of Shorts videos are out of sync #163

Closed titusfx closed 1 year ago

titusfx commented 2 years ago

I've been using a script to fetch transcripts, and it worked 100% of the time until today, when I tried to download the transcript of a Short. The timing was quite off.

https://www.youtube.com/watch?v=uCbpDW0p0Gs

Thanks

jdepoix commented 2 years ago

Hi @titusfx, could you please clarify what exactly is not working for you? I can't find anything wrong with the video you posted using the current version of youtube-transcript-api.

jdepoix commented 1 year ago

Any news on this @titusfx? Otherwise, I will close this issue.

titusfx commented 1 year ago

Hi @jdepoix, the original link for the same video is https://www.youtube.com/shorts/uCbpDW0p0Gs. As you can see, the format is different, and to get it to work I need to do the following manually:

  1. Convert the Shorts format to the watch format (from https://www.youtube.com/shorts/uCbpDW0p0Gs to https://www.youtube.com/watch?v=uCbpDW0p0Gs)
  2. Watch part of the video (just opening the link won't work)
  3. Only then are the video's transcriptions/translations available through the library
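
Step 1 above can be automated. The following is a minimal sketch, assuming the Short's URL always has the form `/shorts/<video id>`; `shorts_to_watch_url` is a hypothetical helper name and not part of youtube-transcript-api:

```python
from urllib.parse import urlparse


def shorts_to_watch_url(url: str) -> str:
    """Rewrite a /shorts/<id> URL as /watch?v=<id>; return any other URL unchanged."""
    parsed = urlparse(url)
    # Split the path into its non-empty segments, e.g. ["shorts", "uCbpDW0p0Gs"].
    parts = [segment for segment in parsed.path.split("/") if segment]
    if len(parts) == 2 and parts[0] == "shorts":
        return f"{parsed.scheme}://{parsed.netloc}/watch?v={parts[1]}"
    return url


print(shorts_to_watch_url("https://www.youtube.com/shorts/uCbpDW0p0Gs"))
```

This covers only the URL rewrite; it does not trigger step 2.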

So, to reproduce this, you need to find a Short that has not yet been opened through the equivalent watch link.

I'm using the latest version of youtube-transcript-api

jdepoix commented 1 year ago

Hi @titusfx, I have no experience with getting transcripts for shorts and since shorts aren't really supported by this module yet, this sounds more like a feature request than a bug. What you are describing seems to indicate that YouTube generates the transcript lazily only after the video has been opened using a https://www.youtube.com/watch.... url. In that case there is no way for this module to work around that. But you could write some code to call the watch url first and then wait until the transcript is available.
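
The suggestion above could be sketched as a small polling helper. This is only an illustration under assumptions: `wait_for_transcript`, its parameters, and the `warm_up` hook are hypothetical names, not part of youtube-transcript-api, and it is not confirmed that requesting the watch URL is enough to make YouTube generate the transcript (the report above suggests the video may need to be partly played).

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for_transcript(
    fetch: Callable[[], T],
    warm_up: Optional[Callable[[], None]] = None,
    attempts: int = 5,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call warm_up once (if given), then retry fetch until it succeeds.

    fetch   -- zero-argument callable returning the transcript (assumed to
               raise an exception while the transcript is unavailable)
    warm_up -- optional callable, e.g. one that requests the watch URL
    """
    if warm_up is not None:
        warm_up()
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception as error:  # broad on purpose: the error type is an assumption
            last_error = error
            if attempt < attempts - 1:
                sleep(delay)  # wait before the next attempt
    raise TimeoutError(f"transcript not available after {attempts} attempts") from last_error
```

In library versions that expose `YouTubeTranscriptApi.get_transcript`, `fetch` could be `lambda: YouTubeTranscriptApi.get_transcript("uCbpDW0p0Gs")`, and `warm_up` could request the watch URL with `urllib.request.urlopen`. Passing callables keeps the retry logic independent of both the library version and the way the page is opened.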