JuanBindez / pytubefix

Python3 library for downloading YouTube Videos.
http://pytubefix.rtfd.io/
MIT License

Can't get video captions #161

Closed RedisOptimal closed 1 month ago

RedisOptimal commented 1 month ago

Describe the bug

https://www.youtube.com/watch?v=pttUWZQWJi8


>>> from pytubefix import YouTube
>>> yt=YouTube("https://www.youtube.com/watch?v=pttUWZQWJi8")
>>> yt.title
Traceback (most recent call last):
  File "/root/venv/lib/python3.10/site-packages/pytubefix/__main__.py", line 612, in title
    self._title = self.vid_info['videoDetails']['title']
KeyError: 'videoDetails'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/venv/lib/python3.10/site-packages/pytubefix/__main__.py", line 616, in title
    self.check_availability()
  File "/root/venv/lib/python3.10/site-packages/pytubefix/__main__.py", line 310, in check_availability
    raise exceptions.LoginRequired(video_id=self.video_id)
pytubefix.exceptions.LoginRequired: pttUWZQWJi8 requires login to view
>>> yt=YouTube("https://www.youtube.com/watch?v=pttUWZQWJi8", use_oauth=True, allow_oauth_cache=True)
>>> yt.title
'Gov. Tim Walz’s first speech as Kamala Harris’ running mate'
>>> yt.captions
{}
>>> import pytubefix
>>> pytubefix.__version__
'6.9.2'
>>>

sp1d5r commented 1 month ago

Same issue - on latest release :/

JuanBindez commented 1 month ago

Same issue - on latest release :/

It's not a problem with the library. Do as he did above.

sp1d5r commented 1 month ago

The captions are returning empty even though there are captions on that video?

Previously auto-generated captions were working; now they return an empty dictionary. I'm using authentication with OAuth.

yt.captions returns {} for any video with auto-generated captions... The example above has actual captions, but is returning empty {}

Should mention this happens irregularly. I'm using a proxy service so perhaps that's why - but unsure.

RedisOptimal commented 2 weeks ago

The captions are returning empty even though there are captions on that video?

Previously auto-generated captions were working; now they return an empty dictionary. I'm using authentication with OAuth.

  • video and audio downloading works as expected with authentication

yt.captions returns {} for any video with auto-generated captions... The example above has actual captions, but is returning empty {}

Should mention this happens irregularly. I'm using a proxy service so perhaps that's why - but unsure.

I don't think it's a proxy service problem. I think I found the root cause: Google's engineers removed the captions information from the /v1/player interface. I debugged the code, stepping in around https://github.com/JuanBindez/pytubefix/blob/c8b75ac0a16bea702a9eef371778257e8364c292/pytubefix/innertube.py#L648, and checked the data the player function returns. I also tried Google's official Data API:

GET https://www.googleapis.com/youtube/v3/captions?part=snippet&videoId=
GET https://www.googleapis.com/youtube/v3/captions/CAPTION_ID?tfmt=srt&key=

The second call, which downloads the captions, always gives me a 404.
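
For anyone who wants to check whether a given player response carries caption data at all, here is a minimal sketch. It assumes the caption tracks sit under `captions` → `playerCaptionsTracklistRenderer` → `captionTracks`, which is where pytube has historically read them from; I have not confirmed that pytubefix 6.9.2 uses exactly the same path. The helper name `caption_language_codes` and the two sample dictionaries are hypothetical, made up for illustration only.

```python
from typing import Any


def caption_language_codes(player_response: dict[str, Any]) -> list[str]:
    """Return the language codes of the caption tracks in a player response.

    Assumes the pytube-style layout:
    captions -> playerCaptionsTracklistRenderer -> captionTracks.
    Returns an empty list when any level of that path is missing.
    """
    tracks = (
        player_response.get("captions", {})
        .get("playerCaptionsTracklistRenderer", {})
        .get("captionTracks", [])
    )
    return [track.get("languageCode", "?") for track in tracks]


# Hypothetical, trimmed-down responses; real ones contain many more keys.
with_captions = {
    "videoDetails": {"title": "example"},
    "captions": {
        "playerCaptionsTracklistRenderer": {
            "captionTracks": [
                {"languageCode": "en", "kind": "asr"},
            ]
        }
    },
}
without_captions = {"videoDetails": {"title": "example"}}

print(caption_language_codes(with_captions))     # ['en']
print(caption_language_codes(without_captions))  # []
```

With a real object you would pass `yt.vid_info` (the dictionary that shows up in the traceback above) to the helper. An empty list there would mean the response itself has no caption tracks, so `yt.captions` has nothing to build from.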