JuanBindez / pytubefix

Python3 library for downloading YouTube Videos.
http://pytubefix.rtfd.io/
MIT License

AttributeError: 'str' object has no attribute 'get' when using fiftyone with pytubefix #189

Closed pseudotensor closed 1 week ago

pseudotensor commented 4 weeks ago

Using fiftyone with every reference to pytube replaced by pytubefix:

pip install fiftyone pytubefix

sp=`python3.10 -c 'import site; print(site.getsitepackages()[0])'`
sed -i 's/Pytube/PytubeFix/g'  $sp/fiftyone/utils/youtube.py
sed -i 's/pytube>=15/pytube>=6/g' $sp/fiftyone/utils/youtube.py
sed -i 's/pytube/pytubefix/g' $sp/fiftyone/utils/youtube.py

running:

urls = ["https://www.youtube.com/shorts/fRkZCriQQNU"]
download_dir = './'

if urls:
    import fiftyone.utils.youtube as fouy
    fouy.download_youtube_videos(urls, download_dir=download_dir)

# Create a FiftyOne Dataset
import fiftyone as fo
dataset = fo.Dataset.from_videos_dir(download_dir)

# Convert videos to images, sample 1 frame per second
frame_view = dataset.to_frames(sample_frames=True, fps=1)

import fiftyone.brain as fob

# Index images by similarity
results = fob.compute_similarity(frame_view, brain_key="frame_sim")

This used to work. But now I get:


Traceback (most recent call last):
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fiftyone/utils/youtube.py", line 327, in _do_download
    _validate_video(pytubefix_video)
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fiftyone/utils/youtube.py", line 396, in _validate_video
    status, messages = pytubefix.extract.playability_status(
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/pytubefix/extract.py", line 106, in playability_status
    status_dict = player_response.get('playabilityStatus', {})
AttributeError: 'str' object has no attribute 'get'

pseudotensor commented 4 weeks ago

A related bug, in a case that also used to work: using a download_dir other than the current directory. For example, this fails:

import os

urls = ["https://www.youtube.com/shorts/fRkZCriQQNU"]
download_dir = './foo'
os.makedirs(download_dir, exist_ok=True)

if urls:
    import fiftyone.utils.youtube as fouy
    fouy.download_youtube_videos(urls, download_dir=download_dir)

# Create a FiftyOne Dataset
import fiftyone as fo
dataset = fo.Dataset.from_videos_dir(download_dir)

# Convert videos to images, sample 1 frame per second
frame_view = dataset.to_frames(sample_frames=True, fps=1)

import fiftyone.brain as fob

# Index images by similarity
results = fob.compute_similarity(frame_view, brain_key="frame_sim")

fails with:

Traceback (most recent call last):
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fiftyone/utils/youtube.py", line 327, in _do_download
    _validate_video(pytubefix_video)
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fiftyone/utils/youtube.py", line 396, in _validate_video
    status, messages = pytubefix.extract.playability_status(
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/pytubefix/extract.py", line 106, in playability_status
    status_dict = player_response.get('playabilityStatus', {})
AttributeError: 'str' object has no attribute 'get'
 100% |████████████| 1/1 [862.9ms elapsed, 0s remaining, 1.2 videos/s] 
 100% |███████████| 0/0 [357.5us elapsed, ? remaining, ? samples/s] 
Traceback (most recent call last):
  File "/home/jon/h2ogpt/bob235.py", line 16, in <module>
    frame_view = dataset.to_frames(sample_frames=True, fps=1)
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fiftyone/core/collections.py", line 7223, in to_frames
    return self._add_view_stage(fos.ToFrames(**kwargs))
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fiftyone/core/dataset.py", line 6698, in _add_view_stage
    return self.view().add_stage(stage)
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fiftyone/core/collections.py", line 3999, in add_stage
    return self._add_view_stage(stage)
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fiftyone/core/view.py", line 1715, in _add_view_stage
    view = stage.load_view(self)
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fiftyone/core/stages.py", line 8113, in load_view
    frames_dataset = fovi.make_frames_dataset(
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fiftyone/core/video.py", line 615, in make_frames_dataset
    fova.validate_video_collection(sample_collection)
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fiftyone/core/validation.py", line 142, in validate_video_collection
    raise ValueError(
ValueError: Expected collection to have media type video; found None
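
The second traceback appears to follow from the first: the progress bar reports 0/0 samples, so the dataset built from download_dir is empty and has no media type when to_frames is called. A minimal sketch of a guard that checks for downloaded files first is below. The helper name list_downloaded_videos and the extension list are hypothetical, not part of fiftyone or pytubefix.

```python
import os

# Assumption: the extensions a download might produce. Adjust as needed.
VIDEO_EXTENSIONS = {".mp4", ".webm", ".mkv", ".mov"}


def list_downloaded_videos(download_dir):
    """Return sorted paths of files in download_dir with a video-like extension.

    Hypothetical helper: returns an empty list if the directory is missing
    or contains no matching files.
    """
    if not os.path.isdir(download_dir):
        return []
    return sorted(
        os.path.join(download_dir, name)
        for name in os.listdir(download_dir)
        if os.path.splitext(name)[1].lower() in VIDEO_EXTENSIONS
    )
```

With a guard like this, fo.Dataset.from_videos_dir(download_dir) and dataset.to_frames(...) would be called only when list_downloaded_videos(download_dir) is non-empty, so a failed download is reported directly instead of surfacing later as the ValueError above.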
JuanBindez commented 4 weeks ago

Hello, this is not a bug.

pseudotensor commented 4 weeks ago

Ok, it used to work with pytubefix 6.2.2 and now fails.

JuanBindez commented 4 weeks ago

See here: status_dict = player_response.get('playabilityStatus', {}). It expects a dictionary but is receiving a str, and that is what raises the error.
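
To illustrate the type mismatch, here is a minimal sketch that reproduces the error message and shows one defensive way a caller could normalize the value before calling .get. The helper names coerce_player_response and read_playability_status are hypothetical and are not pytubefix's implementation. It is also an assumption that the str might be JSON text; the thread does not say what the string contains.

```python
import json

# Reproduce the error from the traceback: str has no .get method.
try:
    "not a dict".get("playabilityStatus", {})
except AttributeError as exc:
    print(exc)  # 'str' object has no attribute 'get'


def coerce_player_response(player_response):
    """Return player_response as a dict (hypothetical helper).

    Assumption: a str value may be JSON text. Anything that cannot be
    turned into a dict becomes an empty dict.
    """
    if isinstance(player_response, dict):
        return player_response
    if isinstance(player_response, str):
        try:
            parsed = json.loads(player_response)
        except json.JSONDecodeError:
            return {}
        return parsed if isinstance(parsed, dict) else {}
    return {}


def read_playability_status(player_response):
    """Return the 'status' field of playabilityStatus, or None if absent."""
    status_dict = coerce_player_response(player_response).get("playabilityStatus", {})
    return status_dict.get("status")
```

For example, read_playability_status({"playabilityStatus": {"status": "OK"}}) returns "OK", while a non-JSON string yields None instead of raising AttributeError.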