castorini / honk

PyTorch implementations of neural network models for keyword spotting
http://honk.ai/
MIT License
509 stars, 124 forks

Upgrade PyTube to PyTube3 for Keyword Data Generator #101

Open hbmartin opened 4 years ago

hbmartin commented 4 years ago

Hello, I'm opening this issue to let you know that there is a new, actively maintained, Python 3-only pytube fork: https://github.com/hbmartin/pytube3

daemon commented 4 years ago

Thanks for letting us know.

SuperKogito commented 4 years ago

Just to follow up on #104: downgrading to pytube==9.5.6 solved some issues, but it created more complex ones related to changes that only pytube==9.6.0 addresses.

hbmartin commented 4 years ago

@SuperKogito what issues are you referring to?

SuperKogito commented 4 years ago

I am not sure my report is accurate, since I made some changes to the code to avoid using the Words API, but here is the full traceback related to pytube:

Traceback (most recent call last):
  File "keyword_data_generator.py", line 283, in <module>
    main()
  File "keyword_data_generator.py", line 274, in main
    args.output_dir)
  File "keyword_data_generator.py", line 205, in generate_dataset
    srt_captions = retrieve_captions(url, keyword)
  File "keyword_data_generator.py", line 83, in retrieve_captions
    video = PyTube(yp.get_youtube_url(url))
  File "/home/kogito/anaconda3/envs/py37/lib/python3.7/site-packages/pytube/__main__.py", line 88, in __init__
    self.prefetch_init()
  File "/home/kogito/anaconda3/envs/py37/lib/python3.7/site-packages/pytube/__main__.py", line 97, in prefetch_init
    self.init()
  File "/home/kogito/anaconda3/envs/py37/lib/python3.7/site-packages/pytube/__main__.py", line 130, in init
    mixins.apply_descrambler(self.player_config_args, fmt)
  File "/home/kogito/anaconda3/envs/py37/lib/python3.7/site-packages/pytube/mixins.py", line 89, in apply_descrambler
    for i in stream_data[key].split(',')
KeyError: 'url_encoded_fmt_stream_map'

This issue is thoroughly discussed in pytube/issues#467.
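
For readers unfamiliar with this failure mode, here is a minimal sketch of what the last frame of the traceback is doing: a plain dictionary lookup on a key that is no longer present in the data. It is not pytube's code or the fix applied in pytube3; the function names and the sample dictionary are hypothetical, and only the key name and the `.split(',')` expression come from the traceback above.

```python
def naive_split(stream_data, key):
    # Same shape as the failing expression in the traceback:
    # a direct lookup raises KeyError when the key is absent.
    return stream_data[key].split(",")


def tolerant_split(stream_data, key):
    # Hypothetical defensive variant: return an empty list
    # instead of raising when the key is missing or empty.
    raw = stream_data.get(key)
    return raw.split(",") if raw else []


# Made-up stand-in for the parsed player config; the real structure
# is not shown in this thread.
config_without_map = {"some_other_key": "a,b"}

try:
    naive_split(config_without_map, "url_encoded_fmt_stream_map")
    raised = False
except KeyError:
    raised = True

print(raised)
print(tolerant_split(config_without_map, "url_encoded_fmt_stream_map"))
```

A tolerant lookup like this only avoids the crash; it does not recover the stream data, which is why a library-side fix is needed.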

hbmartin commented 4 years ago

Ah, OK. This is completely addressed in pytube3.