abhishek-vinjamoori / SubtitleExtractor

This repository is aimed at downloading subtitles from popular Internet Services.
encoding problem #22

Open D00oo00M opened 4 years ago

D00oo00M commented 4 years ago

When I try to download from Hulu, I get this message:

C:\Python\lib\site-packages\bs4\__init__.py:203: UserWarning: You provided Unicode markup but also provided a value for from_encoding. Your from_encoding will be ignored.
  warnings.warn("You provided Unicode markup but also provided a value for from_encoding. Your from_encoding will be ignored.")
Unable to get the subtitles. Please try again and open an issue to request for support for this video.
Subtitles not downloaded.

Can someone help me fix it?

kimchiiboiii commented 8 months ago

In case you still wanted to use it, I figured out how to make it work.

I found the way to make it work from here

Specifically this part:

If multiple languages are present we give the user an option to enter their choice. We then convert the SMI URL to a VTT URL as follows:

http://assets.huluim.com/captions/380/60601380_US_en_en.smi ---> http://assets.huluim.com/captions_webvtt/380/60601380_US_en_en.vtt

Then the subtitles are converted from VTT to SRT format in the standard way.

So you just get the captions .smi link and change it into the matching .vtt link.
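
If you'd rather not edit the link by hand, the rewrite can be sketched in a few lines of Python. This is only my own illustration of the rule quoted above, not code from the repository: the function name `smi_to_vtt_url` is made up, and it assumes the link has the same shape as the example in the docs (a `/captions/` path segment and a `.smi` extension).

```python
from urllib.parse import urlsplit, urlunsplit


def smi_to_vtt_url(smi_url):
    """Rewrite an SMI caption URL into its WebVTT counterpart.

    Hypothetical helper: it only applies the '/captions/' -> '/captions_webvtt/'
    and '.smi' -> '.vtt' rule quoted from the docs.
    """
    parts = urlsplit(smi_url)
    path = parts.path
    if "/captions/" not in path or not path.endswith(".smi"):
        raise ValueError(f"not a recognised SMI caption URL: {smi_url}")
    # Swap the directory name once, then the file extension.
    path = path.replace("/captions/", "/captions_webvtt/", 1)
    path = path[: -len(".smi")] + ".vtt"
    # Keep scheme, host, and any query string unchanged.
    return urlunsplit(parts._replace(path=path))


print(smi_to_vtt_url("http://assets.huluim.com/captions/380/60601380_US_en_en.smi"))
```

For the example link from the docs this prints the `captions_webvtt` URL shown in the quote. Anything that doesn't look like an SMI caption link raises a `ValueError` instead of silently returning a wrong URL.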

If you need instructions:

  1. Go to the video on Hulu and open the developer console.

  2. Go to the Network tab and type 'captions' in Filter URLs.

  3. Copy the link that starts with http://assets.huluim.com/captions

  4. Convert the SMI URL to a VTT URL like in the part I quoted from the docs above. All you gotta do is change '/captions/' --> '/captions_webvtt/' and the file extension to .vtt

  5. Go to that new link and download the file. That worked for me.
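
The file you end up with is WebVTT. If your player wants SRT (the docs I quoted mention a VTT to SRT conversion but don't show it), here is a rough sketch of one way to do it. It is my own simplified version, not the repository's converter: `vtt_to_srt` is a made-up name, and it assumes plain cues, so it drops cue settings and skips the header and any block without a timing line.

```python
import re

# WebVTT timestamps are [hh:]mm:ss.ttt; SRT wants hh:mm:ss,ttt.
TIMESTAMP = re.compile(r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})")


def _to_srt_time(match):
    hours, minutes, seconds, millis = match.groups()
    return f"{int(hours or 0):02d}:{minutes}:{seconds},{millis}"


def vtt_to_srt(vtt_text):
    """Convert WebVTT text to SRT text (simplified sketch)."""
    text = vtt_text.replace("\r\n", "\n").strip()
    cues = []
    for block in re.split(r"\n\s*\n", text):
        lines = block.split("\n")
        # The timing line is the one containing '-->'; blocks without one
        # (the WEBVTT header, NOTE or STYLE blocks) are skipped.
        idx = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if idx is None:
            continue
        start, _, rest = lines[idx].partition("-->")
        end = rest.split()[0]  # drop cue settings such as 'align:start'
        timing = (
            f"{TIMESTAMP.sub(_to_srt_time, start.strip())} --> "
            f"{TIMESTAMP.sub(_to_srt_time, end)}"
        )
        body = "\n".join(lines[idx + 1:])
        # SRT cues are numbered from 1.
        cues.append(f"{len(cues) + 1}\n{timing}\n{body}")
    return "\n\n".join(cues) + "\n"
```

To use it, read the downloaded .vtt file as UTF-8 text, pass the contents to `vtt_to_srt`, and write the result to a .srt file. Styling tags inside the cue text are left as they are.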