einstein95 / crunchy-xml-decoder

GNU General Public License v2.0

Some videos can't be downloaded because they are using HLS (m3u8 file) #70

Open Berni11 opened 8 years ago

Berni11 commented 8 years ago

It seems that some videos use HLS (an m3u8 file) instead of RTMP, which results in an error when trying to download them.

I had this problem with these two videos:

http://www.crunchyroll.com/jojos-bizarre-adventure/episode-13-wheel-of-fortune-652601
http://www.crunchyroll.com/jojos-bizarre-adventure/episode-1-part-1-phantom-blood-653409

iCertys commented 8 years ago

I think this can only be solved with something like FFmpeg. (It gives the best quality when converting from m3u8 to MP4, and mkvmerge should support MP4 too.)
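As a rough illustration of the FFmpeg idea, here is a minimal sketch that only builds the command line and does not run it. The function name `build_ffmpeg_command` and the example URL are made up for illustration, and the sketch assumes the HLS segments carry streams that fit into an MP4 container (typically H.264 video and AAC audio), so they can be copied without re-encoding.

```python
import shlex


def build_ffmpeg_command(playlist_url, output_path):
    """Return an argv list that remuxes an HLS playlist into an MP4 file.

    Hypothetical helper: nothing here comes from crunchy-xml-decoder itself.
    """
    return [
        "ffmpeg",
        "-i", playlist_url,          # FFmpeg reads the .m3u8 and fetches the segments
        "-c", "copy",                # copy the streams instead of re-encoding them
        "-bsf:a", "aac_adtstoasc",   # repackage ADTS AAC audio for the MP4 container
        output_path,
    ]


if __name__ == "__main__":
    cmd = build_ffmpeg_command("http://example.invalid/stream.m3u8", "episode.mp4")
    # Print a copy-pasteable shell command; pass cmd to subprocess.run to execute it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Because the streams are copied rather than converted, there should be no quality loss in this step. Whether the result is then accepted by mkvmerge would still need to be tested.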

Do you know how to get the m3u8 link from the media_id? I can't find the right function or API link. (I've searched the swfPlayer code, but I can't find where the video_id is retrieved.)

Sorry for my bad English.

iCertys commented 8 years ago

Strange. I wanted to look at the code for the HLS streams again, but today the player did not load the HLS plugin (http://static.ak.crunchyroll.com/versioned_assets/OSMFHLSPlugin.89701c06.swf). Instead, it used the default RTMP stream. (I found 26 HLS streams yesterday, but none today.) Perhaps Crunchyroll is testing a new HLS streaming server, or maybe a new player without Flash? (HLS server URL: http://serve.cxcdn.net)

jsonn commented 8 years ago

A proof-of-concept patch is below. I have a lot of local changes in my version, so it might not apply cleanly. It will also need the m3u8 package from PyPI and py-Crypto.

crunchy-xml-decoder-xml.diff.txt

einstein95 commented 8 years ago

I'm in the process of rewriting from the ground up, but it seems like the m3u8 URLs can be found using the mobile API. I also had luck with brute-forcing video IDs.

jsonn commented 8 years ago

The patch above works. I'm still fine-tuning the exception handling, since the HLS server likes to kill connections whenever it believes the client is not fast enough.

iCertys commented 8 years ago

Hi einstein95, what do you mean by brute-forcing video IDs?

I checked the code of the HLS Flash plugin, and the player sends a POST request to

http://www.crunchyroll.com/xml/?req=RpcApiVideoPlayer_GetStandardConfig&media_id=652601&video_format=103&video_quality=80&auto_play=1&aff=crunchyroll-website&show_pop_out_controls=1&pop_out_disable_message=&click_through=0

Body: current%5Fpage=http%3A%2F%2Fwww%2Ecrunchyroll%2Ecom%2Fjojos%2Dbizarre%2Dadventure%2Fepisode%2D13%2Dwheel%2Dof%2Dfortune%2D652601

Headers:

Host: www.crunchyroll.com
Connection: keep-alive
Content-Length: 128
Origin: http://static.ak.crunchyroll.com
X-Requested-With: ShockwaveFlash/22.0.0.209
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept: */*
Referer: http://static.ak.crunchyroll.com/versioned_assets/StandardVideoPlayer.f3770232.swf
[...] deflate
Accept-Language: de-DE,de;q=0.8,en-US;q=0.6,en;q=0.4
Cookie: cfduid=??; qca=??; _ig=??; _ig_nump=??; _ig_sess=??; _igur=-1; c_locale=deDE; c_visitor=??; position=1; userid=??; c_userid=??; c_userkey=??; c_d=p%3D1; sess_id=??; _ga=GA1.2.1831795990.1459009773; _gat=1; ki_t=??; ki_r=; __ar_v4=??

which returns an XML document containing the m3u8 link (jaJP.m3u8). Inside that .m3u8 file is the link to the computer stream (stream.m3u8).
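The second step described here (going from the first .m3u8 to the stream playlist it points at) could look roughly like the sketch below. It assumes the first file is a standard HLS master playlist in which each `#EXT-X-STREAM-INF` tag is followed by the URI of a variant playlist. The function name `variant_uris`, the sample playlist text, and the example host are invented for illustration and are not taken from a real Crunchyroll response.

```python
from urllib.parse import urljoin


def variant_uris(master_text, base_url):
    """Return absolute URIs of the variant playlists listed in a master playlist."""
    uris = []
    expect_uri = False
    for raw in master_text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-STREAM-INF"):
            expect_uri = True  # the next non-comment line is the variant URI
        elif line and not line.startswith("#") and expect_uri:
            # Relative URIs are resolved against the master playlist URL.
            uris.append(urljoin(base_url, line))
            expect_uri = False
    return uris


# Made-up sample data, only to show the shape of a master playlist.
SAMPLE_MASTER = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=1200000,RESOLUTION=848x480
stream.m3u8
"""

if __name__ == "__main__":
    print(variant_uris(SAMPLE_MASTER, "http://example.invalid/a/jaJP.m3u8"))
```

The m3u8 package from PyPI mentioned earlier in the thread would be the more complete way to parse playlists. The hand-rolled loop is only meant to show what the file contains.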

Sorry for my bad English.

iCertys commented 8 years ago

Also, this request works with both HLS and rtmpe.

The rtmpe stream link is:

rtmpe://cp150757.edgefcs.net/ondemand/?auth=daEaacsaGaobgaIbWdQdWc5bQd6dMcxdNbZ-bxUgqx-dHa-lCLwnqLBDuy&aifp=0009&slist=c20/s/ve2307557/video.mp4

And the HLS stream link is:

http://serve.cxcdn.net/s/v/9vuxan8yjjaks8f/m/8d969e75710f3008f6d529581a5e9dc0/jaJP.m3u8?v=4a6e59e8fe831dcbf91a170acd0092e9&k=aGtnK2VrUnV0K21sZXVIaURjdmNWTzZWRzdzPV97ImEiOiI5MSw2LGphSlAsIiwiYyI6MTQ0MDI2ODkwNiwiZCI6ImNyYW5pbWUiLCJnIjoiWloiLCJoIjoiOXZ1eGFuOHlqamFrczhmIiwibCI6NzIwMCwicCI6IjEiLCJyIjoiYzMwZDgyIiwicyI6MTgwMjA5LCJ0IjoxNDcxMjc4NTc0LCJ2IjozfQ

iCertys commented 8 years ago

`rtmpe://cp150757.edgefcs.net/ondemand/?auth=daEaacsaGaobgaIbWdQdWc5bQd6dMcxdNbZ-bxUgqx-dHa-lCLwnqLBDuy&aifp=0009&slist=c20/s/ve2307557/video.mp4`

iCertys commented 8 years ago

`http://serve.cxcdn.net/s/v/9vuxan8yjjaks8f/m/8d969e75710f3008f6d529581a5e9dc0/jaJP.m3u8?v=4a6e59e8fe831dcbf91a170acd0092e9&k=aGtnK2VrUnV0K21sZXVIaURjdmNWTzZWRzdzPV97ImEiOiI5MSw2LGphSlAsIiwiYyI6MTQ0MDI2ODkwNiwiZCI6ImNyYW5pbWUiLCJnIjoiWloiLCJoIjoiOXZ1eGFuOHlqamFrczhmIiwibCI6NzIwMCwicCI6IjEiLCJyIjoiYzMwZDgyIiwicyI6MTgwMjA5LCJ0IjoxNDcxMjc4NTc0LCJ2IjozfQ`

iCertys commented 8 years ago

Sorry I forgot "Insert code"

einstein95 commented 8 years ago

> What do you mean by bruteforcing video ids?

for i in {710000..715151}; do curl -s 'http://www.crunchyroll.com/xml/?req=RpcApiVideoPlayer_GetStandardConfig&media_id='$i --data 'current%5Fpage=http%3A%2F%2Fwww%2Ecrunchyroll%2Ecom' -H 'Cookie: sess_id=; c_userid=; c_userkey=' | grep -P '<media_type>(\d)</media_type>' && echo $i; done
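For readers who prefer Python over the shell loop, the same scan might be sketched as follows. This is not code from the project: the helper names `build_request`, `media_type`, and `scan` are invented, the empty session cookies from the curl command are left out, and the network call is passed in as a `fetch` function so the logic can be tried without contacting the server.

```python
import re
from urllib.parse import urlencode
from urllib.request import Request

CONFIG_URL = "http://www.crunchyroll.com/xml/"
MEDIA_TYPE_RE = re.compile(r"<media_type>(\d)</media_type>")


def build_request(media_id):
    """Build the POST request that the curl command sends for one media ID."""
    query = urlencode({"req": "RpcApiVideoPlayer_GetStandardConfig", "media_id": media_id})
    body = urlencode({"current_page": "http://www.crunchyroll.com"}).encode("ascii")
    # Passing data= makes urllib issue a POST instead of a GET.
    return Request(CONFIG_URL + "?" + query, data=body)


def media_type(xml_text):
    """Return the media_type digit found in the response, or None if absent."""
    match = MEDIA_TYPE_RE.search(xml_text)
    return int(match.group(1)) if match else None


def scan(media_ids, fetch):
    """Call fetch(media_id) for each ID and keep those that report a media_type."""
    found = {}
    for media_id in media_ids:
        kind = media_type(fetch(media_id))
        if kind is not None:
            found[media_id] = kind
    return found
```

A real `fetch` could wrap `urllib.request.urlopen(build_request(media_id))` and decode the body. Error handling for HTTP 500 responses, which come up later in this thread, would have to be added there.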

jsonn commented 8 years ago

The current metadata parsing seems to work perfectly fine? You don't get a host key, but a file entry. If you follow the file link, you arrive at the m3u8 list.
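The host/file distinction could be turned into a small decision function along these lines. This is a sketch under assumptions: the element names `host` and `file` come from this thread, but the surrounding XML layout (`config`, `stream_info`) in the examples is a guess, the function name `stream_source` is hypothetical, and the standard library parser is used here rather than whatever the script itself uses.

```python
import xml.etree.ElementTree as ET


def stream_source(config_xml):
    """Decide between RTMP and HLS from the host and file entries of the config."""
    root = ET.fromstring(config_xml)
    # findtext returns None for a missing element and "" for an empty one.
    host = (root.findtext(".//host") or "").strip()
    file_entry = (root.findtext(".//file") or "").strip()
    if host:
        return {"protocol": "rtmp", "host": host, "file": file_entry}
    if file_entry:
        # No host: treat the file entry as the URL of the m3u8 list.
        return {"protocol": "hls", "playlist": file_entry}
    raise ValueError("config has neither a host nor a file entry")


# Made-up configs; the nesting is an assumption, not a captured response.
HLS_CONFIG = (
    "<config><stream_info><host></host>"
    "<file>http://example.invalid/jaJP.m3u8</file></stream_info></config>"
)
RTMP_CONFIG = (
    "<config><stream_info><host>rtmpe://example.invalid/ondemand/</host>"
    "<file>video.mp4</file></stream_info></config>"
)

if __name__ == "__main__":
    print(stream_source(HLS_CONFIG))
    print(stream_source(RTMP_CONFIG))
```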

jsonn commented 8 years ago

Or do you not start from the video URL?

iCertys commented 8 years ago

What does the media_type information mean? For me it is 1 for both HLS and rtmpe.

Can you give me a link where media_type is something other than 1?

einstein95 commented 8 years ago

@iCertys http://www.crunchyroll.com/xml/?req=RpcApiVideoPlayer_GetStandardConfig&media_id=710001

iCertys commented 8 years ago

Yes, but when I check id=710001 there is no stream behind it, and if I open it in the browser it redirects me to 680655. In which country did you test this ID?

einstein95 commented 8 years ago

What do you mean by "there is no stream behind this"?

iCertys commented 8 years ago

If I send a POST request for this ID, I get a 500 Internal Server Error. If I open the ID in the browser, I get redirected to 680655. And if I open the API link without a POST request, I get the link http://www.crunchyroll.com/media-710001/-unbekannt

"-unbekannt" is German and means "unknown", but I am using a US proxy in California. (I don't know why it is written in German.)

That is why I asked in which country you tested this, because not all streams are accessible in every country.

I tested California and Germany.

einstein95 commented 8 years ago

If you get redirected to 680655 and that video loads, then it should work.

iCertys commented 8 years ago

Yes, but I'm searching for a video where the media_type is not 1, and on the redirected video it is 1. That's my problem: I can't find a video where it is not 1 ;(

einstein95 commented 8 years ago

If you use the media ID 710001, the resulting GetStandardConfig gives media_type 4.

iCertys commented 8 years ago

OK, interesting, but I still get error 500. Anyway, I don't want to waste all your time. Thank you.

rs3mk commented 8 years ago

It works:

[image: sin titulo 8]

Add:

[image: sin titulo 9]

I have no idea why it works.

rcyclope commented 8 years ago

I don't know why, but it works.

ObiWanTwo commented 8 years ago

To begin with, the script stops running because the host variable is set to None: the host tag is actually empty in the XML file when it comes to HLS. By calling host.string you are forcing the script to raise an AttributeError. If host is None, you can't access its attributes, which is logical. Since this kind of exception is handled by the script, the code that comes right after except AttributeError: gets executed. You don't actually have to print it; you just have to access it. Anything else that accesses an attribute of host would work too.
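The mechanism described above can be reproduced in isolation. The sketch below is written for Python 3 (the thread's `print host.string` is Python 2 syntax), and `FakeTag`, `classify`, and `classify_explicit` are hypothetical names standing in for the parsed tag and the script's branching. It is not the script's actual code.

```python
class FakeTag:
    """Hypothetical stand-in for a parsed <host> tag with a .string attribute."""

    def __init__(self, string):
        self.string = string


def classify(host):
    """Mimic the workaround: touch an attribute and let the exception decide."""
    try:
        host.string  # raises AttributeError when host is None
    except AttributeError:
        return "hls"
    return "rtmp"


def classify_explicit(host):
    """Same decision without relying on an exception as control flow."""
    return "hls" if host is None else "rtmp"


if __name__ == "__main__":
    print(classify(None), classify(FakeTag("rtmpe://example.invalid/ondemand/")))
```

The explicit check is arguably easier to read, since it states the condition (an empty host means HLS) instead of hiding it behind an exception handler.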

insanedude63 commented 8 years ago

Just a suggestion, but would you be able to base your new code on the one in youtube-dl (here)? This could help with videos but not the manga. While restructuring the code, it would be nice to integrate a pip installer as well.

jsonn commented 8 years ago

@insanedude63 The code should be pretty much self-contained. It only cares about the file property from the video metadata.

jsonn commented 8 years ago

Please test #78.

VladimiPutin commented 8 years ago

I added the line print host.string and got this error: [image]

Pikanet128 commented 8 years ago

@rs3mk Thank you, thank you, that fixed it. I'm actually using an old version of the toolkit that I modified for my own purposes to get the episode title and description, auto login, a URL download queue, etc.

So I did not want to have to start over with the new version. Your fix fixed it, thank you.

@adriannx Are the tabs lined up? try:\n\thost = xmlconfig.find('host').string\n\tprint host.string Are they spaces instead of tabs? They must all be one or the other.
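One way to check the tabs-versus-spaces question without eyeballing the file is a small report like the following. It is only a sketch: the function name `indentation_report` is invented, and it just classifies each indented line by the characters in its leading whitespace.

```python
def indentation_report(source):
    """Group 1-based line numbers by what their leading whitespace contains."""
    report = {"tabs": [], "spaces": [], "mixed": []}
    for number, line in enumerate(source.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if not indent:
            continue  # unindented or empty line
        if "\t" in indent and " " in indent:
            report["mixed"].append(number)
        elif "\t" in indent:
            report["tabs"].append(number)
        else:
            report["spaces"].append(number)
    return report


if __name__ == "__main__":
    snippet = "try:\n\thost = xmlconfig.find('host').string\n    print(host.string)\n"
    print(indentation_report(snippet))
```

If both the "tabs" and "spaces" lists are non-empty for lines in the same block, that block mixes the two styles and should be converted to one of them.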