gsteph / groove-dl

A Grooveshark song downloader in Python
http://gsteph.github.io/groove-dl/

Outdated client #8

Closed DelusionalLogic closed 12 years ago

DelusionalLogic commented 12 years ago

The Grooveshark servers respond with an "outdated client" message and don't return a result. Updating the client value breaks the token.

gsteph commented 12 years ago

Looking into it.

DelusionalLogic commented 12 years ago

Awesome. I just spent 2 hours looking through my code because this happened; it worked about 3 hours ago.

Thank you for looking into it.

EDIT: it should be noted that almost everything works, the only part that is broken is getStreamKeyFromSongIDEx

DelusionalLogic commented 12 years ago

steps:

  1. Remove the check for args (I don't use the GUI).
  2. Use raw_input("Query: ") instead of sys.argv[1].
  3. Run the code and search for a song.
  4. Pick a song.

This yields:

    Traceback (most recent call last):
      File "C:\Users\Delusional Logic\Eclipse\grooveShark\main.py", line 136, in <module>
        s = 'wget --user-agent="%s" --referer=%s --post-data=streamKey=%s -O "%s - %s.mp3" "http://%s/stream.php"' % (_useragent, _referer, stream["streamKey"], s[songid]["ArtistName"], s[songid]["SongName"], stream["ip"])
    TypeError: list indices must be integers, not str

Adding a print right after setting variable stream yields {u'header': {u'session': u'c568619c187affff10c3aa8b0d00be1b', u'serviceVersion': u'20100903', u'prefetchEnabled': True, u'expiredClient': True}, u'result': {u'24340085': []}}
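The empty list under result looks like what trips the TypeError, so it may be worth checking for it before building the wget command. Below is a minimal sketch written for Python 3. It assumes the value being indexed is the per-song entry of result, which this thread does not show. The names extract_stream and StreamKeyError are hypothetical and not part of groove-dl.

```python
class StreamKeyError(Exception):
    """Raised when the server gives no usable stream data for a song."""


def extract_stream(response, song_id):
    """Return the stream dict for song_id, or raise StreamKeyError with a reason."""
    header = response.get("header", {})
    entry = response.get("result", {}).get(str(song_id))
    # In the failing response above the entry is an empty list; indexing a list
    # with a string key is what produces "list indices must be integers".
    if not isinstance(entry, dict) or "streamKey" not in entry:
        if header.get("expiredClient"):
            reason = "expiredClient flag set"
        else:
            reason = "empty result"
        raise StreamKeyError("no stream key for song %s (%s)" % (song_id, reason))
    return entry
```

With a check like this, the failure surfaces as "no stream key ... (expiredClient flag set)" instead of a TypeError from the string formatting line.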

gsteph commented 12 years ago

I think the problem's in your code. Even my CLI client works fine. Could you paste the part where you input the query, search and request the key?

DelusionalLogic commented 12 years ago

It's a much worse problem, it would seem: I might have been flagged by Grooveshark. Their web player isn't working for me either, while it's fine if I use a proxy.

EDIT: Just contacted Grooveshark support, I'll let you know what happens.

gsteph commented 12 years ago

That's actually it. My client stops working after some consecutive downloads. I think I might be able to fake the traffic better than this but I haven't had the time lately :/

DelusionalLogic commented 12 years ago

Ahh crud. Does yours come back up again? Sadly, life takes up so much damn time.

gsteph commented 12 years ago

Yeah, it does. I've no idea how long it takes though. Just change your IP now if you can (or keep using that proxy).

DelusionalLogic commented 12 years ago

My IP will reset at some point, no worries; I'll just have to reboot the modem. Thank you though.

DelusionalLogic commented 12 years ago

Quick update: I'm done with support. Apparently they did not ban my IP, but complaining got me a free month of Grooveshark Anywhere, so that's good.

gsteph commented 12 years ago

That's weird o.O. Are you a premium user ?

DelusionalLogic commented 12 years ago

Nope. I had Grooveshark Plus for a month once, but that was a year ago. When I contacted support they just told me "hey, yeah, I see no ban on your IP; anyway, here's a free month of Anywhere".

BTW, I think it broke, but this time I think it's the code, because Grooveshark itself works fine (httplib.BadStatusLine: '' in getToken()). I've tried to fix it, but to no avail.

gsteph commented 12 years ago

That problem happens here too. So far I can't fix it, been trying for a lot of time. Really not sure what's wrong. Nothing changed in their protocol.

DelusionalLogic commented 12 years ago

Good, at least now I know it's a problem in the code that I can fix, and not my network. I'll be sure to file a pull request if I get it fixed.

It seems like the browser already has a token encoded in the JSON when it sends getCommunicationToken. How can that be... hmm.

gsteph commented 12 years ago

Thanks. I'm trying to fix it myself.

And yes! I found that out too, but since I couldn't figure out how that's possible (I completely debugged the JS with firebug) I switched to html5.grooveshark.com. Uses the same API except they make more sense (no token in getCommunicationToken). I could completely replicate what it does in code, but I STILL get a 'BadStatusLine' (just in case you don't know already, that means the server returned nothing).
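To make "the server returned nothing" concrete: BadStatusLine with an empty string is what the HTTP client raises when the peer closes the socket before sending a status line. The sketch below reproduces that against a throwaway local server. It is written for Python 3, where httplib is named http.client and this case is reported as RemoteDisconnected, a BadStatusLine subclass. The helpers serve_silently, probe and demo are hypothetical names.

```python
import http.client
import socket
import threading


def serve_silently(listener):
    """Accept one connection, read the request, then close without replying."""
    conn, _ = listener.accept()
    conn.recv(65536)
    conn.close()


def probe(host, port, path="/"):
    """Return 'ok' on a normal response, or a short description of the failure."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        conn.getresponse().read()
        return "ok"
    except http.client.BadStatusLine as exc:
        # The peer closed the connection before sending any status line.
        return "no status line: %s" % type(exc).__name__
    except socket.timeout:
        # A server that keeps the socket open but never answers ends up here.
        return "timeout"
    finally:
        conn.close()


def demo():
    """Run probe() against a local server that never answers."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    worker = threading.Thread(target=serve_silently, args=(listener,))
    worker.start()
    try:
        return probe("127.0.0.1", listener.getsockname()[1])
    finally:
        worker.join()
        listener.close()
```

Calling demo() takes the "no status line" branch rather than the timeout branch, which separates an early close from a server that simply lets the client hang.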

DelusionalLogic commented 12 years ago

Yeah, I took a stab in the dark and guessed that BadStatusLine is probably just a timeout, i.e. the server is just letting us hang.

I have a theory though. It seems that the JSONEncoder is encoding in a "weird" way, with the header first. Grooveshark may have upped the ante and made the service stricter. I'm trying to encode the JSON manually right now.

UPDATE: That didn't fix it, and I'm fresh out of ideas. I tried Chrome's debug tools, Firebug and Wireshark (Wireshark being useless for SSL traffic). I learned that Firebug can decode SSL while Chrome's debug tools can't, but that's about it.
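For anyone who wants to rule out the key-order theory without hand-writing JSON, the order can be pinned explicitly. This is a minimal Python 3 sketch. encode_request is a hypothetical helper, and the header/method/parameters order is an assumption taken from the request body that appears later in this thread.

```python
import json
from collections import OrderedDict


def encode_request(method, header, parameters):
    """Serialize a request with 'header' first, then 'method', then 'parameters'."""
    payload = OrderedDict()
    payload["header"] = header
    payload["method"] = method
    payload["parameters"] = parameters
    # Compact separators drop the spaces json.dumps inserts by default.
    return json.dumps(payload, separators=(",", ":"))
```

Swapping the insertion order and resending is then a one-line change, which makes it quick to check whether the server cares about ordering at all.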

gsteph commented 12 years ago

Hmm ? They don't need to decrypt SSL since they're browser-based. I've used them both and found absolutely nothing to change. It's pretty weird.... I don't have any ideas either but I'm thinking.

DelusionalLogic commented 12 years ago

I've been looking through Grooveshark's Flash client, and it seems there is a "revisionToken" (or revtoken, if you will) with the value reallyHotSauce, whatever that means.

gsteph commented 12 years ago

That's the secret key they hash their token with. I'm past that though since the token is not our current problem. As I said, html5.grooveshark.com works without a ready token, but I replicate exactly what it does and I still receive nothing. :/ I'm currently checking if it retrieves any of the JSON vars from any other source by checking all the entries before getCommunicationToken in the Net tab in chrome/firebug. They're not many with html5.grooveshark.com
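As an illustration of what hashing a token with a secret key can look like, here is a sketch of a salted SHA-1 request token in Python 3. The layout (salt followed by the SHA-1 of "method:token:secret:salt") is an assumption made for the example. This thread does not spell out the real scheme, and make_request_token is a hypothetical name.

```python
import hashlib
import secrets


def make_request_token(method, comm_token, secret, salt=None):
    """Return salt + SHA-1("method:comm_token:secret:salt") as hex (assumed layout)."""
    if salt is None:
        salt = secrets.token_hex(3)  # 6 random hex characters
    material = ":".join([method, comm_token, secret, salt])
    return salt + hashlib.sha1(material.encode("utf-8")).hexdigest()
```

In a scheme of this shape, the server can recompute the same hash from the salt prefix, so a client that does not know the secret cannot produce a valid token.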

DelusionalLogic commented 12 years ago

It must be in the way they do it, something must be off... Decompiled Flash is a nightmare for a C# dev though, and the way they reference JS all of a sudden, it's hurting my brain.

DelusionalLogic commented 12 years ago

After a shitload of decompiling, looking and being confused i finally accepted that it might just be a quirk. Guess what, it is.

If you use Chrome (or just have Chrome installed), get the "REST Console" extension. I pasted in the JSON that the program spits out (I added a print that gives me the json.JSONEncoder.encode(post) result) and posted that to Grooveshark with the default headers, and Grooveshark gave me a result back.

Grooveshark didn't change anything, something arbitrary broke the program.

Attached are the headers the program used:

    Content-Type: application/json
    Accept: */*
    Connection: keep-alive
    Origin: chrome-extension://rest-console-id
    User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.56 Safari/536.5

EDIT: Wait, wait, wait, hold my horses. Maybe, just maybe, the problem is Connection: keep-alive; from what I know, httplib doesn't support keep-alive.

This might not be meant to break bots, but simply to reduce server problems (due to the server closing the connection all the time); they might have removed the servers' ability to accept Connection: close requests.

I'm just thinking out loud though.
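One way to test the keep-alive theory is to send the same request twice, changing only the Connection header. This is a rough Python 3 sketch using http.client, the renamed httplib. build_headers and post_json are hypothetical helpers, and the Accept value of */* is an assumption about what the REST Console sent.

```python
import http.client
import json


def build_headers(keep_alive=True):
    """Headers modelled on the REST Console request quoted above."""
    return {
        "Content-Type": "application/json",
        "Accept": "*/*",
        "Connection": "keep-alive" if keep_alive else "close",
    }


def post_json(host, path, payload, keep_alive=True):
    """POST payload as JSON over HTTPS and return (status, body)."""
    conn = http.client.HTTPSConnection(host, timeout=10)
    try:
        conn.request("POST", path, body=json.dumps(payload),
                     headers=build_headers(keep_alive))
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

If both variants fail the same way, the Connection header is probably not the culprit.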

gsteph commented 12 years ago

Great find! Here it sends me 20 bytes of garbage gzip'd data (garbage here being 'indecompressable' (wireshark said so)) but at least it sends something! Grooveshark must have changed something. My scripts wouldn't just stop working suddenly for no reason. Still need to figure out how to make it send back something in python.

DelusionalLogic commented 12 years ago

I just posted on StackOverflow, looking for a solution, if you want to read my efforts, or just see what people reply look here: http://stackoverflow.com/questions/11059229/grooveshark-closing-connection-early-httplib-badstatusline (maybe you can comment with your findings)

gsteph commented 12 years ago

Thanks for that. I hope you don't get rated down. What we're doing here may be deemed 'unethical'... So far what I realize is: everything works fine in HTTP, nothing at all works with HTTPS. That is: whatEVER you may think of sending to grooveshark.com:443 will just return a BadStatusLine. I'll get this posted on your question.

DelusionalLogic commented 12 years ago

Well, I might get downvoted, but the internet was created by hackers, and honestly, this is what makes programming fun.

Sadly, Grooveshark returns "HTTPS required" if you don't send it to port 443. Maybe they changed their port? What confuses me is that the REST Console does just fine, even without faking any headers. We might have hit a limitation of Python?

gsteph commented 12 years ago

I also doubt you'll get anything useful. No one would know why this would happen except Grooveshark themselves. Can't get it posted on your question (as a comment). Not enough rep :/

gsteph commented 12 years ago

Haha that's pretty true :D

gsteph commented 12 years ago

Here's something interesting: Groovedown works. It uses 'mobile' shark though. And we of course can't know how it works because there's SSL.

DelusionalLogic commented 12 years ago

Well, I know that some Grooveshark devs use Stack Overflow; I highly doubt they want to help us though. Who knows, nothing gets programmers excited like a good mystery.

gsteph commented 12 years ago

They might come and make fun of us xD

DelusionalLogic commented 12 years ago

They might, would be pretty funny. I'll append your discovery of no SSL working at the end of the question if you are okay with it.

gsteph commented 12 years ago

Sure.

gsteph commented 12 years ago

Yep, it makes me think it's Python too :/ still wouldn't explain why the thing would just suddenly fail without any changes to the code. It was working great. I might start writing some C soon.

DelusionalLogic commented 12 years ago

I never saw your post about Groovedown, so sorry for ignoring that ;). But yes, SciLor's Grooveshark downloader works too. It uses the full version, and it's written in C#.

If you know C#, the DLL isn't obfuscated, so have a look.

I have no idea about C, sadly; that was one of the languages I just skipped. C# seems so much nicer with all its higher-level features.

gsteph commented 12 years ago

I didn't need C at all. Prepare to be outraged. A simple

    curl -H "Content-Type: text/plain" -d "@jsontest" "https://grooveshark.com/more.php?getCommunicationToken" -v

on a Linux box got me a token... jsontest here being

    {"header":{"client":"mobileshark","clientRevision":"20120227","privacy":0,"country":{"ID":63,"CC1":4611686018427388000,"CC2":0,"CC3":0,"CC4":0,"DMA":0,"IPR":0},"uuid":"BF5D03EE-91BB-40C9-BE7B-11FD43CAF0F0","session":"1d9989644c5eba85958d675b421fb0ac"},"method":"getCommunicationToken","parameters":{"secretKey":"230147db390cf31fc3b8008e85f8a7f1"}}
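Since curl succeeds where the Python HTTPS stack fails, one stopgap is to shell out to curl from the script. This is a minimal Python 3 sketch that reuses only the URL and the -H and -d flags from the command above, leaving out -v. curl_command and fetch_with_curl are hypothetical helpers, and curl is assumed to be on the PATH.

```python
import subprocess

TOKEN_URL = "https://grooveshark.com/more.php?getCommunicationToken"


def curl_command(url, body_path):
    """Build the argument list for the curl call quoted above (without -v)."""
    return ["curl", "-H", "Content-Type: text/plain", "-d", "@" + body_path, url]


def fetch_with_curl(url, body_path):
    """Run curl and return the raw response body as bytes."""
    result = subprocess.run(curl_command(url, body_path),
                            stdout=subprocess.PIPE, check=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with the JSON file name and the URL's query string.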

Even when the json is not syntactically correct, it always returns at least some headers! It's been Python all along...

DelusionalLogic commented 12 years ago

i hate you python D:

I still wonder why this happens.

gsteph commented 12 years ago

Update your question with that, maybe someone will figure out why Python couldn't handle the SSL here. Also you could try keeping the Grooveshark references and research as low as possible generalizing this as a Python issue. That should attract some.

DelusionalLogic commented 12 years ago

It has been updated. I might post a new question tomorrow, completely unrelated to Grooveshark, asking why Python does this. But living in Denmark, it's 2:40 at night here now, so I think I'll have to call it a day. I'll give you a link when I get around to posting it.

Then again, you could post it.

gsteph commented 12 years ago

Heh exact same time here! I think I'll mail the python mailing list directly. Turns out the same code works on linux. This is a 'Python for windows' problem. I'll research more and decide. Good night!

DelusionalLogic commented 12 years ago

Good night!

gsteph commented 12 years ago

aaand here's the issue: http://bugs.python.org/issue15082 Let me know if I missed something.

DelusionalLogic commented 12 years ago

Well, at least we know what the problem is now. Sadly, I can't get OpenSSL to compile, so I can't build a Python version with OpenSSL 1.0.1.
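Before rebuilding anything, it can help to check which OpenSSL the interpreter is actually linked against. This is a small sketch using the standard ssl module. openssl_at_least and describe_openssl are hypothetical helper names, and the 1.0.1 threshold is simply the version mentioned above.

```python
import ssl


def openssl_at_least(required=(1, 0, 1)):
    """True if the OpenSSL linked into this interpreter is at least `required`."""
    return ssl.OPENSSL_VERSION_INFO[:len(required)] >= tuple(required)


def describe_openssl():
    """Return the linked OpenSSL version string with a note about 1.0.1."""
    relation = ">=" if openssl_at_least() else "<"
    return "%s (%s 1.0.1)" % (ssl.OPENSSL_VERSION, relation)
```

Running describe_openssl() on the Windows and Linux boxes side by side would show whether the two Pythons differ in their OpenSSL builds.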

gsteph commented 12 years ago

I'm almost done building a 1.0.1 _ssl.pyd. I just need a way to force VS2010 to build against MSVCR90.DLL instead of MSVCR100.DLL because that's what Python uses... So far I've found nothing.
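To confirm which compiler, and therefore which C runtime, a given Python build expects, the interpreter can be asked directly. The sketch below maps the MSC version number reported by platform.python_compiler() to a Visual Studio release. build_toolchain is a hypothetical helper, and the table covers only the two versions discussed here.

```python
import platform
import re

# MSC version numbers reported by CPython, mapped to Visual Studio releases.
MSC_TO_VISUAL_STUDIO = {
    1500: "Visual Studio 2008 (MSVCR90.DLL)",
    1600: "Visual Studio 2010 (MSVCR100.DLL)",
}


def build_toolchain(compiler=None):
    """Describe the toolchain named in a platform.python_compiler() string."""
    if compiler is None:
        compiler = platform.python_compiler()
    match = re.search(r"MSC v\.(\d+)", compiler)
    if match is None:
        return "not an MSVC build: " + compiler
    version = int(match.group(1))
    return MSC_TO_VISUAL_STUDIO.get(version, "MSC v.%d" % version)
```

An extension module such as _ssl.pyd generally needs to be built with the same toolchain the interpreter reports here, which is why the VS2010 build drags in the wrong runtime.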

DelusionalLogic commented 12 years ago

As always, stack overflow provides a sad, yet true answer:

http://stackoverflow.com/questions/4679414/compiling-python-extensions-with-vc2010

I tried to compile all of python in VS2010, couldn't do it. then again, never done it before.

then again, the internet has you covered, somewhere.

gsteph commented 12 years ago

Uh there must be some way.. :/ _ssl.pyd so far compiles btw, it just has that annoying VCR100 dependency.

Then again I could just link statically, but it's just not the right thing to do here..

Edit: I'll try to use MinGW if I have it lying around.

DelusionalLogic commented 12 years ago

well it's not right, and it might cause some problems.

Couldn't the internet, you know, hook you up with VC2008? I don't know how GitHub likes it, so I won't link it, but it's there.

gsteph commented 12 years ago

well sure, I won't even need a 'hook up'. The express editions are available. Thing is, my internet connection is pretty slow and I don't think I have enough space (been postponing organizing my disks for a while now). This would take forever. Nope, don't have MinGW. I think I'll have to do it the VC2008 way.. might be small.

gsteph commented 12 years ago

90 MB.. perfect :D

DelusionalLogic commented 12 years ago

Cool :D. Can the Express edition actually compile stuff this complicated? It's usually not supported due to its limitations.

gsteph commented 12 years ago

Python says it's supported. Quote from PCBuild/Readme.txt "This directory is used to build Python for Win32 and x64 platforms, e.g. Windows 2000, XP, Vista and Windows Server 2008. In order to build 32-bit debug and release executables, Microsoft Visual C++ 2008 Express Edition is required at the very least."