colonelpanic8 / okcupyd

A library that enables programmatic interaction with okcupid.com, using okcupid.com's private JSON API and HTML scraping when necessary.
MIT License
110 stars 18 forks

Some POST requests (/questions/ask and /rate) don't work for some accounts #8

Closed nderkach closed 10 years ago

nderkach commented 10 years ago

Steps to reproduce.

    profile = User().quickmatch()
    profile.rate(5)

Also, I've noted that profile.refresh() in profile_test.py seems to be redundant as you do a refresh in rate() itself.

Now, the rating seems to work most of the time, so I'm still trying to figure it out.

@IvanMalison could you explain how you calculate rating?

    width_percentage = int(''.join(c for c in rating_style if c.isdigit()))
    return (width_percentage // 100) * 5

Shouldn't it be simply width_percentage // 20? And in case you have widths more than 100, you could do something like width_percentage % 100 // 20

The problem is that with the current code the first division (width_percentage // 100) yields zero for any width below 100.
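To make the integer-division problem concrete, here is a small sketch comparing the two formulas. The function names and the `width: NN%` style strings are assumptions made for illustration; only the two arithmetic expressions come from the thread.

```python
def width_from_style(rating_style):
    # Pull the digits out of a style string such as "width: 60%;".
    return int(''.join(c for c in rating_style if c.isdigit()))


def rating_floor_then_scale(rating_style):
    # Formula quoted in the issue: floors to 0 for every width below 100.
    return (width_from_style(rating_style) // 100) * 5


def rating_by_twenty(rating_style):
    # Proposed alternative: each star corresponds to 20% of the width.
    return width_from_style(rating_style) // 20


for style in ("width: 20%;", "width: 60%;", "width: 100%;"):
    print(style, rating_floor_then_scale(style), rating_by_twenty(style))
```

Note that the `width_percentage % 100 // 20` variant mentioned above would map a width of exactly 100 to 0, so any handling of widths over 100 would need a different guard.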

colonelpanic8 commented 10 years ago

Yeah... That is just silly. The rating is only ever going to work if it's 5 -- it's going to display 0 otherwise.

https://github.com/IvanMalison/okcupyd/commit/f36806aa44b936f093651ed55a0b8b654b8ad701 should fix the issue. Let me know if it doesn't for you.

nderkach commented 10 years ago

rate() is still not working for me though, investigating it atm

nderkach commented 10 years ago

Also, what's the reason for doing a refresh every time you rate at https://github.com/IvanMalison/okcupyd/blob/master/okcupyd/profile.py#L376 ?

Rating should work without a refresh (as you just send a POST there), thus I think that it makes more sense to do a refresh just in the unittest.

colonelpanic8 commented 10 years ago

Rating does work without the refresh, but the rating cached property will now be incorrect. Adding the refresh there busts the caches of all of the cached properties which makes it so that they will be updated if they are accessed again. It would be fine to do a self.refresh(reload=False), which would delay the retrieval of the profile tree to the point at which it is needed.
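A minimal sketch of the cache-busting behaviour described here. `FakeServer` and `SketchProfile` are hypothetical stand-ins, not okcupyd classes; only the idea of `refresh(reload=False)` clearing the cache and deferring the fetch comes from the comment above.

```python
class FakeServer:
    """Stands in for the remote site; counts how often it is queried."""

    def __init__(self):
        self.rating = 0
        self.fetch_count = 0

    def get_rating(self):
        self.fetch_count += 1
        return self.rating

    def post_rating(self, value):
        self.rating = value


class SketchProfile:
    def __init__(self, server):
        self._server = server
        self._cache = {}

    @property
    def rating(self):
        # Fetch lazily and remember the value until the cache is cleared.
        if "rating" not in self._cache:
            self._cache["rating"] = self._server.get_rating()
        return self._cache["rating"]

    def refresh(self, reload=True):
        # Always drop cached values; refetch right away only if asked to.
        self._cache.clear()
        if reload:
            self._cache["rating"] = self._server.get_rating()

    def rate(self, value):
        self._server.post_rating(value)
        self.refresh(reload=False)  # the next access to .rating refetches
```

With `reload=False`, rating a profile costs no extra fetch unless `rating` is read again afterwards.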

colonelpanic8 commented 10 years ago

oh wait... how did you install okcupyd? did you use pip? I didn't put version requirements on any of the packages, but I think that this needs a relatively new version of requests. Have you tried running using tox? what are you using to get an interactive shell?

colonelpanic8 commented 10 years ago

Also, if you are running tests, you should be aware that they all use vcrpy, so they don't make any actual http requests. How exactly are you making the rating call, and what version of python are you using?

Can you paste the output of pip freeze in whatever environment you are running in?

nderkach commented 10 years ago

@IvanMalison Got it, but since the only thing in a profile affected by rate() is the actual rating, wouldn't it make sense NOT to do a full profile refresh? Thus, maybe using self.refresh(reload=False) indeed makes more sense here.

colonelpanic8 commented 10 years ago

Well the thing is that if you want to access the rating again, you HAVE to reload the profile because that's how the rating is retrieved. Doing self.refresh(reload=False) would still work because the profile would be reloaded when the rating is requested. I think it's a good change though. Submit a PR.

Did you figure out why rate wasn't working?

nderkach commented 10 years ago

I use the virtualenv provided with unittests (tox -e interactive). Python 3.4.2rc1, requests 2.4.1

So you are using fake http requests and then have an assert that the rating actually changed? How does that work?

colonelpanic8 commented 10 years ago

I'm using a library called vcrpy (https://github.com/kevin1024/vcrpy) that I have been contributing to recently. It will run an actual http interaction if it has never been performed before and record it to a flat file. After that, all outgoing http requests will be intercepted -- the recorded responses will be used instead.
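The record-then-replay idea can be sketched with the standard library alone. This is not vcrpy's API or implementation; `TinyCassette` and its `live_fetch` callback are hypothetical names used only to illustrate the behaviour described above.

```python
import json
import os


class TinyCassette:
    """Record responses to a JSON file on first use, replay them afterwards."""

    def __init__(self, path, live_fetch):
        self.path = path
        self.live_fetch = live_fetch  # callable performing the real request
        self.recorded = {}
        if os.path.exists(path):
            with open(path) as handle:
                self.recorded = json.load(handle)

    def get(self, url):
        if url not in self.recorded:
            # Never seen before: do the live interaction and persist it.
            self.recorded[url] = self.live_fetch(url)
            with open(self.path, "w") as handle:
                json.dump(self.recorded, handle)
        # Seen before: the stored response is returned, no live call is made.
        return self.recorded[url]
```

A second `TinyCassette` pointed at the same file replays the stored response and never calls `live_fetch`, which is why a recorded test can pass without touching the network.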

In this case, the assertion is still testing something. You'll notice that if you remove the refresh line, for example, the test will fail. The second request is the one that will have the updated rating.

I've set it up so that it is really easy to rerecord requests in a safe way. There is a tox environment called rerecord that will delete the old cassette and then scrub all pii (username/password) from any requests. It will make live http requests though.

It would be great if you could test the rerecord functionality and make sure it works for you.
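As a rough illustration of what scrubbing credentials from a recorded request body can look like, here is a standard-library sketch. The function name, the `REDACTED` placeholder and the field names are assumptions; the rerecord environment's actual scrubbing logic is not shown in this thread.

```python
from urllib.parse import parse_qsl, urlencode


def scrub_form_body(body, sensitive=("username", "password"),
                    placeholder="REDACTED"):
    # Parse the form-encoded body, keeping order and blank values.
    pairs = parse_qsl(body, keep_blank_values=True)
    # Replace the values of sensitive fields and re-encode everything.
    return urlencode(
        [(key, placeholder if key in sensitive else value)
         for key, value in pairs]
    )


print(scrub_form_body("username=alice&password=hunter2&remember=1"))
```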

colonelpanic8 commented 10 years ago

Is this still an issue for you or can I close this?

nderkach commented 10 years ago

Indeed, when running with rerecord environment, the assertion with rating fails.

    profile = User().quickmatch()
    profile.rate(1)
    >       assert profile.rating == 1
    E       assert 0 == 1

colonelpanic8 commented 10 years ago

can you try running with --enable-logger='requests'

colonelpanic8 commented 10 years ago

or maybe --enable-logger='vcr'

colonelpanic8 commented 10 years ago

You could also throw an import ipdb; ipdb.set_trace() in the call to rating to inspect what the response object looks like. Things that might be worth checking include the response content, the http status code, and the requested path. I have a sneaking suspicion that the parameters just aren't making it into the request.

nderkach commented 10 years ago

For some reason I keep getting a 302 response when doing a POST

    INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): www.okcupid.com
    send: b'POST /vote_handler HTTP/1.1\r\nHost: www.okcupid.com\r\nContent-Length: 131\r\nConnection: keep-alive\r\nAccept: */*\r\nCookie: secure_login=1; __cfduid=d8bf5792eafcb1808577c51c30ce21f301411865461203; authlink=5ba0c10a; override_session=0\r\nUser-Agent: python-requests/2.4.1 CPython/3.4.2rc1 Darwin/13.4.0\r\nAccept-Encoding: gzip, deflate\r\nContent-Type: application/x-www-form-urlencoded\r\n\r\nvote_type=personality&voterid=3922757959941507571&target_objectid=0&target_userid=5645054601833689048&type=vote&cf=profile2&score=2'
    reply: 'HTTP/1.1 302\r\n'
    DEBUG:requests.packages.urllib3.connectionpool:"POST /vote_handler HTTP/1.1" 302 None
    header: Date header: Transfer-Encoding header: Connection header: X-OKWS-Version header: Location header: P3P header: X-XSS-Protection header: Set-Cookie header: Server header: CF-RAY
    send: b'GET /vote_handler HTTP/1.1\r\nHost: www.okcupid.com\r\nConnection: keep-alive\r\nAccept: */*\r\nCookie: secure_login=1; guest=29175215940890607; session=3922757959941507571%3a11774008562404935458; __cfduid=d8bf5792eafcb1808577c51c30ce21f301411865461203; authlink=5ba0c10a; override_session=0; secure_check=1\r\nUser-Agent: python-requests/2.4.1 CPython/3.4.2rc1 Darwin/13.4.0\r\nAccept-Encoding: gzip, deflate\r\nContent-Type: application/x-www-form-urlencoded\r\n\r\n'
    reply: 'HTTP/1.1 200 OK\r\n'
    DEBUG:requests.packages.urllib3.connectionpool:"GET /vote_handler HTTP/1.1" 200 None
    header: Server header: Date header: Content-Type header: Transfer-Encoding header: Connection header: Cache-control header: X-OKWS-Version header: P3P header: X-XSS-Protection header: CF-RAY header: Content-Encoding
    b'{}\r\n'
    {} 200
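The log shows the POST being answered with a 302, after which the client issues a GET to the same path with an empty body, so the vote parameters never reach the server on the second request. The sketch below models the method-rewriting rule that, to my knowledge, requests applies when following redirects. `method_after_redirect` is a hypothetical helper written for illustration, not a function from requests.

```python
def method_after_redirect(method, status):
    """Return the HTTP method a client would use after following a redirect."""
    # 303 See Other: everything except HEAD becomes GET.
    if status == 303 and method != "HEAD":
        return "GET"
    # 302 Found: browsers, and clients that copy them, also switch to GET.
    if status == 302 and method != "HEAD":
        return "GET"
    # 301 Moved Permanently: a POST is turned into a GET.
    if status == 301 and method == "POST":
        return "GET"
    # 307 and 308 (and anything else) keep the original method.
    return method


print(method_after_redirect("POST", 302))
```

When the method changes to GET the request body is dropped as well, which matches the empty second request in the log.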

colonelpanic8 commented 10 years ago

Hmmm, it seems to be an issue with python3 and not with python2. Interesting. I'm taking a look. In the meantime, if you use interactive (with the commit that I just pushed) it should work.

colonelpanic8 commented 10 years ago

Also, because the test passes with a recorded python 2 response, it means that something is wrong with the way the request is being made in python 3.

colonelpanic8 commented 10 years ago

I've started having this issue, but only with one of my profiles... It does not appear to have anything to do with python 2/3. @nderkach Weren't you saying that you were only having this issue with one of your profiles?

nderkach commented 10 years ago

Yes, I have this issue with my own account; the other test account I have works fine.

PS: @IvanMalison what is a good email to reach you at? Does the one in your github profile work?

colonelpanic8 commented 10 years ago

yeah that one works well.

colonelpanic8 commented 10 years ago

That is bizarre. I'm logging a warning when the response doesn't indicate that the rating was successful. I'm running out of ideas about what this could be, because I've looked at the http request that is sent from the browser, and the one that is being sent from rate is practically identical. The only things that could be different are the headers, but how that would explain why it works with some accounts and not others is beyond me. Perhaps we should try throwing in all the headers from the website to see what happens.
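One way to compare the two requests systematically is to diff their header sets. The helper below is a hypothetical sketch, and the example header values are invented for illustration rather than captured from okcupid.com.

```python
def diff_headers(browser_headers, library_headers):
    """Compare two header dicts case-insensitively by header name."""
    browser = {name.lower(): value for name, value in browser_headers.items()}
    library = {name.lower(): value for name, value in library_headers.items()}
    shared = set(browser) & set(library)
    return {
        "only_in_browser": sorted(set(browser) - set(library)),
        "only_in_library": sorted(set(library) - set(browser)),
        "different": sorted(k for k in shared if browser[k] != library[k]),
    }


# Invented example values, for illustration only.
browser = {"User-Agent": "Mozilla/5.0", "X-Requested-With": "XMLHttpRequest",
           "Accept": "*/*"}
library = {"user-agent": "python-requests/2.4.1", "Accept": "*/*",
           "Connection": "keep-alive"}
print(diff_headers(browser, library))
```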

colonelpanic8 commented 10 years ago

This issue appears to extend to answering questions...

colonelpanic8 commented 10 years ago

It seems like this was a protocol issue. For some reason, certain requests are expected to be plain http, NOT https. Why this happens in particular for this set of requests is beyond me. It actually seems pretty strange since the POST requests are probably more sensitive than the GET requests. I'm not 100% sure about this yet, so if you can help me verify @nderkach it would be appreciated.

colonelpanic8 commented 10 years ago

I was correct about the problem, but making those requests insecure has now broken it for the set of accounts that were previously working. I think we are going to have to look at the session cookies to determine which protocol to use.
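A rough sketch of what choosing the protocol from the session cookies could look like. Which cookie actually carries the flag is not stated in this thread; `secure_login` appears in the logged request, but treating it as the deciding flag is an assumption, and both function names are hypothetical.

```python
def scheme_for_session(cookies, flag_cookie="secure_login"):
    # Assumption: a cookie value of "1" marks an account that expects https.
    return "https" if cookies.get(flag_cookie) == "1" else "http"


def build_url(cookies, path, host="www.okcupid.com"):
    # Pick the scheme per session rather than hard-coding it per endpoint.
    return "{}://{}{}".format(scheme_for_session(cookies), host, path)


print(build_url({"secure_login": "1"}, "/vote_handler"))
print(build_url({}, "/vote_handler"))
```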

colonelpanic8 commented 10 years ago

Things should work for both kinds of accounts with https://github.com/IvanMalison/okcupyd/commit/4e43f90d8a8608e76753924bba6838caef534f5a

How does this secure flag get set and unset for particular accounts?