kevin1024 / vcrpy

Automatically mock your HTTP interactions to simplify and speed up testing
MIT License
2.71k stars 387 forks source link

VCR.py should be OAuth-timestamp / nonce agnostic #30

Closed fatuhoku closed 11 years ago

fatuhoku commented 11 years ago

This issue is similar to #28 in spirit: that one was about a changing boundary string, this one is about data within the request generally that changes each time we invoke it: especially request parameters.

Nonces in particular are designed exactly to prevent replay attacks from occurring so unsurprisingly this is potentially a problem for VCR.py. They're usually just passed to the server as a parameter.

Using the Flickr API as an example. Three request parameters necessarily change with every request:

These fields may be found in the request body (vcr.Request.body) (in the case of multipart/formdata or query string (vcr.Request.path). Because both of these are included in the hash of the Request, these changing values cause cache misses and VCR fails retrieve pre-canned responses from cassettes.

Just to give you an idea, requests look a bit like this:

Accept-Encoding: gzip, deflate, compress
User-Agent: python-requests/1.2.3 CPython/2.7.5 Darwin/12.4.0
Content-Length: 40093
Accept: */*
Connection: close
Content-Type: multipart/form-data; boundary=5094bb8eab0e4eb3a0188ee90f9a2f4c
Host: SOME_URL

--5094bb8eab0e4eb3a0188ee90f9a2f4c
Content-Disposition: form-data; name="oauth_nonce"

36636389
--5094bb8eab0e4eb3a0188ee90f9a2f4c
Content-Disposition: form-data; name="oauth_timestamp"

1379361625
...
--5094bb8eab0e4eb3a0188ee90f9a2f4c
Content-Disposition: form-data; name="oauth_signature"

P/0/vYDCD/ztPNUWF07wDcRqODI=

Solution:

The only way for VCR to still work in the light of these pesky authenticated requests is to omit these fields when calculating the hash or identity of the request. The hash or identity should be agnostic to a user-defined set of changing fields such as { oauth_signature, oauth_timestamp, oauth_nonce }.

This should be configurable directly from the VCR.py API in the with cassette() block or something.

It looks like VCR will have to parse the request then purposefully omit the request parameters (whether in query string or in body) before forming a new signature-less request to hash with.

kevin1024 commented 11 years ago

I would like to add support for custom request matchers. This would allow you to register a function that takes two Request objects, and returns True if they match, False if they don't. I think this would solve this issue.

I have an API already for registering custom serializers, I would like to use the same registration mechanism.

kevin1024 commented 11 years ago

Actually, providing a mode where it just matches on request path + method, and ignoring body contents for matching purposes, might solve this problem as well?

fatuhoku commented 11 years ago

Regarding your first message:

Aye, that's a great idea: it throws the ball into the user's court. In my case, I would still need to parse the request and strip out the parameters.

About request path + method: yes, this is actually fine. This feels like we're moving towards HTTPretty — VCR. TBH I'd love to see these two projects eventually converge... haha. HTTPretty with recording/playback features would be great (EDIT: ah, well, I just saw this. Always ahead of the game, aren't you!)

So yes; when recording, record everything but when looking up responses just match on request path and method.

How would you implement this?

kevin1024 commented 11 years ago

Just implemented it, check out the staging branch and let me know if that works for you :)

kevin1024 commented 11 years ago

I went ahead and released the new request-matching framework, I think this should solve this issue.

fatuhoku commented 11 years ago

I will give it a shot... that was very quick!