Closed bboe closed 9 years ago
So there's going to be a severe layers violation here if we try to do that. The preparation of that multipart body happens well before the betamax adapter sees it. That stuff happens before we even get the data.
There's no way we as a library can do this and I'm not certain what the best fix is. As I see it, we have two options:
MultipartEncoder
which allows users to stream large uploads & to select their own boundary.MultipartDecoder
to parse the body and do matching that way.Personally I'm leaning towards 2. Thoughts @bboe?
Note: Currently the MultipartDecoder is designed to work with responses, but it can just as easily work with request objects.
I suppose #2 is the way to go if we cannot change the multipart boundary. My other option would be to get requests to accept specifying the multipart boundary (urlib3 does, but not requests) so that I can set it static (there's not reason to change the boundary per request).
I started down this former path (not actual decoding though) and ran into another set of encoding errors between versions of python as the request in memory is in binary format.
My other option would be to get requests to accept specifying the multipart boundary (urlib3 does, but not requests) so that I can set it static (there's not reason to change the boundary per request).
So this isn't something people frequently ask for. The "blessed" way of doing that is with the MultipartEncoder
from the toolbelt. I think @lukasa would agree that there isn't a good API that we can think of that would provide that functionality in requests.
I started down this former path (not actual decoding though) and ran into another set of encoding errors between versions of python as the request in memory is in binary format.
You mean the request stored by Betamax or something else? I suspect that the MultipartDecoder will be able to handle this. I haven't used it extensively, but it appeared to be very robust and well built.
Since this seems amenable, I'll prioritize this. I'm running the PSF's elections next week so I may not be able to attend to this until after I kick those off. By the @bboe, if you're not already a contributing member of the PSF, you should become one. It's free!
@bboe I started working on this in sigmavirus24/betamax_matchers@6155015
Awesome! Thanks @sigmavirus24
So I have a question for you @bboe:
If you take a look at sigmavirus24/betamax_matchers@2199dea, you'll see that a first implementation of this makes it fairly... naïve. Can you point me to how you're using files=
in requests? I think I'm going to make it sort the parts anyway, but that will have to happen by each part's headers which should be the same for parts in semantically equivalent multipart/form-data
requests.
I haven't tested any of this, but if you can provide early feedback that'd be awesome.
It's a little complicated but here goes.
The following are the tests that I would like to run. They are commented out currently because they are not replayable. Also the test methods should be decorated with betamax
, rather than calling betamax_init
inside the tests:
https://github.com/praw-dev/praw/blob/master/tests/test_images.py#L86
PRAW uses files
by preparing a request associated with a dummy session:
https://github.com/praw-dev/praw/blob/f85fbba050d08bc1b0102979a9b3c91805878a79/praw/internal.py#L160
This prepared request can be used in one of two ways. Most commonly it is sent to a rate limit handler that ensures requests are only made within the API's time limits: https://github.com/praw-dev/praw/blob/f85fbba050d08bc1b0102979a9b3c91805878a79/praw/handlers.py#L101
The second way is to permit distributed rate limit handling. In that case, the prepared request is serialized and transmitted to another process that handles the request:
I hope that helps some. I really appreciate you looking into it. I should have some time this weekend to be of any assistance.
So looking at the upload_image
handler I think I should be able to solve your problem by sorting the parts. I'll take a much closer look tonight though to be safe.
Oh and speaking of a betamax decorator, we've had a similar request here: https://github.com/sigmavirus24/betamax/issues/61
To make your testing easier @bboe I have released betamax_matchers
0.2.0 (https://github.com/sigmavirus24/betamax_matchers#multipart-form-data-body-matcher). Please file bugs there if you find problems with the matcher.
:beer: Cheers!
Thank you! I will look at it this weekend. I appreciate your efforts :+1:
When specifying files via
files=[...]
requests results in choosing a differentmultipart_boundary
each time, thus making it quite difficult to match. There also appear to be some issues with cassettes and binary data for body matching.Perhaps when using the betamax adapter, the
multipart_boundary
can be fixed to a constant value rather than auuid
that requests uses.