psf / requests

A simple, yet elegant, HTTP library.
https://requests.readthedocs.io/en/latest/
Apache License 2.0

request.files is empty after POSTing a file #2505

Closed bepetersn closed 9 years ago

bepetersn commented 9 years ago

Hey guys, I don't know if this is the right place to file this bug report, but I could use some help; I have been banging my head against a wall for a while. I'm reproducing the error in this repo: https://github.com/bepetersn/special-repo.

Using Flask, I'm seeing odd behavior in its request object. After uploading a file with the requests library, e.g. requests.post(uri, files=<my_files>), request.files was empty by the time the request reached my view function. Oddly, the contents of the file itself were available under request.form[None].

After quite a lot of debugging, I saw that Werkzeug received roughly the following as the multipart/form-data-encoded body, and proceeded to try to parse it:

Content-Disposition: form-data; name*=utf-8\'\'Spirit%20Airlines%20-%20cheap%20tickets%2C%0A%20cheap%20flights%2C%20discount%20airfare%2C%20cheap%20hotels%2C%20cheap%20car%20rentals%2C%20cheap%20travel.pdf; filename*=utf-8\'\'Spirit%20Airlines%20-%20cheap%20tickets%2C%0A%20cheap%20flights%2C%20discount%20airfare%2C%20cheap%20hotels%2C%20cheap%20car%20rentals%2C%20cheap%20travel.pdf\r\n

Do you notice the little "*" characters just after the "name" and "filename" parameter names? As far as I can tell, Werkzeug looks for exactly "name" and "filename", doesn't know how to parse the starred variants in their place, fails to find an attachment name, and thus doesn't create a werkzeug.datastructures.FileStorage. For this reason, request.files is empty.
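To illustrate what I think is happening, here is a minimal sketch of a parser that only looks up the plain parameter names. This is NOT Werkzeug's actual code; parse_disposition_params is a hypothetical helper, and the header values are shortened stand-ins for the real one above.

```python
def parse_disposition_params(header_value):
    """Naively split 'form-data; a=b; c=d' into a dict of parameters.

    Hypothetical helper for illustration only. Splitting on ';' is not
    robust (a quoted value may contain ';'), but it is enough to show
    the lookup problem.
    """
    params = {}
    # Skip the first segment ("form-data") and walk the parameters.
    for segment in header_value.split(";")[1:]:
        key, _, value = segment.strip().partition("=")
        params[key] = value.strip('"')
    return params


# Plain form: the keys are exactly "name" and "filename".
plain = parse_disposition_params(
    'form-data; name="file"; filename="report.pdf"'
)

# Extended form: the keys become "name*" and "filename*".
extended = parse_disposition_params(
    "form-data; name*=utf-8''file; filename*=utf-8''a%0Ab.pdf"
)

print(plain.get("filename"))     # found under the plain key
print(extended.get("filename"))  # a lookup for exactly "filename" misses
print(sorted(extended))          # only the starred keys are present
```

A parser that looks only for the exact keys finds neither a field name nor a filename in the second header, which would match the empty request.files and the request.form[None] entry I'm seeing.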

What does anyone think? Is requests responsible for adding these extra characters in?

bepetersn commented 9 years ago

I see the line responsible here: https://github.com/kennethreitz/requests/blob/35d083e1665beff39aabe47a79cd1f867b897b0c/requests/packages/urllib3/fields.py#L45. However, it's pretty clear that it's intentional. I'm curious whether anyone can point me to its purpose (maybe something to do with RFC 2231?), and whether it's really correct, whether I'm somehow giving requests bad data, or whether werkzeug isn't handling this correctly.

Lukasa commented 9 years ago

This problem is almost certainly because you're passing a unicode string into requests' 'files' parameter. Make sure your strings are all bytestrings first. =)

bepetersn commented 9 years ago

I will try that. :) Beats reading about 4 RFCs to check if I should actually be wanting to submit a pull request to werkzeug.

bepetersn commented 9 years ago

Well, I don't fully get the logic of it, but what I actually needed to do was get rid of the newline that was in the filename I was passing to the files parameter of requests.post.
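For anyone landing here later, this is roughly the workaround, as a minimal sketch. sanitize_filename and build_files_payload are hypothetical helpers of my own, not part of requests, and the filename and file contents are placeholders; the (filename, fileobj) tuple is the form the files parameter accepts.

```python
import io
import re


def sanitize_filename(filename):
    """Collapse newlines and other whitespace runs into single spaces."""
    return re.sub(r"\s+", " ", filename).strip()


def build_files_payload(field_name, filename, fileobj):
    """Build a dict suitable for the files= argument of requests.post."""
    return {field_name: (sanitize_filename(filename), fileobj)}


# Placeholder filename containing the kind of newline that caused trouble.
raw_name = "Spirit Airlines - cheap tickets,\n cheap flights.pdf"
payload = build_files_payload("file", raw_name, io.BytesIO(b"placeholder"))

print(repr(payload["file"][0]))

# With requests installed, the upload would then look like:
#   requests.post(uri, files=payload)   # uri is a placeholder
```

With the newline gone, the filename goes out as an ordinary quoted parameter and the file shows up in request.files.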

Lukasa commented 9 years ago

Ah, yes, that'll do it too.

The problem here is that the multipart/form-data spec doesn't ordinarily allow newlines in the form fields, so you need to do some awkward work to get around them. That involves the encoding we've used, which is sadly relatively poorly supported by many servers. =(
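As a rough illustration of that encoding, here is a sketch that approximates the behaviour rather than reproducing the library's code. format_disposition_param is a hypothetical name, and the exact set of characters that triggers the fallback is an assumption; email.utils.encode_rfc2231 is the standard-library function that produces the RFC 2231 form.

```python
import email.utils


def format_disposition_param(name, value):
    """Format one Content-Disposition parameter.

    Uses the plain quoted form when the value is safe, and falls back
    to the RFC 2231 extended form (name*=charset''percent-encoded)
    otherwise. Hypothetical helper for illustration only.
    """
    # Quotes, backslashes and line breaks cannot sit in a simple
    # quoted value (assumed trigger set).
    needs_encoding = any(ch in value for ch in '"\\\r\n')
    if not needs_encoding:
        try:
            value.encode("ascii")
        except UnicodeEncodeError:
            # Non-ASCII text also needs the extended form.
            needs_encoding = True
    if not needs_encoding:
        return '%s="%s"' % (name, value)
    return "%s*=%s" % (name, email.utils.encode_rfc2231(value, "utf-8"))


print(format_disposition_param("filename", "report.pdf"))
print(format_disposition_param("filename", "cheap tickets,\n cheap flights.pdf"))
```

The second call yields a starred parameter with a percent-encoded value, which is the form that servers without RFC 2231 support fail to recognise.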

bepetersn commented 9 years ago

Thanks for your help, Cory.