psf / requests

A simple, yet elegant, HTTP library.
https://requests.readthedocs.io/en/latest/
Apache License 2.0

Python 3.5 with requests 2.9.1 cannot upload a file with a Chinese filename #3046

Closed pc10201 closed 8 years ago

pc10201 commented 8 years ago

source code

`import requests

# Headers copied from the browser request
headers = {
    'Origin': 'http://home.ctfile.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36',
    'Accept': '*/*',
    'Referer': 'http://home.ctfile.com/',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-CN,zh;q=0.8',
}

# The file to upload has a non-ASCII (Chinese) name
files = {'file': open('中文.txt', 'rb')}
values = {'name': r'中文.txt', 'filesize': '101'}

url = 'xxxx'
r = requests.post(url, files=files, data=values, headers=headers)

print(r.content)`

I used Fiddler to capture the request that the browser sends.

The right data is: Content-Disposition: form-data; name="file"; filename="中文.txt"

The data sent by the Python script is: Content-Disposition: form-data; name="file"; filename*=中文.txt

So I changed

Python35\Lib\site-packages\requests\packages\urllib3\fields.py

lines 46-47 to

# value = email.utils.encode_rfc2231(value, 'utf-8')
value = '%s="%s"' % (name, value)

It works now.
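For reference, the difference between the two header forms can be reproduced with the standard library alone. This is a minimal sketch, not the code in `fields.py`: the helper names `rfc2231_param` and `raw_quoted_param` are hypothetical, and the ASCII shortcut in the first helper is an assumption about how plain names are handled.

```python
import email.utils


def rfc2231_param(name, value):
    """Hypothetical helper: RFC 2231 encoding for non-ASCII values."""
    try:
        value.encode('ascii')
    except UnicodeEncodeError:
        # Same stdlib call as the line that was commented out above.
        encoded = email.utils.encode_rfc2231(value, 'utf-8')
        return '%s*=%s' % (name, encoded)
    # Assumption: pure-ASCII values are simply quoted.
    return '%s="%s"' % (name, value)


def raw_quoted_param(name, value):
    """Hypothetical helper: what the patched line produces (raw value in quotes)."""
    return '%s="%s"' % (name, value)


print(rfc2231_param('filename', '中文.txt'))
# filename*=utf-8''%E4%B8%AD%E6%96%87.txt
print(raw_quoted_param('filename', '中文.txt'))
# filename="中文.txt"
```

The first form stays pure ASCII on the wire; the second puts the raw non-ASCII characters into the header and leaves the byte encoding to whatever serializes the request.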

Lukasa commented 8 years ago

The right data is: Content-Disposition: form-data; name="file"; filename="中文.txt"

That is not the right data. This is:

Content-Disposition: form-data; name="file"; filename*=utf-8''%E4%B8%AD%E6%96%87.txt

This uses RFC 2231 already. To achieve it, change your code to this:

files = {'file': open(u'中文.txt', 'rb')}
values = {'name': r'中文.txt', 'filesize': '101'}

url = 'xxxx'
r = requests.post(url, files=files, data=values, headers=headers)

pc10201 commented 8 years ago

Fiddler screenshots of the right data:

img1

img2

It does not look like filename*=utf-8''%E4%B8%AD%E6%96%87.txt

Lukasa commented 8 years ago

@pc10201 Did you change your code to match mine, including the unicode string for opening the file?

pc10201 commented 8 years ago

I will try tomorrow and let you know the result.

pc10201 commented 8 years ago

I tried your code. It does not work.

img3

img4

If you use Windows, you can use Fiddler to capture the HTTP data. If you use Linux or Mac OS, you can use Wireshark.

Lukasa commented 8 years ago

@pc10201 In what sense is that not working? I see the filename field formatted correctly.

pc10201 commented 8 years ago

The server response is not right. The right data:

img1

The bad data:

img3

Please compare the third line in each image.

Lukasa commented 8 years ago

Assuming for a moment that the different filenames are unrelated to this problem, the issue here is that the "right" case is not using RFC 2231 encoding. This is common, but wrong: the server needs to be able to handle the RFC 2231-encoded filename, which is what we're providing.
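To illustrate what handling the RFC 2231-encoded filename means on the receiving side, here is a minimal sketch of decoding such a parameter. It is not taken from any particular server: the helper names `decode_ext_value` and `filename_from_params` are hypothetical, and the sketch assumes the `Content-Disposition` parameters have already been parsed into a dict.

```python
from urllib.parse import unquote


def decode_ext_value(ext_value):
    """Decode an RFC 2231 extended value of the form charset'language'percent-encoded."""
    charset, _language, encoded = ext_value.split("'", 2)
    # Percent-decode the bytes and interpret them in the declared charset.
    return unquote(encoded, encoding=charset or 'ascii', errors='strict')


def filename_from_params(params):
    """Prefer the extended 'filename*' parameter, fall back to plain 'filename'."""
    if 'filename*' in params:
        return decode_ext_value(params['filename*'])
    return params.get('filename')


params = {'name': 'file', 'filename*': "utf-8''%E4%B8%AD%E6%96%87.txt"}
print(filename_from_params(params))
# 中文.txt
```

A server that only looks at the plain `filename` parameter would find nothing here, which would be consistent with the behaviour reported in this thread.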

pc10201 commented 8 years ago

You are right. The server may be causing this problem. Please close this issue.