Open jamshid opened 7 years ago
Here's a workaround that doesn't involve modifying the requests-aws4auth source code. Use the following wrapper class in place of the AWS4Auth class. It encodes the headers created by AWS4Auth into byte strings, thus avoiding the UnicodeDecodeError downstream.
    from requests_aws4auth import AWS4Auth


    class AWS4AuthEncodingFix(AWS4Auth):
        """AWS4Auth variant that emits byte-string headers (Python 2 only)."""

        def __call__(self, request):
            request = super(AWS4AuthEncodingFix, self).__call__(request)
            # Snapshot the names first: the helper below may replace keys.
            for header_name in list(request.headers):
                self._encode_header_to_utf8(request, header_name)
            return request

        def _encode_header_to_utf8(self, request, header_name):
            value = request.headers[header_name]
            if isinstance(value, unicode):
                value = value.encode('utf-8')
            if isinstance(header_name, unicode):
                # Drop the unicode key so it can be re-added as a byte string.
                del request.headers[header_name]
                header_name = header_name.encode('utf-8')
            request.headers[header_name] = value
I'm also seeing this bug with requests 2.18.4 (the latest as of today) and requests-aws4auth 0.9 on Python 2.7, when the body of the HTTP request isn't 7-bit-clean ASCII. It looks like requests doesn't expect header names to be Unicode, and at some point it ends up combining the Unicode headers with a UTF-8 encoded body, failing to decode the body with the default 'ascii' encoding.
Another fix would be to remove the from __future__ import unicode_literals declaration, but that's farther-reaching than just encoding the header keys and values.
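The failure described in this comment can be imitated without requests at all. The sketch below is only an illustration: the function name py2_style_concat, the request head, and the bodies are made up for the example. It spells out the ASCII decode that Python 2.7 performs implicitly when a unicode string is concatenated with a byte string, so it behaves the same on Python 3.

```python
def py2_style_concat(header_text, body_bytes):
    """Mimic Python 2's `unicode + str`: the byte string is decoded as ASCII first."""
    return header_text + body_bytes.decode("ascii")


# Hypothetical request head and bodies, for illustration only.
head = u"PUT /example HTTP/1.1\r\nHost: example.invalid\r\n\r\n"
ascii_body = u"hello".encode("utf-8")
non_ascii_body = u"h\u00e9llo".encode("utf-8")

# A 7-bit-clean body concatenates without complaint.
combined = py2_style_concat(head, ascii_body)

# A body containing bytes >= 0x80 cannot be decoded as ASCII.
try:
    py2_style_concat(head, non_ascii_body)
    failed = False
except UnicodeDecodeError:
    failed = True

print(failed)  # True
```

This is why the bug only shows up when the body isn't 7-bit-clean: the implicit decode succeeds silently for pure-ASCII bodies.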
Sending a non-ASCII request body using Python 2.7 fails when using requests-aws4auth. I thought it was a general requests bug at first (https://github.com/kennethreitz/requests/issues/3875), but it only happens with requests-aws4auth. I'm seeing this on Python 2.7.5 on CentOS 7.2 and macOS.

After some debugging, it seems to be triggered by string literals being forced to "unicode" in /usr/lib/python2.7/site-packages/requests_aws4auth/aws4auth.py.
FIX/WORKAROUND: comment out that line (the from __future__ import unicode_literals declaration).
The problem is requests doesn't seem to expect the HTTP request headers to contain unicode strings. Python 2.7 "unicode+str" weirdness causes request_headers + request_body to fail because request_body is already a binary(?) string.

Btw I don't think aws4auth should be doing an .encode('utf-8') -- it should already be "bytes", right? At least HTTPBasicAuth and S3Auth expect the client calling requests.put() to pass data already encoded to utf-8 bytes.

Finally, maybe this is still a bug in requests or python httplib.py? Should it allow unicode string headers, containing only ascii (or iso-8859-1?), and /usr/lib64/python2.7/httplib.py _send_output() should force msg to str before appending the request body?

Reproduction:
That should work, and it does when using requests.auth.HTTPBasicAuth or the S3 V2 signature package awsauth.S3Auth. But requests-aws4auth gets an exception:

The "!!!" lines are debugging output I added to /usr/lib64/python2.7/httplib.py _send_output().
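The idea raised earlier, that _send_output() could force msg to a byte string before appending the body, might look roughly like the sketch below. This is not httplib's actual code: build_message is a hypothetical stand-in, the header lines and body are invented, and the choice of iso-8859-1 for the header encoding is an assumption taken from the question in the post.

```python
def build_message(header_lines, message_body=b""):
    """Join header lines into one byte string, then append the body bytes."""
    text_lines = []
    for line in header_lines:
        # Normalise every header line to text so the join below is uniform.
        if isinstance(line, bytes):
            line = line.decode("iso-8859-1")
        text_lines.append(line)
    # The two empty entries produce the blank line that ends the header block.
    msg = u"\r\n".join(text_lines + [u"", u""])
    # Coerce to bytes before touching the body, so no implicit decoding occurs.
    msg = msg.encode("iso-8859-1")
    return msg + message_body


# Hypothetical header lines (text) and a UTF-8 encoded body (bytes).
lines = [u"PUT /example HTTP/1.1", u"Host: example.invalid"]
body = u"h\u00e9llo".encode("utf-8")
message = build_message(lines, body)
print(isinstance(message, bytes))  # True
```

Encoding the headers explicitly keeps the body bytes untouched, whereas the implicit route tries to decode the body as ASCII and fails on the first non-ASCII byte.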