Open napat1412 opened 3 years ago
If it helps, I resolved this issue by using the following extension. I can probably make a PR with these changes some time soon ... I tested this pretty thoroughly with all kinds of crazy query string characters, but the more eyes on it the better.
class FixedAWS4Auth(AWS4Auth):
"""The current implementation of AWS4Auth is badly broken.
When constructing the canonical query string, the first thing it
does is decode the entire URL. This is wrong right out of the gate.
Consider, for example, that someone needs to send a query parameter
that contains a value of '=' or '&'. The properly encoded query string
will come in to amz_cano_querystring as
a=%26%3D
However, as soon as it is decoded, we'll get a=&=, which will completely
mess up the parsing and give 'a' a blank value. The correct solution is to
use parse_qs, which properly parses and decodes the query string.
There are also some issues with how '+' is interpreted. Sigv4 stipulates
that spaces must always be encoded as %20, and that a '+' must be encoded
as %2B, so the canonical query string should never end up with '+'
character. The current implementation is not guaranteeing this, as
it is using urllib.parse.quote, which encodes spaces as '+' by default.
"""
@staticmethod
def amz_cano_querystring(qs):
"""
Parse and format querystring as per AWS4 auth requirements.
Perform percent quoting as needed.
qs -- querystring
"""
# If Python 2, switch to working entirely in str
# as urlencode() has problems with Unicode
if PY2:
qs = qs.encode('utf-8')
# URL is already encoded at this point, parse to get individual params.
# This will interpret '+' as a space, which is not configurable in Python 2.
# This is fine, since requests does the same thing.
# We must supply keep_blank_values as true here, or else they will be
# omitted entirely. Sigv4 expects blank values to appear in the canonical
# query string as 'var='.
params = parse_qs(qs, keep_blank_values=True)
params = sorted(list(params.items()), key=lambda item: item[0])
# Multi values are handled by doseq, but they need to be lex sorted within themselves.
# This is not explicit in the sigv4 doc, but determined via experimentation.
# doseq will preserve the order we set here.
params = [(k, list(sorted(v))) for k, v in params]
qs = urlencode(params, doseq=True)
# Sigv4 requires we never see '+' in the query string. Spaces must
# be %20, and an actual '+' must be encoded as '%2B'. urlencode can be
# configured to quote without using '+', but only in Python 3. This makes
# it a bit more explicit and loosens the dependency on Python version.
qs = qs.replace('+', '%20')
if PY2:
qs = unicode(qs)
return qs
feel free to PR it 👍 , it's okay to drop py2 support too.
I found problem about oder of canonicail querystring. It is not sorted correctly
Expected behavior
Querystring keys consist of format, key-type and key. After sorted, querystring keys should be format, key, key-type. (key must sorted before key-type). I think you need to rewrite the sort method.