okigan / awscurl

curl-like access to AWS resources with AWS Signature Version 4 request signing.
MIT License
737 stars 91 forks source link

Commas in querystring are not URL encoded #138

Closed varju closed 2 years ago

varju commented 2 years ago

I've been using awscurl to call API Gateway endpoints, and it has been working great for us up until now (thanks very much for making this!).

I recently started working on an endpoint that takes comma-delimited arrays, and found that the signature is not working properly in this case.

Example failure:

$ awscurl -v https://my-gateway-id.execute-api.us-east-1.amazonaws.com/live/my-path?arg=a,b,c
...
('\n'
 'CANONICAL REQUEST = GET\n'
 '/live/my-path\n'
 'arg=a,b,c\n'
 'host:my-gateway-id.execute-api.us-east-1.amazonaws.com\n'
 'x-amz-date:20211202T033321Z\n'
 'x-amz-security-token:IQoJb3JpZ2luX2VjEEsaCXVzLWVhc3QtMSJHMEUCIQC4m8wJYpB8Xch/XdJzdKHVbgT9refFSvzJ3CeRD9HYfQIgbWPPWfYZ6bYDupK/2SIbvkxruT8ssDn4Kl0qWcCxcF0qnAMIIxAAGgw3OTkyNDI4NjU0OTEiDALfa7r3/U7W1vFxMCr5AuxtcoYbB0iq58HexqEgxxe95yduKHkab03fDBNzwWl+ZG2ATBekfJLvi7Rhzr3U2uNckMXG32x3Nj/nirtwbqfecjqLj/7XfPhBNKgXzEuJ5bQ+oIUP3heXj5G8tJ/Yvwe0eRgghiLjarwzBzvMrA7u9o0jvIGTGHZqTMM0uUPkJrDWoGpWic73fvbe7GHG0j6Hcyy/agWmgLtG7NHh5GcAwdnZih8g9nzMOO5GL0wiu805NSo6xWc2BqnP/uyHQNRAxOc3Uxd+dYtqh6NosNvX3H5x8Gzj383hB3XDbsgZiRvvk+EMwlcZlsFMg+6Mg9abx516vpBgGaKsf5yoWhK4tnsvUnNsgF+2gwBjb93z6eKP7GdHyB3iirtetQcscwkFA6mt4cl1gz8FTNcfCN6BGPQhtUO+HXfpnOY8iapateIJnhjoibELSbcNYf9ql55Xm7bvOknGAJ6HCZvp3mpM/x/X7N6e+U0sdWtzVVFNt82cY/c3BknZML7UoI0GOqYB48W1RlChDe3KjQsHrKtTdgXBGdxU1IYbHdVSwzg9t3ZewwHEpVjfF77LQ0oSRD4ROuIC9cbAXXmr/QCPqpJpFaLtqFNlVCyLpAEygCLsXN+mxGQFj6laGmZR8nEVZrsnzZy1zhOEMOfhPZWQrULbaZeJ/ZtVWto3gnQ/Rl60H52jwf8ccU2EUijMLjlRgOH8MtAnS9WCbTKur07cxK4PXLVG21DTBw==\n'
 '\n'
 'host;x-amz-date;x-amz-security-token\n'
 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855')
...
{"message":"The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.\n\nThe Canonical String for this request should have been\n'GET\n/live/my-path\narg=a%2Cb%2Cc\nhost:my-gateway-id.execute-api.us-east-1.amazonaws.com\nx-amz-date:20211202T033321Z\nx-amz-security-token:IQoJb3JpZ2luX2VjEEsaCXVzLWVhc3QtMSJHMEUCIQC4m8wJYpB8Xch/XdJzdKHVbgT9refFSvzJ3CeRD9HYfQIgbWPPWfYZ6bYDupK/2SIbvkxruT8ssDn4Kl0qWcCxcF0qnAMIIxAAGgw3OTkyNDI4NjU0OTEiDALfa7r3/U7W1vFxMCr5AuxtcoYbB0iq58HexqEgxxe95yduKHkab03fDBNzwWl+ZG2ATBekfJLvi7Rhzr3U2uNckMXG32x3Nj/nirtwbqfecjqLj/7XfPhBNKgXzEuJ5bQ+oIUP3heXj5G8tJ/Yvwe0eRgghiLjarwzBzvMrA7u9o0jvIGTGHZqTMM0uUPkJrDWoGpWic73fvbe7GHG0j6Hcyy/agWmgLtG7NHh5GcAwdnZih8g9nzMOO5GL0wiu805NSo6xWc2BqnP/uyHQNRAxOc3Uxd+dYtqh6NosNvX3H5x8Gzj383hB3XDbsgZiRvvk+EMwlcZlsFMg+6Mg9abx516vpBgGaKsf5yoWhK4tnsvUnNsgF+2gwBjb93z6eKP7GdHyB3iirtetQcscwkFA6mt4cl1gz8FTNcfCN6BGPQhtUO+HXfpnOY8iapateIJnhjoibELSbcNYf9ql55Xm7bvOknGAJ6HCZvp3mpM/x/X7N6e+U0sdWtzVVFNt82cY/c3BknZML7UoI0GOqYB48W1RlChDe3KjQsHrKtTdgXBGdxU1IYbHdVSwzg9t3ZewwHEpVjfF77LQ0oSRD4ROuIC9cbAXXmr/QCPqpJpFaLtqFNlVCyLpAEygCLsXN+mxGQFj6laGmZR8nEVZrsnzZy1zhOEMOfhPZWQrULbaZeJ/ZtVWto3gnQ/Rl60H52jwf8ccU2EUijMLjlRgOH8MtAnS9WCbTKur07cxK4PXLVG21DTBw==\n\nhost;x-amz-date;x-amz-security-token\ne3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'\n\nThe String-to-Sign should have been\n'AWS4-HMAC-SHA256\n20211202T033321Z\n20211202/us-east-1/execute-api/aws4_request\n38b3266af08b8f9f119ffed742e0063da244001dbdf2331c42f4f1af695e3e64'\n"}

Reviewing sigv4-create-canonical-request.html, it says:

Percent-encode all other characters with %XY, where X and Y are hexadecimal characters (0-9 and uppercase A-F). For example, the space character must be encoded as %20 (not using '+', as some encoding schemes do) and extended UTF-8 characters must be in the form %XY%ZA%BC.

In my case, I believe that means it should be sending arg=a%2Cb%2Cc instead of arg=a,b,c

Testing locally, the following patch appears to fix things for me, but I'm not sure how much fallout this might cause for other use cases:

diff --git a/awscurl/awscurl.py b/awscurl/awscurl.py
index cf09318..66a14c4 100755
--- a/awscurl/awscurl.py
+++ b/awscurl/awscurl.py
@@ -16,6 +16,7 @@ import configparser
 import configargparse
 import requests
 from requests.structures import CaseInsensitiveDict
+from urllib.parse import urlencode

 from .utils import sha256_hash, sha256_hash_for_binary_data, sign
@@ -315,9 +316,9 @@ def __normalize_query_string(query):
                        for s in query.split('&')
                        if len(s) > 0)

-    normalized = '&'.join('%s=%s' % (p[0], p[1] if len(p) > 1 else '')
-                          for p in sorted(parameter_pairs))
-    return normalized
+    normalized_pairs = [(p[0], p[1] if len(p) > 1 else '')
+                        for p in [p for p in sorted(parameter_pairs)]]
+    return urlencode(normalized_pairs)

 def __now():
okigan commented 2 years ago

@varju oh! that sounds like a good addition/correction. Could you please make it into PR with unit test -- I can take it from there. Thanks!