okigan / awscurl

curl-like access to AWS resources with AWS Signature Version 4 request signing.
MIT License
737 stars 91 forks source link

URL encode non-alpha querystring parameters #139

Closed varju closed 2 years ago

varju commented 2 years ago

Fixes #138

okigan commented 2 years ago

@varju seems github added manual approval before commit checks are run for new commuters -- I've approved that for you.

Thanks for PR -- have a look at 3.6 build above (and I think mac builds will run once that is resolved).

varju commented 2 years ago

Thanks for PR -- have a look at 3.6 build above

Darn. I included examples in that test of all the character classes that AWS says should not be URL encoded. It looks like Python 3.6's URL encoder is encoding the ~ character.

Given that 3.6 hits end of life at the end of this month, is there any chance you'd consider dropping support for it? The alternative would be hand-rolling a URL encoder similar to what the python-aws-sig project did, and that feels a bit heavy.

okigan commented 2 years ago

Hmm, interesting - I am not opposed to that (weird python lib is making this kinds of changes).

Let me relook at currently tested versions, then this can rebased onto that.

Thanks!

On Dec 4, 2021, at 10:44 AM, Alex Varju @.***> wrote:

 Thanks for PR -- have a look at 3.6 build above

Darn. I included examples in that test of all the character classes that AWS says should not be URL encoded. It looks like Python 3.6's URL encoder is encoding the ~ character.

Given that 3.6 hits end of life at the end of this month, is there any chance you'd consider dropping support for it? The alternative would be hand-rolling a URL encoder similar to what the python-aws-sig project did, and that feels a bit heavy.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android.

varju commented 2 years ago

Just reading some docs some more, I think the encoding change happened in 3.7. The urllib.parse.quote docs say:

Changed in version 3.7: Moved from RFC 2396 to RFC 3986 for quoting URL strings. “~” is now included in the set of unreserved characters.

That said, I think we can implement AWS-compatible encoding in a way that doesn't require quite as much code as that link I shared earlier, and with support for Python 3.6.

What do you think of the latest push?

okigan commented 2 years ago

All checks passed - nice!, I’ll re-review when I get home later today.

okigan commented 2 years ago

looks good to me

okigan commented 2 years ago

@varju please confirm all is resolved. Thanks

varju commented 2 years ago

Yes, everything is working great now! Thanks for the quick response.

kibiz0r commented 2 years ago

What if the caller has already percent-encoded the query string?

(Spoiler: Our scripts are doing just that.)

In our case, we can simply stop doing the percent-encoding on our end, which is actually a win for us. But just FYI, this might introduce a headache for people that don't have that option. Browsers seem to only percent-encode pasted URLs "if necessary". Might be good to take a similar approach here.