awslabs / aws-sigv4-proxy

This project signs and proxies HTTP requests with Sigv4
Apache License 2.0
362 stars 103 forks source link

Problematic signature for S3 URLs requiring escaping #11

Open ethanrowe opened 5 years ago

ethanrowe commented 5 years ago

Uncovered while testing key existence checks from various clients (an old jets3t in an old hadoop distro, an old python boto release, a recent ruby AWS SDK release, various direct HEAD requests from curl): it appears that the proxy produces a problematic signature in cases where an S3 key contains keys requiring URL escaping.

Our initial exposure this came from hadoop's old S3 native filesystem abstraction, which ultimately looks for keys with the suffix _$folder$. The key existence check was getting a permission denied response, rather than a found/not-found response, despite being a resource to which the proxy's credentials clearly granted access.

To illustrate:

If we issue a HEAD request (as all the clients I mentioned above do in this scenario) without escaping the key, we should get the suitable found/not-found HTTP result (assuming your perms correct):

curl -s -H 'Host: awesome-bucket.s3.amazonaws.com' 'http://127.0.0.1:8080/foo_$folder$'

Supposing that foo_$folder$ doesn't exist in awesome-bucket, you ought to see something like this:

HTTP/1.1 404 Not Found
Content-Type: application/xml
Date: Fri, 17 May 2019 14:41:35 GMT
Server: AmazonS3
...

Now change the curl request to use the URI-escaped path instead (as the various clients mentioned above all do).

curl -s -H 'Host: awesome-bucket.s3.amazonaws.com' 'http://127.0.0.1:8080/foo_%24folder%24'

Now you'll get permission denied, even though this is actually the proper form of the request (at least according to my understanding).

HTTP/1.1 403 Forbidden
Content-Type: application/xml
Date: Fri, 17 May 2019 14:17:20 GMT
Server: AmazonS3
...

This makes me suspect that the signer is:

It's very strange. If you play around with different escape-warranting characters in the path, you'll find multiple scenarios in which you get a 403 Forbidden response, which seems indicative of a consistent signature problem.

I'll continue digging into this but am somewhat flummoxed. I note further that the $ character seems to have slightly different behavior than others; if you use a key foo_$ it'll be okay, but the escaped form foo_%24 isn't. But try a different character, like foo_<, and neither the escaped nor unescaped form works. To be clear, I also tested these without the proxy, using the latest AWS golang SDK release, and there were no signature issues; the problem appears to originate within the proxy itself.

tantona commented 5 years ago

Thank you for digging into this! the using the latest AWS golang SDK release, and there were no signature issues is curious - I'll take a peek at this this afternoon. Have you started poking at it in terms of a PR?

ethanrowe commented 5 years ago

I should note that as an experiment, I tweaked the proxy to blank out the request's RawPath field when handling s3 service requests (prior to signing). This was based on seeing this comment in the Go SDK.

Doing this actually fixed the behavior for the initial foo_$folder$ case described, but it actually remained broken in a variety of other cases so I abandoned the PR I was preparing in favor of, you know, actually understanding the issue. :)

ethanrowe commented 5 years ago

One thing I noticed along the way is the proxy is tied to SDK release 1.15.27; the SDK master branch is currently in the 1.19.x space. I'm going to play with the SDK signing behavior again using the 1.15.27 release to see if there's an issue.

ethanrowe commented 5 years ago

One thing I noticed along the way is the proxy is tied to SDK release 1.15.27; the SDK master branch is currently in the 1.19.x space. I'm going to play with the SDK signing behavior again using the 1.15.27 release to see if there's an issue.

I checked things out for the 1.15.27 release and found that the SDK signing behavior for S3 HEAD requests seemed fine; I got found/not-found responses instead of 403's. So it does seem like something's amiss internal to the proxy itself.

ethanrowe commented 5 years ago

Upon further investigation into the v4.Signer:

Having a separate instance of the signer, specifically for S3, is a somewhat invasive change in how the proxy is organized. I don't mind taking a crack at it, but would certainly welcome input beforehand.

This nasty hack seems to fix the problems I described earlier.

diff --git a/handler/proxy_client.go b/handler/proxy_client.go
index 4a93838..3937406 100644
--- a/handler/proxy_client.go
+++ b/handler/proxy_client.go
@@ -44,7 +44,10 @@ func (p *ProxyClient) sign(req *http.Request, service *endpoints.ResolvedEndpoin
                _, err = p.Signer.Sign(req, body, service.SigningName, service.SigningRegion, time.Now())
                break
        case "s3":
-               _, err = p.Signer.Presign(req, body, service.SigningName, service.SigningRegion, time.Duration(time.Hour), time.Now())
+               // See: https://github.com/aws/aws-sdk-go/blob/v1.19.31/aws/signer/v4/v4.go#L463-L466
+               // The URI path portion shouldn't be subject to escaping for S3 signatures.
+               _, err = v4.NewSigner(p.Signer.Credentials, func(s *v4.Signer) {s.DisableURIPathEscaping = true}).
+                       Presign(req, body, service.SigningName, service.SigningRegion, time.Duration(time.Hour), time.Now())
                break
        default:
                err = fmt.Errorf("unable to sign with specified signing method %s for service %s", service.SigningMethod, service.SigningName)