Closed di closed 10 months ago
I can't seem to reproduce this.
Simple server setting content-type header on location/redirect response:
from http.server import HTTPServer, BaseHTTPRequestHandler


class Handler(BaseHTTPRequestHandler):
    protocol_version = 'HTTP/1.1'

    def do_GET(self):
        self.send_response(301)
        self.send_header('content-type', 'application/json')
        self.send_header('content-length', '0')
        self.send_header('location', 'https://pypi.org/static/images/logo-small.2a411bc6.svg')
        self.end_headers()


httpd = HTTPServer(('localhost', 8000), Handler)
httpd.serve_forever()
~% curl -v http://127.0.0.1:8000/image.svg
* Trying 127.0.0.1:8000...
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
> GET /image.svg HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/8.1.2
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: BaseHTTP/0.6 Python/3.11.5
< Date: Thu, 31 Aug 2023 00:13:28 GMT
< content-type: application/json
< content-length: 0
< location: https://pypi.org/static/images/logo-small.2a411bc6.svg
<
* Connection #0 to host 127.0.0.1 left intact
go-camo% ./build/bin/url-tool -k test encode -b base64 -p http://127.0.0.1:8080 http://127.0.0.1:8000/image.svg
http://127.0.0.1:8080/ZjI9U0gzk_7pETckVGbk3ttQZQA/aHR0cDovLzEyNy4wLjAuMTo4MDAwL2ltYWdlLnN2Zw
go-camo% ./build/bin/go-camo -v -k test --listen 127.0.0.1:8080
~% curl -sv -o /dev/null http://127.0.0.1:8080/ZjI9U0gzk_7pETckVGbk3ttQZQA/aHR0cDovLzEyNy4wLjAuMTo4MDAwL2ltYWdlLnN2Zw
* Trying 127.0.0.1:8080...
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> GET /ZjI9U0gzk_7pETckVGbk3ttQZQA/aHR0cDovLzEyNy4wLjAuMTo4MDAwL2ltYWdlLnN2Zw HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/8.1.2
> Accept: */*
>
< HTTP/1.1 200 OK
< Accept-Ranges: bytes
< Cache-Control: max-age=315360000, public, immutable
< Content-Length: 54786
< Content-Security-Policy: default-src 'none'; img-src data:; style-src 'unsafe-inline'
< Content-Type: image/svg+xml
< Date: Thu, 31 Aug 2023 00:12:25 GMT
< Etag: "64b6a30e-d602"
< Last-Modified: Tue, 18 Jul 2023 14:34:54 GMT
< Server: go-camo
< X-Content-Type-Options: nosniff
< X-Xss-Protection: 1; mode=block
<
{ [1378 bytes data]
* Connection #0 to host 127.0.0.1 left intact
go-camo% ./build/bin/go-camo -v -k test --listen 127.0.0.1:8080
time="2023-08-30T17:12:21.746636000-07:00" level="D" msg="debug logging enabled"
time="2023-08-30T17:12:21.746764000-07:00" level="I" msg="Starting HTTP server on: tcp:127.0.0.1:8080"
time="2023-08-30T17:12:26.004294000-07:00" level="D" msg="client request" content_length="0" header="map[Accept:[*/*] User-Agent:[curl/8.1.2]]" host="127.0.0.1:8080" method="GET" path="/ZjI9U0gzk_7pETckVGbk3ttQZQA/aHR0cDovLzEyNy4wLjAuMTo4MDAwL2ltYWdlLnN2Zw" proto="HTTP/1.1" remote_addr="127.0.0.1:57558" transfer_encoding="[]"
time="2023-08-30T17:12:26.004365000-07:00" level="D" msg="signed client url" url="http://127.0.0.1:8000/image.svg"
time="2023-08-30T17:12:26.004391000-07:00" level="D" msg="built outgoing request" content_length="0" header="map[Accept:[image/*] User-Agent:[go-camo] Via:[go-camo]]" host="127.0.0.1:8000" method="GET" path="/image.svg" proto="HTTP/1.1" remote_addr="" transfer_encoding="[]"
time="2023-08-30T17:12:26.038259000-07:00" level="D" msg="response from upstream" content_length="54786" header="map[Accept-Ranges:[bytes] Access-Control-Allow-Origin:[*] Cache-Control:[max-age=315360000, public, immutable] Connection:[keep-alive] Content-Length:[54786] Content-Type:[image/svg+xml] Date:[Thu, 31 Aug 2023 00:12:26 GMT] Etag:[\"64b6a30e-d602\"] Last-Modified:[Tue, 18 Jul 2023 14:34:54 GMT] Strict-Transport-Security:[max-age=31536000; includeSubDomains; preload] Vary:[Accept-Encoding] X-Cache:[HIT, HIT] X-Cache-Hits:[293, 1] X-Content-Type-Options:[nosniff] X-Frame-Options:[deny] X-Permitted-Cross-Domain-Policies:[none] X-Served-By:[cache-iad-kiad7000117-IAD, cache-pdx12329-PDX] X-Timer:[S1693440746.082145,VS0,VE2] X-Xss-Protection:[1; mode=block]]" proto="HTTP/1.1" status="200" transfer_encoding="[]"
time="2023-08-30T17:12:26.041701000-07:00" level="D" msg="response to client" headers="map[Accept-Ranges:[bytes] Cache-Control:[max-age=315360000, public, immutable] Content-Length:[54786] Content-Security-Policy:[default-src 'none'; img-src data:; style-src 'unsafe-inline'] Content-Type:[image/svg+xml] Date:[Thu, 31 Aug 2023 00:12:25 GMT] Etag:[\"64b6a30e-d602\"] Last-Modified:[Tue, 18 Jul 2023 14:34:54 GMT] Server:[go-camo] X-Content-Type-Options:[nosniff] X-Xss-Protection:[1; mode=block]]" status="200"
Without seeing logs or being able to reproduce this with the simple setup above, I can offer a few hypotheticals:
Was my attempt at reproduction about what you had envisioned as repro steps?
Do you have any further information/logs/etc on the issue?
Perhaps the server in question was redirecting more than 3 times. MaxRedirects is tunable via a go-camo CLI flag, but the default redirect limit is 3.
I don't think this is it. The original URL in question was https://api.securityscorecards.dev/projects/github.com/di/id/badge, which only issues a single redirect to a URL that responds with a 200:
The response from the proxy was:
$ curl -v https://pypi-camo.global.ssl.fastly.net/ac31ea219643944969bd06dca6dc02a6b4d6dc06/68747470733a2f2f6170692e736563757269747973636f726563617264732e6465762f70726f6a656374732f6769746875622e636f6d2f64692f69642f6261646765
...
< HTTP/1.1 404 Not Found
< Connection: keep-alive
< Content-Length: 10
< Content-Type: text/plain; charset=utf-8
< Content-Security-Policy: default-src 'none'; img-src data:; style-src 'unsafe-inline'
< X-Content-Type-Options: nosniff
< X-Xss-Protection: 1; mode=block
< Accept-Ranges: bytes
< Date: Wed, 30 Aug 2023 14:13:47 GMT
< Via: 1.1 varnish
< Age: 0
< X-Served-By: cache-fty21335-FTY
< X-Cache: MISS
< X-Cache-Hits: 0
< X-Timer: S1693404827.906100,VS0,VE137
< Strict-Transport-Security: max-age=300
<
Not Found
(note that this now works as expected)
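For anyone following along, here is a rough sketch of how the target URL can be read back out of the two proxy URLs in this thread. It assumes the path layout is `/<digest>/<encoded URL>`, with the URL either hex-encoded (as on the PyPI instance) or base64url-encoded without padding (as in the `url-tool -b base64` output above). `decode_camo_url` is a made-up helper name, and the digest is not verified here.

```python
import base64
from urllib.parse import urlsplit


def decode_camo_url(proxy_url: str) -> str:
    """Return the target URL embedded in a camo-style proxy URL."""
    # Assumed path layout: /<digest>/<encoded target URL>
    _digest, encoded = urlsplit(proxy_url).path.strip("/").split("/", 1)
    try:
        # Hex form: every pair of characters is one byte of the URL.
        return bytes.fromhex(encoded).decode("utf-8")
    except ValueError:
        # base64url form: restore the stripped "=" padding before decoding.
        padded = encoded + "=" * (-len(encoded) % 4)
        return base64.urlsafe_b64decode(padded).decode("utf-8")


if __name__ == "__main__":
    print(decode_camo_url(
        "https://pypi-camo.global.ssl.fastly.net/"
        "ac31ea219643944969bd06dca6dc02a6b4d6dc06/"
        "68747470733a2f2f6170692e736563757269747973636f726563617264732e646576"
        "2f70726f6a656374732f6769746875622e636f6d2f64692f69642f6261646765"
    ))
    # https://api.securityscorecards.dev/projects/github.com/di/id/badge
```

The hex branch is tried first because a base64url string that happens to be valid hex is unlikely for real URLs, but that ordering is a judgment call in this sketch.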
The response from the original URL was:
$ curl -v https://api.securityscorecards.dev/projects/github.com/di/id/badge
...
< HTTP/2 302
< content-type: application/json
< location: https://img.shields.io/ossf-scorecard/github.com/di/id?label=openssf scorecard&style=flat
< vary: Origin
< x-cloud-trace-context: 7afcef1c21068ac089b2880adcbdeb5a
< alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
< x-envoy-decorator-operation: ingress GetBadge
< date: Wed, 30 Aug 2023 14:08:36 GMT
< server: Google Frontend
< content-length: 0
(note that the Content-Type here has since changed)
Perhaps there was something else wrong with the response, such as being an http/1.1 response without a content-length, and the Go http library may have been refusing to process it for some reason. (logs would hopefully be informative here)
Looks like both responses here had content-length, so maybe we can rule this out.
If go-camo is configured to use an outgoing proxy (eg. smokescreen, squid), perhaps that proxy was rejecting the redirect response for some reason.
Nope, not configured to use an outgoing proxy.
Some other heretofore unknown bug in go-camo doing something unexpected.
Since changing the Content-Type of the redirect has resolved the issue, I definitely think it's related to this. My read of https://github.com/cactus/go-camo/blob/4d65728288768aeaf34577a9bbe18072aa910af0/pkg/camo/proxy.go#L483 is that the Content-Type would be evaluated against acceptTypes every time a redirect is followed, but maybe I'm mis-reading that.
My guess is that maybe something changed here between our fork and what you're testing against, although I don't see anything obvious that would be affecting this.
Was my attempt at reproduction about what you had envisioned as repro steps?
Yes, I think it's accurate, aside from the original response being a 302 and not a 301 (although it doesn't seem to matter)
Do you have any further information/logs/etc on the issue?
Unfortunately this instance receives a lot of traffic and I'm unable to extract logs specifically for this edge case; hopefully the above will suffice.
Nothing really jumps out at me in the diff between your fork and here either.
As far as code flow goes:
This is the function that validates redirects: https://github.com/cactus/go-camo/blob/4d65728288768aeaf34577a9bbe18072aa910af0/pkg/camo/proxy.go#L592-L608
And the above function really just checks for redirect depth, and does some url checks (avoiding things like redirects as SSR vectors), calling this function for the url checks: https://github.com/cactus/go-camo/blob/master/pkg/camo/proxy.go#L391-L439
The net Dialer is involved a bit as well when following redirects (connecting to new hostnames), ensuring that hostnames/dns don't resolve into SSR vectors either: https://github.com/cactus/go-camo/blob/4d65728288768aeaf34577a9bbe18072aa910af0/pkg/camo/proxy.go#L499-L526
None of that has anything to do with content-type checking though. The Content-Type checking happens here (https://github.com/cactus/go-camo/blob/master/pkg/camo/proxy.go#L259-L295) as part of 20x level responses. 30x level responses should be auto-followed unless one of the aforementioned checks (redirect depth, url/SSR, hostname/SSR) fails. If it does fail, it ends up here https://github.com/cactus/go-camo/blob/master/pkg/camo/proxy.go#L300-L304
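To make the two readings of the code concrete, here is a toy model in Python (not go-camo's actual code). It contrasts the flow described above, where Content-Type is only checked on the final 20x response, with the hypothesis that Content-Type is checked on every hop. Everything in it is an assumption for illustration: the `Response` type, the `proxy_status` function, the `check_each_hop` switch, the example hostnames, and the choice to collapse every failure to 404. The URL check is reduced to a scheme test and the dialer checks are left out.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit

MAX_REDIRECTS = 3  # the default limit mentioned earlier in the thread


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)


def proxy_status(url, upstream, check_each_hop=False, max_redirects=MAX_REDIRECTS):
    """Return the status a simplified image proxy would send to its client.

    `upstream` maps URL -> Response and stands in for real HTTP fetches.
    """
    redirects = 0
    while True:
        resp = upstream.get(url)
        if resp is None:
            return 404
        is_image = resp.headers.get("content-type", "").startswith("image/")
        if check_each_hop and not is_image:
            # Hypothesised behaviour: a non-image redirect is rejected outright.
            return 404
        if 300 <= resp.status < 400:
            redirects += 1
            target = resp.headers.get("location", "")
            # Redirect validation: depth limit plus a (much reduced) URL check.
            if redirects > max_redirects:
                return 404
            if urlsplit(target).scheme not in ("http", "https"):
                return 404
            url = target
            continue
        # Content-Type only matters once a final 20x response is reached.
        if 200 <= resp.status < 300 and is_image:
            return 200
        return 404


if __name__ == "__main__":
    upstream = {
        "https://a.example/badge": Response(
            302,
            {"content-type": "application/json",
             "location": "https://b.example/badge.svg"},
        ),
        "https://b.example/badge.svg": Response(200, {"content-type": "image/svg+xml"}),
    }
    print(proxy_status("https://a.example/badge", upstream))                       # 200
    print(proxy_status("https://a.example/badge", upstream, check_each_hop=True))  # 404
```

With `check_each_hop=True` the model produces the 404 from the report; with the default it follows the redirect and returns 200, which is what the reproduction attempts in this thread observed.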
Just to see if there was some strange issue with http2 and Go itself with redirects and headers in http2 responses, I set up another test server with an http/2 endpoint, returning the same location target url as you noted above:
< HTTP/2 302
< server: nginx
< date: Thu, 31 Aug 2023 03:18:14 GMT
< content-type: application/json
< content-length: 0
< location: https://img.shields.io/ossf-scorecard/github.com/di/id?label=openssf scorecard&style=flat
< strict-transport-security: max-age=31536000
< x-content-type-options: nosniff
< content-security-policy: default-src 'self';style-src 'self' 'unsafe-inline';img-src 'self' data:;object-src 'none';frame-ancestors 'self';upgrade-insecure-requests;base-uri 'self'; form-action 'none';
< x-xss-protection: 1; mode=block
< referrer-policy: same-origin
< cache-control: public
< x-frame-options: SAMEORIGIN
< permissions-policy: interest-cohort=()
No issues. go-camo processed it just fine in my single-request attempt at reproduction.
What version of Go are you using?
go-camo -V should output what was used to build it. Something like this:
./build/bin/go-camo -V
go-camo 2.4.4 (go1.21.0,gc-arm64)
Thanks for the detailed analysis. I'm going to go ahead and close this and chalk it up to something different on our fork. Since the original service was able to update the content type, this hasn't reoccurred and I don't have as much of a need to dig into why it was happening.
Thanks again!
@di Sounds good, appreciate the follow up. 👍
Specifications
Please list the go-camo version, as well as the Operating System (and version) that go-camo is running on. The go-camo version can be found by go-camo -V.
Version: v2.4.3 (we are actually running a fork, though: https://github.com/pypi/camo)
Platform: Linux
Expected Behavior
go-camo does not consider the Content-Type of redirects as it follows the redirects, only the Content-Type of the final response. Since it's a bit ambiguous what a valid Content-Type for a redirect is, go-camo should not error out based on the Content-Type of a redirect response.
Actual Behavior
The application returns a 404 Not Found as soon as it encounters a redirect response with a non-image Content-Type.
Steps to reproduce
I don't have an example online to try this against anymore (because the image hosting service which produced this behavior has since been updated to return a different Content-Type) but this could be reproduced with a simple HTTP service that responds with a 30x redirect and sets the Content-Type header to something like application/json.
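A self-contained version of such a service might look like the sketch below. It uses a 302 to match the original response, and redirects to a local `/final.svg` path so it needs no outside network. The paths, the tiny SVG body, and the `fetch`/`run_demo` helpers are made up for illustration. The client side only shows what the redirect looks like on the wire; it does not exercise go-camo itself.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

SVG = b'<svg xmlns="http://www.w3.org/2000/svg"/>'


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        if self.path == "/badge":
            # Redirect carrying a non-image Content-Type, as in the report.
            self.send_response(302)
            self.send_header("content-type", "application/json")
            self.send_header("content-length", "0")
            self.send_header("location", "/final.svg")
            self.end_headers()
        else:
            self.send_response(200)
            self.send_header("content-type", "image/svg+xml")
            self.send_header("content-length", str(len(SVG)))
            self.end_headers()
            self.wfile.write(SVG)

    def log_message(self, *args):
        pass  # keep the demo quiet


class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, *args, **kwargs):
        return None  # surface the 30x response instead of following it


def fetch(url, follow):
    """Return (status, content-type) for url, optionally following redirects."""
    opener = (urllib.request.build_opener() if follow
              else urllib.request.build_opener(NoRedirect))
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.status, resp.headers.get("content-type")
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("content-type")


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    try:
        return fetch(base + "/badge", follow=False), fetch(base + "/badge", follow=True)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

The first tuple is the redirect hop itself (302 with `application/json`), the second is what a client sees after following it (200 with `image/svg+xml`). To test a proxy, bind the handler to a fixed port and request the `/badge` URL through it.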