Open olivierdalang opened 1 year ago
Hmm, good question. I think the rewrite module decides whether it should rewrite the query based on the original string being input, before doing placeholder replacement, by checking if it contains a ?
.
This is definitely tricky. There are security considerations here, because not all placeholders contain trusted values, and in many configs there are implicit rewrite * {path}
that happen (e.g. with try_files
and via php_fastcgi
) so we need to be very careful here about not allowing arbitrary clients to inject a ?
to affect the behaviour of the rewrite.
Hm, if the rewrite destination has a ?
it should treat what comes after as a query string. Maybe it should work as long as the placeholder doesn't span different parts of the URI (e.g. path and also QS).
As a workaround, would it be possible for your app to send the path and query as different headers for now? Then it should be easy to merge them together with a rewrite with Caddy.
Thanks for your answers.
@francislavoie Unfortunately, even treating path and query with different header does not solve it, as the query part still gets URL-encoded(rewrite * /hello?{rp.header.X-Accel-Redirect}
-> http://localhost/hello?some%3Dparam%26some_other%3Dparam
)
I agree automatic encoding makes sense for untrusted variables. Some ideas about how this could be implemented:
trust_var {rp.header.X-Accel-Redirect} trusted_redirect_string
, then rewrite * trusted_redirect_string
)trusted
argument to rewrite
allowing to specify that the parameters are to be trustedStill very much interested about any ideas for workaround if you have any (at this point, the only thing I found is url-decoding the string manually with
uri replace %3D =
uri replace %26 &
uri replace %2C ,
after my rewrite block, but that's super dirty and will break as soon as there are new special characters in my query strings.
l'm trying to do something similar. I have a Python app, which stores files in a S3-like service. I want to have permanent links to those files, but the S3 service uses transient URLs like: https://cloud.canary.kaikosystems.com/media/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=REDACTED%2Ffr-par%2Fs3%2Faws4_request&X-Amz-Date=20230119T143400Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=redacted
So I have configured Caddy with
(s3_media) {
# {args.0}: is the name of the Environment. Ex: production, demo, staging
handle_path /media/* {
# See https://github.com/caddyserver/caddy/issues/5208
uri replace %3D =
uri replace %26 &
uri replace %2C ,
uri replace %2D -
uri replace %2F /
header X-S3-URI {uri}
reverse_proxy https://kaiko-{args.0}.s3.fr-par.scw.cloud {
header_up Host kaiko-{args.0}.s3.fr-par.scw.cloud
}
}
}
(dashboard_permalinks) {
# {args.0}: is the name of the Environment. Ex: production, demo, staging
encode zstd gzip
# this settings is something related to the DashOnBoard
# not strictly neccesary
header +Vary Origin
log {
format console
level INFO
}
import robots
@dyn not path /static/* /media/* /.well-known/*
reverse_proxy @dyn {args.0}_web:8000 {
header_up X-Real-IP {remote}
@accel header X-Accel-Redirect *
handle_response @accel {
rewrite * {rp.header.X-Accel-Redirect}
header X-Original-Accel-Redirect {rp.header.X-Accel-Redirect}
import s3_media {args.0}
}
}
handle_path /static/* {
reverse_proxy {args.0}_statics:80
}
import s3_media {args.0}
}
cloud.canary.kaikosystems.com {
import dashboard_permalinks canary
}
The Python returns a response with both a Location
and the X-Accel-Redirect
. The permalink URL has the following path pattern /attachment/<uuid>
.
When I request one of those URLs, I'm getting a 403 with the following headers:
< x-original-accel-redirect: /media/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=REDACTED2F20230119%2Ffr-par%2Fs3%2Faws4_request&X-Amz-Date=20230119T143855Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=REDACTED
< x-s3-uri: /attachment/01835a39-af47-7a7e-a7d5-6d7063fb7185
I notice that header X-S3-URI {uri}
doesn't pick up the rewrite
. Is it because header
directives are processed before any rewriting?
Directives are sorted according to this order: https://caddyserver.com/docs/caddyfile/directives#directive-order. Notice that uri
is after header
. You can either set the defer
option of the header directive so that the header operation is done on the way out of the middleware chain, or wrap it in a route
to override the order.
@francislavoie Thanks for your response.
Unfortunately, using defer
like this:
(s3_media) {
# {args.0}: is the name of the Environment. Ex: production, demo, staging
handle_path /media/* {
# See https://github.com/caddyserver/caddy/issues/5208
uri replace %3D =
uri replace %26 &
uri replace %2C ,
uri replace %2D -
uri replace %2F /
header X-S3-URI {uri} {defer}
reverse_proxy https://kaiko-{args.0}.s3.fr-par.scw.cloud {
header_up Host kaiko-{args.0}.s3.fr-par.scw.cloud
}
}
}
has the effect to remove the header entirely from the response.
@mvaled That:
header X-S3-URI {uri} {defer}
replaces all instances of the {uri}
with the contents of {defer}
, which would be nothing unless you've defined a {defer}
placeholder somewhere.
You probably meant:
header X-S3-URI {uri} {
defer
}
@mholt Thanks!
I did like you suggested and now I'm getting the X-S3-URI
again, but once again, with the non-rewritten URI.
I activated the global debug
to the URI the reverse was being sent:
{
"level": "debug",
"ts": 1674160975.7986755,
"logger": "http.handlers.reverse_proxy",
"msg": "upstream roundtrip",
"upstream": "canary_web:8000",
"duration": 0.087298235,
"request": {
"remote_ip": "51.15.194.75",
"remote_port": "56242",
"proto": "HTTP/2.0",
"method": "GET",
"host": "cloud.canary.kaikosystems.com",
"uri": "/attachment/01835a39-af47-7a7e-a7d5-6d7063fb7185",
"headers": {
"X-Real-Ip": [
"51.15.194.75:56242"
],
"Accept": [
"*/*"
],
"User-Agent": [
"curl/7.87.0"
],
"X-Forwarded-For": [
"51.15.194.75"
],
"X-Forwarded-Proto": [
"https"
],
"X-Forwarded-Host": [
"cloud.canary.kaikosystems.com"
]
},
"tls": {
"resumed": false,
"version": 772,
"cipher_suite": 4865,
"proto": "h2",
"server_name": "cloud.canary.kaikosystems.com"
}
},
"headers": {
"Content-Length": [
"0"
],
"X-Content-Type-Options": [
"nosniff"
],
"Cross-Origin-Opener-Policy": [
"same-origin"
],
"Vary": [
"Origin"
],
"Server": [
"uvicorn"
],
"Content-Type": [
"text/html; charset=utf-8"
],
"X-Frame-Options": [
"DENY"
],
"Referrer-Policy": [
"same-origin"
],
"Date": [
"Thu, 19 Jan 2023 20:42:54 GMT"
],
"Location": [
"https://cloud.canary.kaikosystems.com/media/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SCW9MDFB3PJGAQ5GBWZA%2F20230119%2Ffr-par%2Fs3%2Faws4_request&X-Amz-Date=20230119T203023Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=98c2380284b8f5b886decb676e1e4f8108a3dfc0bd633db3a9aa862f5633008b"
],
"X-Accel-Redirect": [
"/media/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SCW9MDFB3PJGAQ5GBWZA%2F20230119%2Ffr-par%2Fs3%2Faws4_request&X-Amz-Date=20230119T203023Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=98c2380284b8f5b886decb676e1e4f8108a3dfc0bd633db3a9aa862f5633008b"
]
},
"status": 302
}
Then come two logs from the rewriter, and we can see the problem there. The first rewrite is our rewrite * {rq.header.X-Accel-Redirect}
. It already lacks the query string:
{
"level": "debug",
"ts": 1674160975.7993755,
"logger": "http.handlers.rewrite",
"msg": "rewrote request",
"request": {
"remote_ip": "51.15.194.75",
"remote_port": "56242",
"proto": "HTTP/2.0",
"method": "GET",
"host": "cloud.canary.kaikosystems.com",
"uri": "/attachment/01835a39-af47-7a7e-a7d5-6d7063fb7185",
"headers": {
"Accept": [
"*/*"
],
"User-Agent": [
"curl/7.87.0"
]
},
"tls": {
"resumed": false,
"version": 772,
"cipher_suite": 4865,
"proto": "h2",
"server_name": "cloud.canary.kaikosystems.com"
}
},
"method": "GET",
"uri": "/media/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg"
}
The second rewrite simply removes the '/media' from the URL:
{
"level": "debug",
"ts": 1674160975.799484,
"logger": "http.handlers.rewrite",
"msg": "rewrote request",
"request": {
"remote_ip": "51.15.194.75",
"remote_port": "56242",
"proto": "HTTP/2.0",
"method": "GET",
"host": "cloud.canary.kaikosystems.com",
"uri": "/media/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg",
"headers": {
"Accept": [
"*/*"
],
"User-Agent": [
"curl/7.87.0"
]
},
"tls": {
"resumed": false,
"version": 772,
"cipher_suite": 4865,
"proto": "h2",
"server_name": "cloud.canary.kaikosystems.com"
}
},
"method": "GET",
"uri": "/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg"
}
Finally comes the internal redirect:
{
"level": "debug",
"ts": 1674160976.0522823,
"logger": "http.handlers.reverse_proxy",
"msg": "upstream roundtrip",
"upstream": "kaiko-canary.s3.fr-par.scw.cloud:443",
"duration": 0.2523489,
"request": {
"remote_ip": "51.15.194.75",
"remote_port": "56242",
"proto": "HTTP/2.0",
"method": "GET",
"host": "kaiko-canary.s3.fr-par.scw.cloud",
"uri": "/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg",
"headers": {
"X-Forwarded-Proto": [
"https"
],
"X-Forwarded-Host": [
"cloud.canary.kaikosystems.com"
],
"User-Agent": [
"curl/7.87.0"
],
"Accept": [
"*/*"
],
"X-Forwarded-For": [
"51.15.194.75"
]
},
"tls": {
"resumed": false,
"version": 772,
"cipher_suite": 4865,
"proto": "h2",
"server_name": "cloud.canary.kaikosystems.com"
}
},
"headers": {
"Content-Type": [
"application/xml"
],
"Date": [
"Thu, 19 Jan 2023 20:42:56GMT"
],
"X-Amz-Id-2": [
"tx115f1623206f43fa8f651-0063c9ab4f"
],
"X-Amz-Request-Id": [
"tx115f1623206f43fa8f651-0063c9ab4f"
]
},
"status": 403
}
A little trick helps:
rewrite * {rp.header.X-Accel-Redirect}?
I did some thinking, and I think the best solution is to provide a new option to rewrite
to force-allow query replacement. See https://github.com/caddyserver/caddy/pull/5438, looks like this:
rewrite * {rp.header.X-Accel-Redirect} {
modify_query
}
I just wanted to add a working Caddyfile snippet I am using with Caddy 2.7.6:
@sendfile header X-Accel-Redirect *
handle_response @sendfile {
rewrite * {rp.header.X-Accel-Redirect}?
reverse_proxy https://dev-bucket.s3.amazonaws.com {
header_up Host {http.reverse_proxy.upstream.host}
header_down -Content-Encoding
header_down -Content-Type
header_down -Accept-Ranges
header_down -Cache-Control
header_down -Server
header_down -X-Amz-*
}
}
The problem with that is it'll append another ?
at the end of the query which is invalid syntax (it might cause the last query param to have ?
added to the value).
That's not what I see in my testing with Caddy 2.7.6 or master @francislavoie.
rewrite * {rp.header.X-Accel-Redirect}
produces a URI that is missing all query params
rewrite * {rp.header.X-Accel-Redirect}?
produces a correct URI with proper query params (and without a ?
at the end of the query string).
That might have to do with how rewrite decides whether to rewrite the query or not. If you put a question mark at the end it signals to the rewrite handler to also rewrite the query string. That's my guess :)
any tips on where I can add a test so we're all on the same page with this rewrite behavior (and so it doesn't get altered without a bit of discussion)? this X-Accel-Redirect with S3 seems like a pretty common use case
@ottenhoff You mean like a rewrite that is {placeholder}?
? That is already tested here:
@mholt that's not quite the same because {path}
is guaranteed to never contain a query, but {rp.header.X-Accel-Redirect}
could contain a query.
I'm still trying to figure out why this config works and produces a transparent redirect to the S3 asset with correct GET params (?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKQS...)
@sendfile header X-Accel-Redirect *
handle_response @sendfile {
rewrite * {rp.header.X-Accel-Redirect}?
reverse_proxy https://dev-bucket.s3.amazonaws.com {
header_up Host {http.reverse_proxy.upstream.host}
and this rewrite results in /path/to/s3/asset with all necessary AWS GET params stripped:
@sendfile header X-Accel-Redirect *
handle_response @sendfile {
rewrite * {rp.header.X-Accel-Redirect}
reverse_proxy https://dev-bucket.s3.amazonaws.com {
header_up Host {http.reverse_proxy.upstream.host}
@francislavoie Right; that part still needs a patch (I think you've had a PR open on it but I just haven't put the energy into reviewing it yet :sweat_smile: )
Hey ! I'm trying to configure a pretty simple X-Accel-Redirect kind of setup under Caddy 2.6.2, but think I encountered an issue that prevents me to do so.
My stripped down Caddyfile looks like this:
The
service_a
returns a response with a headerX-Accel-Redirect: /hello?some=param&some_other=param
. This is be picked by Caddy and rewritten so that this query is handled byservice_b
. The rewrite of the path works, but the GET params get swallowed and are not visible toservice_b
where I only see/hello
.It works as expected when I hardcode the same value in the caddy file like this (which IMO should be 100% equivalent).
Just to understand what's happening, I tried with some variants such as
rewrite * /hello?{rp.header.X-Accel-Redirect}
with just the paramters in the header. In this case,service_b
seeshttp://localhost/hello?some%3Dparam%26some_other%3Dparam
, so not good either.Seems some rogue urlencoding happens, but I couldn't find anything in the doc about that.
Any pointer about how to solve this ? I searched through the issues but could not find anything about this exactly.