Hi @hveiga thank you so much for this contribution! I am going to make time to review/test tomorrow and get back to you with any comments.
So I'm still learning more about S3 Express and will be setting up a test for myself in the coming days to make sure I can review this properly. However, after reading the PR I think there are some things for us to discuss:

1. You changed the path-style host from "${S3_SERVER}:${S3_SERVER_PORT}" to "${S3_BUCKET_NAME}.${S3_SERVER}:${S3_SERVER_PORT}", but based on my understanding the host of a path-style request should not include the bucket name. Is it possible that for your case you need to set S3_STYLE to virtual?
2. I'd rather not duplicate the default.conf file, since it will be hard to keep both copies in sync. We need to find a way to accomplish this in one file using nginx conditional logic or njs (I can help with this once we understand item 1 above).
3. I'm also unsure about baking the bucket name into the S3_SERVER configuration setting, but maybe we can add some documentation around it as we reach a conclusion on this PR.

So basically I'd like to dig in a bit to the changes in the hostname used in signing to make sure they are necessary (I couldn't find any docs that noted this, but I could very well be missing something). If they are, we may need to restructure some of the logic in default.conf or utilize a js_set handler to add some more complex conditional logic.
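For context, the host-header selection I'm referring to is the map in default.conf.template, which currently looks roughly like this (surrounding config trimmed):

    map $S3_STYLE $s3_host_hdr {
        virtual "${S3_BUCKET_NAME}.${S3_SERVER}";
        path "${S3_SERVER}:${S3_SERVER_PORT}";
        default "${S3_BUCKET_NAME}.${S3_SERVER}";
    }

A js_set handler would let us compute $s3_host_hdr in njs if a plain map turns out to be too limited for the S3 Express case.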
Your introduction of the S3_SERVICE config variable and its default look good 👍
All your concerns are valid and I agree with them. To be honest I am not too keen on the approach I presented, but I wanted to get some progress going. Let me add some insights to your points:
I tried setting S3_STYLE to virtual but that did not work. The main reason is that when you have S3_SERVER set to an S3 general endpoint like s3-us-west-2.amazonaws.com, the upstream works because it can be accessed and nginx does not panic. When using S3 Express and following the same logic, the endpoint for S3_SERVER would be something like s3express-usw2-az1.us-west-2.amazonaws.com. That unfortunately cannot be reached by nginx, and it panics. Maybe there is another combination I have not found to make this work, but I did try quite a few with no success, except for having the full name: my-directory-bucket--usw2-az5--x-s3.s3express-usw2-az5.us-west-2.amazonaws.com:443.
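In other words, the rendered upstream ends up needing to resolve the full virtual-host name, along the lines of this sketch (the upstream block name here is only illustrative):

    upstream storage_urls {
        server my-directory-bucket--usw2-az5--x-s3.s3express-usw2-az5.us-west-2.amazonaws.com:443;
    }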
I am not in love with duplication but I did not want to complicate the env var substitution either. If we can add some conditional logic to keep a single file, I am all for it.
I have tried the zonal endpoints as well with no luck. Again, I may have missed something so another check from your side would be greatly appreciated.
I am not too familiar with js_set in general, but let's use it if that simplifies the changes.
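A js_set-based approach would presumably look roughly like the sketch below; the module path and function name are made up for illustration:

    # Hypothetical sketch only: compute the upstream Host header in njs
    js_import s3host from include/s3express_host.js;   # made-up module
    js_set $s3_host_hdr s3host.hostHeader;             # hostHeader(r) would pick bucket.server
                                                        # vs server:port based on S3_STYLE / S3_SERVICE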
Thanks for the quick follow-up @hveiga - sounds like our next task is to compare our understanding of the connection requirements. I'm going to do some testing myself and report back, then we can compare notes. This might take me a day or two but I'll try to prioritize it - very much appreciate your involvement in bringing this feature to the gateway.
Hi @hveiga so I set up an S3 Express One Zone bucket and I was able to confirm that we are actually able to use it without any changes to the gateway code. I've documented my findings here: https://github.com/nginxinc/nginx-s3-gateway/pull/229
Can you take a look and see if the described configuration works for your use case?
In the process, I learned some things about the gateway code that might be relevant:
* The upstream is set by S3_SERVER (https://github.com/nginxinc/nginx-s3-gateway/blob/master/oss/etc/nginx/templates/upstreams.conf.template), so the bucket name is not automatically added.
* It appears to work even without using s3express in the signature.

I think at this point, if we have a way of using the express zone product with the gateway, then documenting it can be our first step. I'm trying to limit the number of additional configuration variables in the project until I have a chance to do a bigger refactor of how the project works. However, if there are any smaller usability changes we can make to improve the experience for S3 Express, please let me know.
I tried what you are suggesting in #229 but unfortunately it does not seem to work for my bucket. To eliminate potential issues, I recreated the bucket with a different name and double-checked that I can access it with my AWS credentials, but I still get the same 404 error in the logs.
With debug enabled, I can see the host is incorrect (my-bucket-name appears twice): my-bucket-name.my-bucket-name.s3express-usw2-az1.us-west-2.amazonaws.com. The settings I am using should be identical to the other ones:
S3_BUCKET_NAME=my-bucket-name
AWS_ACCESS_KEY_ID="REDACTED"
AWS_SECRET_ACCESS_KEY="REDACTED"
AWS_SESSION_TOKEN="REDACTED"
S3_SERVER=my-bucket-name.s3express-usw2-az1.us-west-2.amazonaws.com
S3_SERVER_PORT=443
S3_SERVER_PROTO=https
S3_REGION=us-west-2
S3_STYLE=virtual
DEBUG=true
AWS_SIGS_VERSION=4
ALLOW_DIRECTORY_LIST=false
PROVIDE_INDEX_PAGE=false
APPEND_SLASH_FOR_POSSIBLE_DIRECTORY=false
DIRECTORY_LISTING_PATH_PREFIX=""
PROXY_CACHE_MAX_SIZE=10g
PROXY_CACHE_SLICE_SIZE="1m"
PROXY_CACHE_INACTIVE=60m
PROXY_CACHE_VALID_OK=1h
PROXY_CACHE_VALID_NOTFOUND=1m
PROXY_CACHE_VALID_FORBIDDEN=30s
For completeness, I am running ghcr.io/nginxinc/nginx-s3-gateway/nginx-oss-s3-gateway:latest-20240306.
If we can get this working with no changes, that would be best. Thank you for looking into this.
Hey, so I just tested this again using the build you mentioned and was able to get it to work. I don't see an issue with the config you posted. I am definitely open to the possibility that I'm missing something in my process, but looking at the error you're getting, I suspect that the code from this pull request is involved. Here's the diff from your modified default.conf.template:
--- a/common/etc/nginx/templates/default.conf.template
+++ b/common/etc/nginx/templates/default.conf.template
@@ -15,13 +15,13 @@ map $request_uri $uri_full_path {
# Remove/replace a portion of request URL (if configured)
map $uri_full_path $uri_path {
- "~^$STRIP_LEADING_DIRECTORY_PATH(.*)" $PREFIX_LEADING_DIRECTORY_PATH$1;
+ "~^$STRIP_LEADING_DIRECTORY_PATH(.*)" $PREFIX_LEADING_DIRECTORY_PATH$1;
default $PREFIX_LEADING_DIRECTORY_PATH$uri_full_path;
}
map $S3_STYLE $s3_host_hdr {
virtual "${S3_BUCKET_NAME}.${S3_SERVER}";
- path "${S3_SERVER}:${S3_SERVER_PORT}";
+ path "${S3_BUCKET_NAME}.${S3_SERVER}:${S3_SERVER_PORT}";
default "${S3_BUCKET_NAME}.${S3_SERVER}";
}
@@ -134,7 +134,7 @@ server {
# Enable passing of the server name through TLS Server Name Indication extension.
proxy_ssl_server_name on;
- proxy_ssl_name ${S3_SERVER};
+ proxy_ssl_name ${S3_BUCKET_NAME}.${S3_SERVER};
# Set the Authorization header to the AWS Signatures credentials
proxy_set_header Authorization $s3auth;
@@ -195,7 +195,7 @@ server {
# Enable passing of the server name through TLS Server Name Indication extension.
This would result in the bucket name being included twice if it's added as part of the S3_SERVER config. Just so we can rule out that potential issue and move on with the solutioning, would you mind double-checking that you've not got that version running somewhere?
I ran the changes from a fresh folder containing only the settings file, running Docker locally. Same result, unfortunately.

One problem I see is the line virtual "${S3_BUCKET_NAME}.${S3_SERVER}"; from default.conf.template.
If we have:

* S3_BUCKET_NAME = my-bucket-name
* S3_SERVER = my-bucket-name.s3express-usw2-az1.us-west-2.amazonaws.com
Then virtual "${S3_BUCKET_NAME}.${S3_SERVER}"; would become my-bucket-name.my-bucket-name.s3express-usw2-az1.us-west-2.amazonaws.com. That happens with the version of the code available in main right now: https://github.com/nginxinc/nginx-s3-gateway/blob/7f3064b2c5cf08871164609de8d7e3f972e0a6a5/common/etc/nginx/templates/default.conf.template#L23
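For illustration, substituting those values into the map on main renders as:

    map $S3_STYLE $s3_host_hdr {
        virtual "my-bucket-name.my-bucket-name.s3express-usw2-az1.us-west-2.amazonaws.com";
        path "my-bucket-name.s3express-usw2-az1.us-west-2.amazonaws.com:443";
        default "my-bucket-name.my-bucket-name.s3express-usw2-az1.us-west-2.amazonaws.com";
    }

so with S3_STYLE=virtual the doubled name is exactly the Host header that ends up being sent upstream.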
Can you check on your side what the value of virtual is?
That's a good point. That code sets the Host header passed along to S3, so we should make sure it is correct. That, along with setting the service name used to sign requests to s3express, should be part of the changes we make if we want to do this right.
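For context, the service name shows up in the SigV4 credential scope of the Authorization header the gateway generates. A made-up, illustrative example of what a signed request to a directory bucket would carry:

    Authorization: AWS4-HMAC-SHA256
        Credential=AKIAEXAMPLEKEY/20240412/us-west-2/s3express/aws4_request,
        SignedHeaders=host;x-amz-content-sha256;x-amz-date,
        Signature=<hex digest>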
I pushed up some changes here: https://github.com/nginxinc/nginx-s3-gateway/pull/229 so that we can go back to not including the S3 bucket in the S3_SERVER configuration variable, as well as changing the service name. Note that this is not how I want to implement it in the end, but would you mind testing this?
In case it helps, here are the commands I'm using:
# From root of project
git checkout issue-225-usage-explanation
docker build --file Dockerfile.oss --tag issue-225 .
docker run --rm --env-file ./deployments/s3_express/settings.s3express.example --publish 80:80 --name issue-225 issue-225
Note that the S3_SERVER should be s3express-usw2-az1.us-west-2.amazonaws.com (or whichever one matches your availability zone) and S3_BUCKET_NAME is the full directory bucket name: my-bucket-name--usw2-az1--x-s3
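With those values, the virtual-host name the gateway builds ("${S3_BUCKET_NAME}.${S3_SERVER}") should come out with the directory bucket prefixed exactly once, i.e. roughly:

    Host: my-bucket-name--usw2-az1--x-s3.s3express-usw2-az1.us-west-2.amazonaws.com

rather than the doubled name from earlier.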
If you receive any errors - can you let me know the exact error message? It's still strange to me that I'm seeing this work and you are not. I may have some others also test to see if I'm missing some obvious misconfiguration on my test setup.
Finally working! I can confirm that with the changes in #229 it works for me.
Also, I tested with both s3 and s3express as the service and it works either way, so I would not really make that change. I also verified that regular S3 General buckets work as well with those changes. I'll leave a couple of comments on the PR. Thanks for looking into this, greatly appreciated.
Thank you for taking the time to work through this with me and test it out. I'm going to adjust my PR to productionize this and then ask you for a review. I'll respond to your feedback in the other PR.
I am closing this in favor of #229.
These changes aim to bring support for S3 Express directory buckets (#225).

Overall, the required changes are the following:

* The signing service needs to be s3express instead of s3.
* default.conf needs to use the full bucket + server URL, for example my-directory-bucket--usw2-az5--x-s3.s3express-usw2-az5.us-west-2.amazonaws.com:443.
* upstreams.conf needs to use the full bucket + server URL as well, for example:

server my-directory-bucket--usw2-az5--x-s3.s3express-usw2-az5.us-west-2.amazonaws.com:443; }

Be sure to specify the port in S3_SERVER, and be sure that port corresponds to the https/http used in the proxy_pass directive.

This PR does not implement the new session-based S3 Express auth model to speed up requests.

I am looking for some guidance on how this should be implemented. I introduced a new env var S3_SERVICE, and based on it the code conditionally replaces default.conf and upstreams.conf. I am not sure if this is the desired approach, but I am happy to receive feedback on it.

Also, when running tests with ./test.sh oss I am getting 2024/04/12 18:25:15 [emerg] 1#1: SyntaxError: "awscredentials" has already been declared in /etc/nginx/:3 and I am not sure how to get past that issue.