Closed sumitag closed 2 years ago
Would this be the aws_session_token
token that is often used with SSO on AWS accounts? Can you provide some more details about what type of temporary credential you are thinking of?
A very important enhancement request actually. It is considered to be very insecure to save a permanent AWS_ACCESS_KEY and AWS_SECRET_KEY somewhere. It's even forbidden in a lot of enterprise environments. Instead roles should be used to allow AWS components to access services they need.
So let's assume this nginx-s3-gateway would run on a AWS EC2 instance, with the right role applied to the instances it would be allowed to access the S3 bucket with temporary keys which can be retrieved from the EC2 metadata, as described here: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2.html
That would be a huge security improvement for the nginx-s3-gateway, so I clearly vote up for this, new feature request.
"Retrieve security credentials from EC2 instance metadata or ECS task metadata endpoint" ?
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html#instance-metadata-security-credentials https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-metadata-endpoint.html
"Retrieve security credentials from EC2 instance metadata or ECS task metadata endpoint" ?
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html#instance-metadata-security-credentials https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-metadata-endpoint.html
Depending on how you are running it, if nginx is running on ec2 then instance if ecs then task.
Hi, @dekobon , are there any news or plans about that?
This feature would also be hugely beneficial to my team.
I would like some help from the community in designing this feature. I've got the following questions:
Do the security credentials need to be refreshed at a periodic interval? If so, it will be difficult to implement the feature purely within nginx.
I can imagine a few implementations:
Does anyone have other ideas about how to implement this?
- Maybe, this is a big maybe, a subrequest could be done when a njs variable containing a timestamp is exceeded which then resets to security credentials. This would result in increased latency for the request that would trigger the credentials update.
Appreciate your response! I had looked into implementing the IAM role feature, and this is the path that I had initially considered. Though the other approaches seem interesting as well.
I'll highlight some of the information that I've found in my research so far.
curl 169.254.170.2$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
(link)TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600") && curl -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/s3access
(link)Token
is issued in addition to an AccessKeyId
and SecretAccessKey
. The Token
must be included in all API requests that use temporary credentials (this differs from the long-lived credentials, which don't have a Token
to be included in API requests)Hopefully this is helpful. I'll continue to explore this on my own as time permits and I'm happy to help do any research that might get this feature implemented quickly!
EC2 requires two HTTP requests to retrieve credentials: TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600") && curl -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/s3access (link)
I don't know if this is accurate. If you click on the IMDSv1
tab from that page, it shows a single HTTP request that looks like it can be used to get the temporary credentials for EC2.
I'm a complete newbie to how this project works :) However, I like it 👍 I just wanted to answer the same information as @ajschmidt8 posted already, so I will just extend on that :P
The expiration of credentials for a role of an ECS task running on the fargate compute type is, by default and currently, 6 hours, iirc. That means, we could hold a copy of these credentials for quite a long time actually. The response from the endpoints mentioned by @ajschmidt8 are also containing the date and time when these credentials will expire. At least for EC2 (I assume it's the same for ECS), credentials are automatically renewed by the endpoints at least 5 minutes before the current ones expire.
While looking at the Compatibility of the njs module, I came to a quick idea of how this could be implemented:
setTimeout
helper to "schedule" a refresh of the credentials (by doing the same fetch-procedure again) 1-5 minutes before the credentials expireBy doing so, nginx fetches the credentials asynchronously to actual requests (except for the first one, but even that could be worked around by adding a refreshCredentials
or so method which is called when the server starts?). Preventing any overhead to the webrequests by the credential flow.
What do you think? Would that be possible, or did I miss an important details? :)
EC2 requires two HTTP requests to retrieve credentials: TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600") && curl -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/s3access (link)
I don't know if this is accurate. If you click on the
IMDSv1
tab from that page, it shows a single HTTP request that looks like it can be used to get the temporary credentials for EC2.
While IMDSv1 is still possible to use, one can disable v1 on a per-instance configuration. Also, v2 protects against some possible attacks which were possible with v1 (https://aws.amazon.com/blogs/security/defense-in-depth-open-firewalls-reverse-proxies-ssrf-vulnerabilities-ec2-instance-metadata-service/). Because of that, implementing a new feature, which requires v1, might be difficult to justify. There might even be environments, where the v1 must be disabled and not be used for security or compliance reasons.
I also think that, with an async approach, using v1 doesn't really give a big benefit anymore (compared to on a per-request basis).
I spiked something together :D It's not nearly in a state to be in a PR but maybe it is already enough to get a discussion started about my proposed solution: https://github.com/nginxinc/nginx-s3-gateway/compare/master...FlorianSW:issue/8?expand=1 (I hope you can see this diff)
setTimeout
should be easy enough, once we know the current logic works, kind of.I'll explore a bit more in the next days or weeks whenever I find a bit of time. However, any feedback is welcome :) (especially int he njs area, as this is completely new to me).
i used my changes I mentioned yesterday and deployed them to ECS. A bit as expected it did not work. After trying to figure out what is going on, I think I now face two fundamental problems with njs (which are by design as far as I know), which, until now, I did not find any solution for (nor a good one):
njs starts a new VM for the scripting for each request (? at least that's how I understood it so far). This brings the issue, that there is currently no way to share variables or similar things in-memory with subsequent requests (not _sub_requests). Given there are two distinct requests, which need to be fetched from S3, there is currently no way for request B to know if (and which) credentials were already fetched by request A, nor a way to share these with them. I first tried to resolve this by storing the credentials in a variable (writable var with js_var
), but it seems these variables are not shared between requests, either. I think, this issue can only be resolved in two ways:
the credential fetch process is asynchronous, however, adding the signature to the Authorization header (via js_set
) is a synchronous action. In my current code, I use js_set
as well for the credentials fetch trigger, which works great, however, the call to the s3auth method does not wait for the async function to fetch credentials to finish, hence erroring right now. That might be an even bigger blocker, as I do not have any idea for now how to handle that.
last but not least: I didn't find a way with njs to execute code when the config is loaded (something like "on startup of nginx") to initially fetch the credentials. That would, for now, mean, that the first request would need to wait for the credentials to be fetched, which may increase the response of that request slightly. Not sure, how to solve this.
Still didn't look into refreshing the credentials (the issue might also be: setTimeout would not have a reference to a request, hence no way to update the credentials variable, even if it would be a keyval (in order to resolve the first issue)). Also: no idea as of now.
@FlorianSW, just wanted to say that I appreciate the effort here! Hopefully you and @dekobon can come up with something that works.
Hi @FlorianSW,
njs starts a new VM for the scripting for each request (? at least that's how I understood it so far)
Yes, you are right. Currently, all requests are totally independent by design. The only way to communicate is only through keyval or any other (potential) shared memory module.
the credential fetch process is asynchronous, however, adding the signature to the Authorization header (via js_set) is a synchronous action.
Take a look at the following example.
maybe I just did not found another solution. It does not really need to be a solution to "sync" these values between workers, either. Only between requests would be totally ok.
The quick, but not so elegant solution would be implementing setenv
for setting environment variables per process (See process.env ).
last but not least: I didn't find a way with njs to execute code when the config is loaded (something like "on startup of nginx") to initially fetch the credentials
not implemented yet.
Still didn't look into refreshing the credentials (the issue might also be: setTimeout would not have a reference to a request, hence no way to update the credentials variable, even if it would be a keyval (in order to resolve the first issue)). Also: no idea as of now.
One way to do this is to refresh credentials once they are detected to be stale.
njs starts a new VM for the scripting for each request (? at least that's how I understood it so far)
Yes, you are right. Currently, all requests are totally independent by design. The only way to communicate is only through keyval or any other (potential) shared memory module.
Yeah, I kind of expected such an answer :/
Take a look at the following example.
I actually do the example of "generating a JWT". Means:
r.variables.credentials
variable (which is connected via js_set
to an async handler)r.variables.credentials
variable) does not hold on that access of the variable and is further processed while the async handler is still fetching the credentials. This basically results in an error in JSON.parse()
(not actually sure, if undefined is passed in in that case or whatever value)Based on that it seems I'm doing something wrong 😅
The quick, but not so elegant solution would be implementing setenv for setting environment variables per process (See process.env ).
I thought about that already as well. Basically, we "just" need to save three values (they even do not need to be persistent) somewhere. The environment was something I was thinking about as well, as the values would reside there anyway, if one does not use the instance credentials. If I see it correctly (and reading that into your "would be implementing setenv
"), such a functionality does not yet exists, am I right?
What would be your thoughts about such an implementation? I mean, one could then change the environment of the nginx process (or at least the worker process) itself, wouldn't it? Any other implementation wouldn't make much sense, and, at least for this use case, wouldn't work either.
Giving this a bit more thought, I wouldn't call it "not so elegant" anymore to be honest. Like I said, our requirements for persistency and even synchronicity between worker processes are really low (basically: no need for any of that), as these credentials can simply be retrieved again when they are not present (anymore).
not implemented yet.
Is there any plan on having something like that? I mean, right now, it would not be a big issue, if that is not present as a feature (given we can work around the other things, the first request might get impacted in performance, however, that should be ok every 6 hours or so).
One way to do this is to refresh credentials ones they are detected to be stale.
But that would be in the scope of an actual request, right? Meaning, if a request is served and the credentials are found to be expired, they would be refreshed before the request can be served, right?
@FlorianSW
However, the s3auth code (accessing the r.variables.credentials variable) does not hold on that access of the variable and is further processed while the async handler is still fetching the credentials.
Unfortunately, js_set
does not support truly async methods (like, setTimeout, ngx.fetch(), r.subrequest()) due to nginx variable interface limitations.
The limitation is that nginx is not able to wait for a variable evaluation asynchronously. (WebCrypto async methods are working right now, because they can be executed synchronously in a JS call.)
But, as I mentioned above, the js_set
limitation can be circumvented using auth_request.
Writing the credentials to a temp file (meh... and... meh;
BTW, this is not as bad as it may sound. nginx is reading from FS all the time. In most cases, frequently accessed files are always in a file memory cache, so reading such a file is usually fast. (The FS solution is definitely suboptimal when the file content is changed often and not in atomic fashion, but the credential example is different).
If I see it correctly (and reading that into your "would be implementing setenv"), such a functionality does not yet exists, am I right?
yes, setenv is not implemented.
What would be your thoughts about such an implementation? I mean, one could then change the environment of the nginx process (or at least the worker process) itself, wouldn't it?
Yes, this is will be global (process level) change, visible to other requests.
I think this setenv()
makes sense for use cases similar to your.
I think we may add these feature for the next release.
But that would be in the scope of an actual request, right? Meaning, if a request is served and the credentials are found to be expired, they would be refreshed before the request can be served, right?
I remembered the good (nginx) way to handle expired data (like credentials).
Use r.subrequest()
or auth_request
to a proxy_pass
location with enabled cache. When cache is set up correctly, most of the time the data will be served from the cache and as soon as the data expires it will be refetched by proxy_pass automatically.
@FlorianSW
Unfortunately, js_set does not support truly async methods (like, setTimeout, ngx.fetch(), r.subrequest()) due to nginx variable interface limitations.
just to clarify. As of now, unlike js_set
, the truly asynchronous context is js_content
.
Alright, I played around a bit in the last days with what inputs @xeioex gave in the recent comments, and it seems I have a working proof of concept, at least :) For the setup:
The current flow looks like this:
auth_request
is used to call an internal locationjs_content
to trigger an async njs function that basically fetches credentials by one of these methods (based on env variables)
tmp
directoryAuthorization
header is added with the already known s3auth
function
Caveats:
I'll update the code in my branch in the evening, if I do not forget it 🙈
I just made the last changes I wanted to make, which now also includes refreshing credentials, when the cached ones reach their expiration time. This should make sure, that always fresh credentials are used to access the respective S3 bucket. As the default expiration in ECS is 6 hours, I wasn't able to test out the refresh directly, however, it's the same logic as if no credentials are there. Assuming that the Date
object works in njs, and that AWS returns the true expiration (which they do :P), I don't see any reason why this should not work.
As I'm more or less happy with my code, I opened a PR (#18) for all of you to review :)
There is also the open points that needs testing:
After all: Every review, suggestion and feedback is welcome :)
This is huge! Thank you for all of your hard work on this @FlorianSW. I will try to find some time this weekend to test this out.
I'd like to move discussion regarding instance profile support to github issues rather than the PR because there is more to discuss than what can be contained in a single PR.
First off, I've made a number of changes in the branch version.
test.sh
script now supports two new parameters latest-njs-oss
and latest-njs-plus
. For now, we will need to use a latest-njs-*
version in order to get the instance profile support to work. When a new release version of njs lands, we will upgrade to it.Next, I see that we still need to do the following before integrating:
Add a instance profile configuration guide. There is a lot of useful information about configuring your instance profile to work with EC2, EC2/Docker, ECS, and Fargate. We may also want to add sections for instance profile support on other clouds with S3 compatibility.
Any and all help is welcome!
Fixed in 0df77d631ea6873468d9c862187d7d524db0b7b4
Is it possible to use temporary credential retrieved from instance profile/role?