Is secret Rotation/Update managed?

cyberark / summon

CLI that provides on-demand secrets access for common DevOps tools

https://cyberark.github.io/summon

MIT License

704 stars 62 forks source link

Is secret Rotation/Update managed? #132

Closed phisco closed 4 years ago

phisco commented 4 years ago

Hello,

I didn't find any clear reference to this in the documentation and my tests did not confirm it either. So I'll try to ask it here, as it looks strange to me that it is not handling it out of the box. I'm trying to check if summon is able to update the env variables or files created from secrets stored into a DAP instance once these secrets have been updated.

Let me know if you think the question has to be moved to the specific conjur provider.

sgnn7 commented 4 years ago

Hey @phisco, I can answer your question here. Summon in its current form does not support changed variables if you use it exactly as documented due to the complexities that make it infeasible either due to UX degradation and/or suboptimal architecture so let me cover those briefly but at the bottom I will mention a different approach and also a different product that can be used to get your code working:

Problems

Coordination between summon and the launched process is minimal and we would need a lot of work on both summon and the launched process to get bilateral comm to work. Even if it did work though, you would probably be dealing with soft-reloads at a minimum.
Env vars cannot be changed dynamically for a child process without really tricky and/or privileged operations which most of our users do not want to elevate summon into.
If summon is handling resolution of variables with no communication with the child process, we would have to poll for changes which would degrade performance significantly even in minimal setups.

Workaround (`summon`-specfic)

Instead of having summon launch your process, you could have your process launch summon instead when it needs the credentials and you could capture that input. This would ensure that you have the right data at the right time without polling.

Alternative (Secretless Broker)

Due to this problem, we have an open-source product called Secretless Broker that will fix this for you (plus it has additional security benefits of keeping credentials secure from your app) if the cred data is used for a supported connection type. You can take a look at the github repository or the website to find out more about it.

Alternative #2 (code contrib)

We do support addition to the codebases so if you feel like something is missing contributions are welcome and we can also try putting it on the internal roadmap too if it seems valuable enough to the wider community*.

*- and realistically we also would have to find some graceful architectural solutions to the problems listed previously

I will close this issue for now but feel free to leave replies to my comment and we can figure out if we can find a way to maybe get your use case working in some way.

phisco commented 4 years ago

Thanks @sgnn7 for your exhaustive reply, really appreciate!

Few feedbacks:

soft restarts (SIGHUP signal sent to the children process) would have been acceptable in my opinion.
I didn't think it really through, but yes, you are definitely right, injecting variables in the running process is not an option. But what about files? Quickly going through the code I did find how env vars are handled, but I wasn't able to find out how files (!file annotated secrets) are handled. I'll definitely try to give it a look in the following days.
the workaround you've proposed in order to still use summon could be an option, even if I would have preferred something able to handle expiration/rotation more transparently. I'd probably choose to use the provided Client Libraries or directly the REST APIs in that case.
I've already evaluated Secretless and I love the approach, but does it support JDBC (mostly interested in oracle)? Few months ago I was told it didn't, but now I see some references in the latest releases to JDBC connectors being tested (Postgres and MSSQL). I definitely have to dig deeper on that too.
I would definitely love to contribute. Clearly, as you said, the hard part is to define a good architectural solution.

sgnn7 commented 4 years ago

soft restarts (SIGHUP signal sent to the children process) would have been acceptable in my opinion.

Interesting point of view that I don't think we thought about much. That part could be doable though we would still be stuck polling I think.

Quickly going through the code I did find how env vars are handled, but I wasn't able to find out how files (!file annotated secrets) are handled.

They're also only fetched once. Getting an auto-update on these would be easy on the surface but there are a ton of edge cases (vs a soft-reload) that it would introduce (eg. summon re-writing the file while the client is reading it). The code in question is here.

the workaround you've proposed in order to still use summon could be an option, even if I would have preferred something able to handle expiration/rotation more transparently. I'd probably choose to use the provided Client Libraries or directly the REST APIs in that case.

Yeah for more advanced usages, APIs are the way to go. Our Python3 API has some self-extracting binaries as well if summon is to bulky to use though it's alpha-level support right now.

I've already evaluated Secretless and I love the approach, but does it support JDBC (mostly interested in oracle)?

The JDBC part should work fine it just matters if Secretless supports the target protocol (and in this case it would be Oracle). For Oracle, we have had positive strides there but due to some legalities it's really hard to get that in the open without explicit help from corp in question so for right now it's in limbo but we're trying to do what we can. 🤷‍♂

I would definitely love to contribute. Clearly, as you said, the hard part is to define a good architectural solution.

Well, the SIGHUP approach seems fine to me as a configurable option with a settable polling time so let me know if maybe an acceptance criteria like this sounds good:

summon has configurable flag to enable (disabled by default) flag to poll secrets
summon has a configurable flag to set the interval for polling (1 min as the default seems ok to me)
summon restarts the app on SIGHUP and provides it with the new variables
tests are available for all new functionality
very important: This would require summon to store some representation of credentials to be able to tell if the credential actually changed. This must not be plaintext but could probably be a strong hash (SHA265 or similar) of the "password + static salt (maybe version number?) + dynamic salt (eg. some crypto rand string)" and should optionally be kept in secured storage of some sort.

phisco commented 4 years ago

They're also only fetched once. Getting an auto-update on these would be easy on the surface but there are a ton of edge cases (vs a soft-reload) that it would introduce (eg. summon re-writing the file while the client is reading it). The code in question is here.

You are right, files too could be tricky to be substituted for a running process, I have to dig a little more on how it is implemented in kubernetes/docker, I guess it's using some filesystems/namespaces feature. Just to be clear, I was proposing a slightly different approach from the one you're proposed: once a file is updated summon should send a SIGHUP to the child process so that it can reload its configs (let's call it a softer restart). This is similar to how Secrets/ConfigMaps mounted as volumes are handled in Kubernetes. But I think in this specific case, being single files mounted and not entire volumes (single files are not updated in Kubernetes neither). Surely the issue is the atomicity of the update, what about using symlinks? That's how they do upgrades in Nix, here.

The JDBC part should work fine it just matters if Secretless supports the target protocol (and in this case it would be Oracle). For Oracle, we have had positive strides there but due to some legalities it's really hard to get that in the open without explicit help from corp in question so for right now it's in limbo but we're trying to do what we can. 🤷‍♂

I imagined something similar 😢

Well, the SIGHUP approach seems fine to me as a configurable option with a settable polling time so let me know if maybe an acceptance criteria like this sounds good:

* `summon` has configurable flag to enable (disabled by default) flag to poll secrets

* `summon` has a configurable flag to set the interval for polling (1 min as the default seems ok to me)

* `summon` restarts the app on SIGHUP and provides it with the new variables

* tests are available for all new functionality

* **very important**: This would require `summon` to store some representation of credentials to be able to tell if the credential actually changed. This _must not_ be plaintext but could probably be a strong hash (SHA265 or similar) of the "password + static salt (maybe version number?) + dynamic salt (eg. some crypto rand string)" and should optionally be kept in secured storage of some sort.

That's a tricky point, you are right. I think this should be tackled on two sides, summon should give providers the ability to "trigger" reloads / poll secrets and then for the specific providers we should think of a way to implement it. Is Conjur providing some kind of data w.r.t. the "freshness" of the secret?

sgnn7 commented 4 years ago

I was proposing a slightly different approach from the one you're proposed: once a file is updated summon should send a SIGHUP to the child process so that it can reload its configs (let's call it a softer restart). ... Surely the issue is the atomicity of the update, what about using symlinks? That's how they do upgrades in Nix, here.

This should work as long as we have the expectation that the secrets won't be read outside of child process startup which might be fine. The symlink approach sounds like a good option though (and as much as I hate to bring it up since that platform is almost never used) Windows will not be able to operate using this system so we will need to maintain/test separate logic paths for that.

Is Conjur providing some kind of data w.r.t. the "freshness" of the secret?

Hmm... excellent question - I know that Conjur has a notion of "versions" but I don't know if they monotonically increase or just point to last 20 versions (with ver 1 being latest and 20 being always farthest). Maybe @micahlee might know? If Conjur can provide this for us, then we can definitely not need to worry about storing some representation of the secret inside of summon.

Re: Oracle: I imagined something similar 😢

Just saying that if there's a repo with a Secretless plugin of someone implementing this use case that you can build and include at runtime with the broker's plugin system, that would get your use case to work too :)

PS: We appreciate your input on this very much and if you need any assistance in contributing, please let me know!

izgeri commented 4 years ago

Hey all :) this is a great conversation! I'm just jumping in to add a note re: @sgnn7's point that

very important: This would require summon to store some representation of credentials to be able to tell if the credential actually changed. This must not be plaintext but could probably be a strong hash (SHA265 or similar) of the "password + static salt (maybe version number?) + dynamic salt (eg. some crypto rand string)" and should optionally be kept in secured storage of some sort.

I would prefer to stay very far away from summon having to keep some secret vals in memory. we could set it up to function more like secretless instead, though, and have it just retrieve new values on the refresh interval regardless - then it wouldn't need to know the values change, it would just always get new values.

though at that point I am again drawn to the idea of extracting the summon credential provider logic from secretless to build summon2 and release that instead :)

sgnn7 commented 4 years ago

I would prefer to stay very far away from summon having to keep some secret vals in memory. we could set it up to function more like secretless instead, though, and have it just retrieve new values on the refresh interval regardless - then it wouldn't need to know the values change, it would just always get new values.

I think that this would be unfeasible as a solution to the original problem since summon would then not know when to restart the child process to notify it of changes and we wouldn't want to do it on every refresh cycle. Can you elaborate your proposal?

though at that point I am again drawn to the idea of extracting the summon credential provider logic from secretless to build summon2 and release that instead :)

Sadly, I don't think that the original issue of reloading would be solved with summon2 since the issue at hand would stay the same ("when and how do we notify the child process of changes").

phisco commented 4 years ago

I would prefer to stay very far away from summon having to keep some secret vals in memory. we could set it up to function more like secretless instead, though, and have it just retrieve new values on the refresh interval regardless - then it wouldn't need to know the values change, it would just always get new values.

Totally agree we should avoid keeping secrets in memory or at least not compare secrets values directly, but in this case some sort of timestamp / hash / monotonical id would have to be provided by conjur on each poll. If conjur only provides the kind of versioning @sgnn7 was mentioning, it would be really hard not to store some kind of hash. However, isn't summon already storing all the retrieved data in the runSubCommand env parameter? Am I missing something? Is some sort of memory lock (IPC_LOCK) adopted?

I think that this would be unfeasible as a solution to the original problem since summon would then not know when to restart the child process to notify it of changes and we wouldn't want to do it on every refresh cycle. Can you elaborate your proposal?

Agree that reloading configs (sending a signal or restarting the child process) should definitely be considered as an expensive operation and therefore should be minimized as much as possible.

So, to recap:

Polling is fine, as secretless is already doing it, correct? Has to be checked whether conjur is providing some timestamp/id to verify whether secrets have been updated or not.
Soft reloading through a SIGHUP signal sent to a child process could be acceptable, but would work only with files and not env variables. Symlinks could be an option worth exploring to have atomic updates on files, Windows could be supported as Go's os package offers symlinking also for it, but I'm not sure updates would still be atomic. But out of curiosity is Windows a real use case for summon? I'm coming from a mostly unix world, I'm just trying to understand the user base of the tool.
Env vars would require the child process to be restarted. In Kubernetes this would be an issue, because liveness/readiness probes would start to fail and depending on how they are configured and how long the process boot takes this could mean the whole container gets restarted. Which could be an issue. For this reason I think that not providing updates at all to secrets passed as env variables would be acceptable. As said this is the same behavior of Kubernetes' ConfigMaps/Secrets.
Secretless could be a better option as long as more protocols are officially supported, e.g. Oracle is mandatory for the kind of enterprise customers I work with.

sgnn7 commented 4 years ago

However, isn't summon already storing all the retrieved data in the runSubCommand env parameter?

Only if you don't use @SUMMONENVFILE. If you don't use that, I guess that's a good point but it is localized to a very small section of code. I'll file this as a new issue since we could zeroize those values right after we start the process either way regardless of what we decide here.

Polling is fine, as secretless is already doing it, correct?

Minor clarification here: secretless doesn't poll - it doesn't store any secrets and it fetches new provider values on new connection requests so it's an on-demand implicit refresh. Usually apps only need to do this once and they reuse the same DB connection.

But out of curiosity is Windows a real use case for summon?

Yes - there's a large user base working on Azure and Windows :(

In Kubernetes this would be an issue, because liveness/readiness probes would start to fail and depending on how they are configured and how long the process boot takes this could mean the whole container gets restarted.

What if the liveliness check is served up by summon?

Soft reloading through a SIGHUP signal sent to a child process could be acceptable, but would work only with files and not env variables.

Just a sidenote that I think env vars could work if we do soft/hard-exit (SIGKILL/SIGTERM) rather than soft-reload that leads to child process exit - it all depends on how forceful we are being with letting the child know that the vars have changed :)

Secretless could be a better option as long as more protocols are officially supported, e.g. Oracle is mandatory for the kind of enterprise customers I work with.

100% with you there!

phisco commented 4 years ago

Only if you don't use @SUMMONENVFILE. If you don't use that, I guess that's a good point but it is localized to a very small section of code. I'll file this as a new issue since we could zeroize those values right after we start the process either way regardless of what we decide here.

I see, found some more info on how @SUMMONENVFILE is handled on other issues, e.g. . Perfect, thank you.

Minor clarification here: secretless doesn't poll - it doesn't store any secrets and it fetches new provider values on new connection requests so it's an on-demand implicit refresh. Usually apps only need to do this once and they reuse the same DB connection.

Thanks for the clarification, did not yet have the time to dig in secretless' code. That makes totally sense, but obviously is not applicable for summon and I see no other option than to poll.

What if the liveliness check is served up by summon?

That's an option. Probes can be an http endpoint, a socket opened or a command. Summon should be able to just passthrough the request (http/socket) when the process is normally running and fake it while the process is being restarted. Obviously that's a problem only for applications starting in minutes (java usually...), not for applications starting in seconds.

Just a sidenote that I think env vars could work if we do soft/hard-exit (SIGKILL/SIGTERM) rather than soft-reload that leads to child process exit - it all depends on how forceful we are being with letting the child know that the vars have changed :)

Once the polling part is set up, adding a flag to allow also SIGKILL/SIGTERM to be sent to the child process and then restart it should not be an issue! 😉

izgeri commented 4 years ago

@sgnn7 since this issue is still in active discussion, wdyt about reopening it to encourage further comments and remove the triage/wont-fix label for now?

sgnn7 commented 4 years ago

@izgeri I think it's not really a "issue" as it stands and the thread is pretty polluted with various dead ends that we reached so maybe we can open up a new one that's referring to the possible directions that we reached from here? Something like a new issue (Summon can notify child process of secrets changes) with the cherry-picked acceptance criteria from this thread I think would be good. WDYT?

phisco commented 4 years ago

I agree on opening a new Issue. Just to let you know I will most probably try to file a PR this weekend regarding this.

sgnn7 commented 4 years ago

Created the new issue in https://github.com/cyberark/summon/issues/137.

cyberark / summon

Is secret Rotation/Update managed? #132

Problems

Workaround (summon-specfic)

Alternative (Secretless Broker)

Alternative #2 (code contrib)

Workaround (`summon`-specfic)