kubernetes-sigs / secrets-store-csi-driver

Secrets Store CSI driver for Kubernetes secrets - Integrates secrets stores with Kubernetes via a CSI volume.
https://secrets-store-csi-driver.sigs.k8s.io/
Apache License 2.0
1.21k stars · 280 forks

Don't rely on pod creation to mount to Kubernetes secrets #298

Open djj0809 opened 3 years ago

djj0809 commented 3 years ago

Describe the solution you'd like

Currently, when you choose to sync to Kubernetes secrets, pulling secrets/certs/keys from Key Vault always happens at pod creation, and you also need to mount the SecretProviderClass volume in a specific pod for the sync to work.

If we have issues with Key Vault access (managed identity not working, things deleted accidentally in Key Vault, or the Key Vault API being unavailable), things start to fail when pods get restarted or recreated. It's even worse for cronjobs: every new pod created under the cronjob tries to retrieve secrets/certs/keys from Key Vault, so if Key Vault is down you can no longer create ANY new pod in production.

I hope this library can simply sync secrets/certs/keys to Kubernetes secrets at deploy time, without relying on any pod to mount them, so I don't have to worry about Key Vault availability after deployment.
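For concreteness, here is a minimal sketch of the coupling described above, assuming the Azure Key Vault provider (all names are hypothetical): the secretObjects mirror is only created once some pod actually mounts the CSI volume.

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: my-kv-secrets                # hypothetical name
spec:
  provider: azure                    # assumption: Azure Key Vault provider
  secretObjects:                     # mirror mounted content as a K8s Secret
    - secretName: app-secret
      type: Opaque
      data:
        - objectName: db-password
          key: password
  parameters:                        # other provider parameters (tenantId, ...) omitted
    keyvaultName: my-vault           # hypothetical vault name
    objects: |
      array:
        - |
          objectName: db-password
          objectType: secret
---
# The app-secret Secret above only materializes when a pod mounts the volume:
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: registry.example.com/app:latest
      volumeMounts:
        - name: secrets              # mount required solely to trigger the sync
          mountPath: /mnt/secrets
          readOnly: true
  volumes:
    - name: secrets
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: my-kv-secrets
```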


aramase commented 3 years ago

@djj0809 Thank you for opening the issue! The primary scope of this project is to mount external secrets-store contents in the pod using the Kubernetes CSI standard. Syncing as a K8s secret is an optional feature that mirrors those contents as Kubernetes secrets in addition to the mount. I think skipping the pod mount is a diversion from the actual scope of this driver implementation.

tdihp commented 3 years ago

@aramase one potential application of such a use case is configuring an ingress resource. Any thoughts on that?

aramase commented 3 years ago

> @aramase one potential application of such a use case is configuring an ingress resource. Any thoughts on that?

@tdihp That makes sense. Consumption by an ingress resource was one of the driving factors for optionally supporting mirroring of the mounted content as Kubernetes Secrets. However, if we skip the mount completely and only create the Kubernetes Secret, that would be a diversion from the actual scope of the driver project. The CSI driver uses the CSI implementation as its core to mount secret contents for safe consumption within pods.

jemag commented 3 years ago

I think this is a reasonable suggestion, given that the main use case for the secrets-store-csi-driver is integration with some kind of vault.

There are a lot of resources that rely on native Kubernetes secrets. Say I have 100+ resources depending on a native Kubernetes secret. Right now, that means I need to create a SecretProviderClass and then add relatively useless volumeMounts and volumes to those 100+ resources (useless in the sense that the resources will not use the mount itself, only the reference to the Kubernetes secret). If I skip that huge amount of boilerplate for some resources, there is a risk that no resource has mounted the volume yet and the secret itself is unavailable.

If that feature were implemented, I would only need to create the SecretProviderClass and reference the Kubernetes secret in the resources that need it. That would mean a lot less overhead and cleaner YAML.
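For contrast, a sketch of what each consumer would look like if the Secret existed independently of any mount (reusing the hypothetical app-secret name from the example above):

```yaml
# Each of the 100+ resources would carry only the native reference,
# with no CSI volume or volumeMount boilerplate:
apiVersion: v1
kind: Pod
metadata:
  name: consumer
spec:
  containers:
    - name: app
      image: registry.example.com/app:latest
      envFrom:
        - secretRef:
            name: app-secret   # synced by the SecretProviderClass; no mount needed
```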

I think this could be a big improvement for all the vault integrations that depend on this project, unless there is already some way around this with the current setup that I missed.

ghost commented 3 years ago

I agree with jemag on this -- not being able to sync secrets from a vault into a Kubernetes secret without a pod mount effectively means managing secrets in two ways for most systems, because pretty much every Helm chart and third-party component manages its pods via an Operator or other mechanism, and you can't control how they access secrets. They just expect the secrets to be present.

That leaves administrators choosing between not using a vault implementation, managing secrets two ways, or using a hacky process of periodically priming the system with a pod to get things synced as a secret. We settled on the latter for now, but it's definitely tricky to manage -- the CSI configuration and that pod configuration have to be kept in sync, and if someone misses something, it's not always obvious.

Given that the CSI mechanism wakes up when a SecretProviderClass is created or updated, and it knows which secrets are requested to be synced, it seems like the best place to create a transient pod to trigger the synchronization.
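A sketch of the priming workaround described above, assuming one possible shape for it (names are hypothetical): a placeholder Deployment whose only purpose is to hold the CSI mount so the synced Secret stays alive.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secret-primer                # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: secret-primer
  template:
    metadata:
      labels:
        app: secret-primer
    spec:
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9   # does nothing; exists only to hold the mount
          volumeMounts:
            - name: secrets
              mountPath: /mnt/secrets
              readOnly: true
      volumes:
        - name: secrets
          csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: my-kv-secrets   # must be kept in sync with the SPC by hand
```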

fejta-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

aramase commented 3 years ago

/remove-lifecycle stale

sbose78 commented 3 years ago

+1 on this, will provide a detailed use case shortly.

cbroglie commented 3 years ago

Another use case is image pull secrets: there is a chicken-and-egg problem when creating the secret needed to pull the pod's image requires the pod to be running.
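A sketch of that circular dependency (names hypothetical): the pod cannot start until its image is pulled, but the pull secret would only be synced once the pod's CSI volume is mounted.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: private-app
spec:
  imagePullSecrets:
    - name: registry-creds                     # supposed to be synced from the vault...
  containers:
    - name: app
      image: private.example.com/app:latest    # ...but the image can't be pulled until
      volumeMounts:                            # this mount happens, which requires the
        - name: secrets                        # pod to start: a deadlock
          mountPath: /mnt/secrets
          readOnly: true
  volumes:
    - name: secrets
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: registry-creds-spc   # hypothetical SecretProviderClass
```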

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

aramase commented 2 years ago

/remove-lifecycle stale

ddskit commented 2 years ago

Another use case: we are trying to create a PV and PVC using the Azure flexVolume blobfuse driver, which needs a secret (type: azure/blobfuse) to be available in the cluster to access the storage account. This secret needs to be created by picking up the storage account access key from Key Vault, but there is no need to mount it in any pod.

Note: there is a separate issue that the secrets-store-csi-driver does not support creating secrets of the azure/blobfuse type.
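For illustration, a sketch of the shape being described, assuming the blobfuse flexVolume driver's secretRef convention (all field values are hypothetical):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: blobfuse-pv                  # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  flexVolume:
    driver: azure/blobfuse
    secretRef:
      name: blobfuse-creds           # must exist in the cluster before any pod runs,
                                     # so a pod-triggered sync cannot create it
    options:
      container: mycontainer         # hypothetical storage container
```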

aaaaahaaaaa commented 2 years ago

Another piece of quick feedback: I also find the need to start a pod in order for a secret to be created very cumbersome, and I would even say counterintuitive when looking at this project for the first time, although I understand the initial scope of the driver was different.

As a user, I was hoping for a solution that easily bridges GCP Secret Manager and Kubernetes secrets without affecting my existing workload definitions.

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

nilekhc commented 2 years ago

/remove-lifecycle stale

nerddtvg commented 2 years ago

I opened a related issue on the Azure provider driver. We're not seeing the behavior where cronjob pods check for new values before each run: they're still using the cached values, because the secrets still exist (the completed pods have not been cleaned up yet).

But because the pods are completed/succeeded, the rotation interval does not update the values. This means jobs eventually start failing because the secret values are wrong.

I think the lines linked below need to be changed to rotate secrets for completed or failed jobs. Secrets can update between runs while the pods still exist, and new mounts will use the new values; likewise, jobs/pods can fail because of stale secret values.

Maybe changing this behavior will also solve the initial ask of this issue. Secrets still wouldn't be created on definition, only on pod creation, but they would be kept up to date for future runs. Then, if vault access fails, the most recent secret should still be good to go.

https://github.com/kubernetes-sigs/secrets-store-csi-driver/blob/0f4ff7d43e591dbc7d3b87ed1d7602a5f15f1059/pkg/rotation/reconciler.go#L267-L270

I'm tagging @aramase because he made the initial commit to exclude those from rotation.

nerddtvg commented 2 years ago

@djj0809: Given the research above, is it possible you're pruning your completed/failed pods? When those are removed, the secrets are removed, prompting a refresh on the next run. Check the successfulJobsHistoryLimit and failedJobsHistoryLimit values.
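For reference, a sketch of where those knobs live on a CronJob (names hypothetical; whether pruning actually forces a refresh depends on the rotation behavior discussed above):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-job                       # hypothetical name
spec:
  schedule: "*/15 * * * *"
  successfulJobsHistoryLimit: 0      # prune completed pods so their mounts
  failedJobsHistoryLimit: 1          # don't pin stale secret versions
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: job
              image: registry.example.com/job:latest
```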

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

aramase commented 2 years ago

/remove-lifecycle stale

Shaked commented 2 years ago

Any update regarding this issue?

leelakrishnachava commented 2 years ago

We are also looking for a solution to this issue, especially agreeing with @jemag.

r0bj commented 1 year ago

I think that for some of the use cases described in the comments, External Secrets Operator (https://external-secrets.io) could be a reasonable alternative. AFAIK it can simply sync an external secret to a native k8s secret.
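A sketch of the External Secrets Operator equivalent (v1beta1 API; names are hypothetical): the operator creates and refreshes the native Secret itself, with no pod involved.

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-secret
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: my-vault-store             # a SecretStore configured for the external vault
    kind: SecretStore
  target:
    name: app-secret                 # the native K8s Secret to create; no pod required
  data:
    - secretKey: password
      remoteRef:
        key: db-password             # hypothetical key in the external store
```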

gobins commented 1 year ago

Although External Secrets Operator can solve this particular problem, it would be nice if we could avoid having two controllers solving similar problems.

asitha99x commented 1 year ago

Another alternative, which I believe most of you have seen: https://akv2k8s.io/ (https://github.com/SparebankenVest/azure-key-vault-to-kubernetes).

amccague commented 1 year ago

There's another use case in Knative, and likely in other similar tools that do not support custom volume types: https://github.com/knative/serving/issues/11069. Syncing the secret only, then referencing it 'natively', would enable any such use case.

amccague commented 1 year ago

Are there any potential issues with using a gcr.io/google_containers/pause pod to define the mount and then having other pods reference the secret?

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

pitinga commented 1 year ago

/remove-lifecycle stale

aramase commented 1 year ago

/lifecycle frozen

Marking this as frozen so it doesn't go stale. We're exploring the option of decoupling the sync-as-K8s-secret feature from the CSI driver mount.

Shuanglu commented 8 months ago

@aramase may I know if there is any progress regarding this? Thanks🙇

==================

Found some updates in the meeting notes. If there is an ETA for the feature, that would be great.

DugeraProve commented 8 months ago

+1 for an update on this @aramase. This feature would be very important to us in our use of the CSI driver. Thanks

DavidLeonardoGutierrez commented 7 months ago

+1 for this feature; there seem to be enough scenarios and needs to support the implementation. It would definitely be a nice improvement for this library.

le0nard01 commented 6 months ago

+1 for this feature.

Userj7 commented 6 months ago

+1 for this feature

mischavandenburg commented 6 months ago

+1 for this feature

timonmasberg commented 5 months ago

+1

Dazing commented 5 months ago

+1

bchkhaidze commented 5 months ago

+1

CrisNevares commented 5 months ago

+1

akakira commented 4 months ago

+1000%

marcmognol commented 4 months ago

+1 !

pipiobjo commented 4 months ago

+1

NarenderUppala commented 4 months ago

+1

dhade33 commented 3 months ago

+1

david1188 commented 3 months ago

+1

mluker commented 3 months ago

+1

ennnas commented 3 months ago

+1

BlauerPulli commented 2 months ago

+1

stzov commented 2 months ago

@aramase could there be a compromise like adding a switch to allow secrets not to be deleted after the pod is removed?

I mean that currently, when you create a Pod, you have the option to create a Secret via the secretObjects configuration, but when that Pod is deleted the Secret is deleted as well. A switch to not delete created Secrets would, I believe, be a good compromise that lets anybody with a divergent use case be happy.

Other projects provide such an option. For example, cert-manager offers a controller-level configuration that either deletes a Secret when its Certificate is deleted or leaves the Secret behind as an orphan.

Would that work/be in scope for your driver?
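For reference, the cert-manager option mentioned above is a real controller-level flag; a trimmed Helm values sketch of where it is set (everything else in the values file is omitted):

```yaml
# values.yaml fragment for the cert-manager chart:
extraArgs:
  - --enable-certificate-owner-ref=true   # delete the Secret together with its
                                          # Certificate; the default (false) leaves
                                          # the Secret behind as an orphan
```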

tw-sematell commented 1 month ago

bump