SparebankenVest / azure-key-vault-to-kubernetes

Azure Key Vault to Kubernetes (akv2k8s for short) makes it simple and secure to use Azure Key Vault secrets, keys and certificates in Kubernetes.
https://akv2k8s.io
Apache License 2.0

[BUG] old akv2k8s secrets left laying around #496

Open tspearconquest opened 1 year ago

tspearconquest commented 1 year ago

Note: Make sure to check out known issues (https://akv2k8s.io/troubleshooting/known-issues/) before submitting

Components and versions Select which component(s) the bug relates to with [X].

Describe the bug When jobs are created in a cluster running akv2k8s, a TLS secret is created in various namespaces. A service such as GitLab's runner creates temporary Pod resources with no associated ReplicaSet, Deployment, or Job. If akv2k8s environment injection is enabled on the namespace where these Pods are deployed, a secret is created for each Pod.

This can be seen with the below output I took from a test cluster with the gitlab runners:

kubectl get secret -n gitlab-runner |grep -E '^akv2k8s-runner-.*-project-[0-9]+-concurrent-[0-9][ ]*kubernetes[.]io[/]tls' |wc -l
     766

It's important to note that these temporary Pods don't use any secrets from akv2k8s, and they don't get mutated by the webhook to include the initContainer; only Pods tied to the gitlab runner deployment actually get the initContainer and use the secret.

Edit: Further inspection of our test cluster reveals that the secrets are created for any Pod or Pod-like resource created in a namespace where the `azure-key-vault-env-injection: enabled` label is present.

To Reproduce Install the gitlab runner in a namespace with akv2k8s environment injection enabled and set up the gitlab runner deployment to use a secret from akv2k8s. Run some jobs, then check the output of kubectl get secret -n <gitlab-runner-namespace>.

Expected behavior Secrets should not be created for resources (pods) that don't get mutated. Secrets for resources (pods) that do get mutated should be removed when the resource (pod) is removed from the API server.


tspearconquest commented 1 year ago

I ran into a different issue today with some old Jobs we had run a while back and the secrets owned by those Jobs. It turns out that deleting the Jobs gets rid of the associated akv2k8s secrets, because those secrets have an ownerReferences block. Checking some of these pod secrets, I noticed there is no ownerReferences block at all, so I'm wondering whether this is something that can even be fixed.

Since these pods are created by another pod and are not associated with a Deployment or ReplicaSet, I'd think akv2k8s could set ownerReferences on the secret to associate the secret with the pod itself.

tspearconquest commented 1 year ago

https://github.com/SparebankenVest/azure-key-vault-to-kubernetes/blob/45f92f64dfba2708f4ebc557e1004487c58d4bb7/cmd/azure-keyvault-secrets-webhook/auth/auth.go#L260 is where the ownerReferences are set. pod.GetOwnerReferences() would of course return nothing here, since these pods have no ownerReferences themselves.
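
To illustrate the idea, here is a minimal sketch of what a fallback could look like. This is not the actual akv2k8s code and the helper name secretOwnerRefs is made up: if the pod has owners, reuse them; otherwise point the secret at the pod itself so it gets garbage collected along with the pod.

package webhook

import (
    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// secretOwnerRefs returns the pod's own ownerReferences when it has any,
// and otherwise falls back to a single reference to the pod itself.
func secretOwnerRefs(pod *corev1.Pod) []metav1.OwnerReference {
    if refs := pod.GetOwnerReferences(); len(refs) > 0 {
        return refs
    }
    return []metav1.OwnerReference{{
        APIVersion: "v1",
        Kind:       "Pod",
        Name:       pod.GetName(),
        UID:        pod.GetUID(),
    }}
}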

tspearconquest commented 1 year ago

I'm thinking something like this should work:

  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: false
    controller: false
    kind: Pod
    name: podname
    uid: <pod uid>   # uid is required for a valid ownerReference

tspearconquest commented 1 year ago

From looking at a secret created by the gitlab runner deployment, with ownerReferences pointing at one of the auto-generated pods, it looks like we don't need blockOwnerDeletion or controller; garbage collection only needs apiVersion, kind, name and uid to clean up the secret once the pod is gone:

  ownerReferences:
  - apiVersion: v1
    kind: Pod
    name: runner-e4wgkz1j-project-42151697-concurrent-0hln2c
    uid: 87426da4-f565-4d91-971a-da13cfe91b76

tspearconquest commented 1 year ago

For reference, this appears to be where gitlab-runner sets ownerReferences on the resources it creates for a build pod: https://gitlab.com/gitlab-org/gitlab-runner/-/blob/main/executors/kubernetes/kubernetes.go#L1684

tspearconquest commented 1 year ago

I'm probably not strong enough with Go to figure out how to translate the gitlab implementation to work with AKV2K8S, but will see if I can get this working when I have some time.

tspearconquest commented 1 year ago

Fixing this issue by adding ownerReferences pointing to the pod would be a big help for us, but only solves part of the OP.

In the OP, I suggested that these secrets shouldn't be created at all if the Pod doesn't get mutated by the webhook. I'm not clear at the moment on where that might be implemented, but I would appreciate seeing that fixed as well.
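
If it helps, here is one possible shape for such a guard. This is purely a sketch: the function name podUsesEnvInjection is made up, and it assumes the webhook could detect injection by looking for the @azurekeyvault marker akv2k8s uses in env values. Secret creation would then be skipped for pods where it returns false.

package webhook

import (
    "strings"

    corev1 "k8s.io/api/core/v1"
)

// podUsesEnvInjection reports whether any container or init container in the
// pod has an env value carrying the @azurekeyvault marker, i.e. whether the
// webhook would actually mutate this pod.
func podUsesEnvInjection(pod *corev1.Pod) bool {
    all := make([]corev1.Container, 0, len(pod.Spec.InitContainers)+len(pod.Spec.Containers))
    all = append(all, pod.Spec.InitContainers...)
    all = append(all, pod.Spec.Containers...)
    for _, c := range all {
        for _, env := range c.Env {
            if strings.Contains(env.Value, "@azurekeyvault") {
                return true
            }
        }
    }
    return false
}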

tspearconquest commented 1 year ago

After separating our gitlab job pods out into their own namespace, which does not carry the azure-key-vault-env-injection: enabled label, I found this is no longer an issue for us, because the webhook skips that namespace and those pods don't need injection anyway.

It would be nice to see this fixed, but it's no longer a high priority for us.