sse-secure-systems / connaisseur

An admission controller that integrates Container Image Signature Verification into a Kubernetes cluster
https://sse-secure-systems.github.io/connaisseur/
Apache License 2.0

Using Sigstore / Cosign validation with 'auth.k8sKeychain.true' for Connaisseur from application version 3.6.0 and chart version 2.6.0 is broken #1766

Open edison-vflow opened 5 hours ago

edison-vflow commented 5 hours ago

Describe the bug

  • Validators that use the auth.k8sKeychain authentication mechanism no longer work. When you set your validator to use auth.k8sKeychain.true, the pods throw an error on startup:
Starting Connaisseur.
Loading config from /app/config/config.yaml
Error loading config: error parsing file: neither secretName nor useKeychain defined

We were on Connaisseur application version 3.0.0 and chart version 2.0.0. When we upgraded to application version 3.6.0 and chart version 2.6.0, we hit an issue with a misconfigured secret. We reported it, and it was resolved in issue https://github.com/sse-secure-systems/connaisseur/issues/1734 via PR https://github.com/sse-secure-systems/connaisseur/pull/1735

Even after this fix, we got the error highlighted above. We tried several things, but nothing documented seemed to work.

So we stopped trying to use auth.k8sKeychain.true, which we had been using all along in the lower versions, i.e. Connaisseur application version 3.0.0 and chart version 2.0.0.

We started trying to use auth.secretName, and that's where we encountered the two issues raised here:

  1. https://github.com/sse-secure-systems/connaisseur/issues/1764 (does not really impact functionality, but we want to know how to resolve the errors in the redis pod logs)
  2. https://github.com/sse-secure-systems/connaisseur/issues/1765 (critical issue and a show stopper, as the cluster stops working properly when new or updated images can't be deployed)

When we had the issue with auth.secretName, we evaluated various approaches to work around the fact that the Connaisseur deployment needs a restart every time the ECR token expires. During the evaluation, we saw that we would need hooks into the Connaisseur deployment in order to carry this out.

We then revisited the auth.k8sKeychain authentication mechanism, hoping for a better outcome this time on Connaisseur application version 3.6.1 and chart version 2.6.1, knowing that it used to work in application version 3.0.0 and chart version 2.0.0. But no matter what we did while following the documentation, it no longer worked.

This time we decided to go all the way and read the code. We went through the latest master codebase, specifically the Go file auth.go, and found what is preventing the latest versions from using auth.k8sKeychain.

At some point, the code changed from auth.k8sKeychain to auth.useKeychain. So now, instead of setting auth.k8sKeychain.true, we need to set auth.useKeychain.true.

So there was one earlier bug in the code, where the secret used by the keychain mechanism could not be created; that was fixed in the PR mentioned above. After that fix, the keychain mechanism was still broken. This time the issue is not in the code but in our documentation: the latest documentation was never updated to reflect that the code no longer uses auth.k8sKeychain but uses auth.useKeychain.
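For illustration, here is a minimal sketch of what the adjusted validator could look like. It assumes that renaming the key is the only change required: the `useKeychain` name comes from the startup error message and from our reading of auth.go, while every other field is carried over unchanged from our existing configuration. We have not checked this against an updated documentation page.

```yaml
application:
  validators:
    - name: awsvalidator
      type: cosign
      auth:
        # Assumption: direct replacement for the former `k8sKeychain: true`,
        # based on the error message and on auth.go, not on the docs.
        useKeychain: true
      trustRoots:
        - name: ecr-cosign
          # Placeholder substituted by our tooling; not a literal value.
          key: ${container_verification_kms_arn}
```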

Expected behavior

Optional: To reproduce

To reproduce, install Connaisseur application version 3.6.1 and chart version 2.6.1 on AWS EKS v1.30. Configure your validators section as shown below.

```
application:
  validators:
    - name: awsvalidator
      type: cosign
      auth:
        k8sKeychain: true
      trustRoots:
        - name: ecr-cosign
          key: ${container_verification_kms_arn}
    - name: allow
      type: static
      approve: true
    - name: deny
      type: static
      approve: false
```

Observe that you get errors complaining that

```
Error loading config: error parsing file: neither secretName nor useKeychain defined
```

This makes sense because, from reading the code, `k8sKeychain` was changed to `useKeychain`.

We will need to find out when the switch from `auth.k8sKeychain` to `auth.useKeychain` was done, as all those versions are affected.

🚨 This looks like a critical omission in the documentation to synchronize the latest code implementation with the latest docs.

🙏 Could we please resolve (update documentation) as a matter of highest priority, as this means everyone on the latest editions and trying to upgrade now has this capability broken!

**Optional: Versions (please complete the following information as relevant):**

- OS: Amazon Linux
- Kubernetes Cluster: EKS 1.30
- Notary Server:
- Container registry: containerd
- Connaisseur: chart 2.6.1 application 3.6.1
- Other:

**Optional: Additional context**

edison-vflow commented 5 hours ago

cc @phbelitz and @chrysogonus