loft-sh / jspolicy

jsPolicy - Easier & Faster Kubernetes Policies using JavaScript or TypeScript
https://www.jspolicy.com
Apache License 2.0
353 stars 35 forks source link

[BUG] Defining a `namespaceSelector` in a `jspolicy` object causes an infinite create/destroy loop for the resulting `validatingwebhookconfiguration` #96

Closed spkane closed 1 year ago

spkane commented 1 year ago

This was a tough one...and appears to be a significant bug.

For context, I am deploying jspolicy via ArgoCD (using kustomize's Helm support), and there was some troubleshooting for this issue in the loft.sh Slack.

The conversation can be found here:

We are using the newest v0.2.2 helm chart.

The Issue

Adding a namespaceSelector to a kind: JsPolicy object causes an infinite create/destroy loop for the resulting validatingwebhookconfiguration.

Reproduction Use Case

Create a kind: JsPolicy manifest that includes a namespaceSelector, like so:

apiVersion: policy.jspolicy.com/v1beta1
kind: JsPolicy
metadata:
  name: "deny-seccompprofile"
spec:
  namespaceSelector:
    matchExpressions:
    - key: jspolicy-skip-webhooks
      operator: DoesNotExist
  operations: ["CREATE", "UPDATE"]
  resources: ["pods"]
  javascript: |
    const securityContext = request.object.spec.securityContext || [];

    const allowed_namespaces = [
      "jspolicy",
      "kube-node-lease",
      "kube-public",
      "kube-system"
    ]

    if (!allowed_namespaces.includes(request.namespace)) {
      if (securityContext.seccompProfile) {
        deny("spec.securityContext.seccompProfile not allowed")
      }
    } else {
      if (securityContext.seccompProfile) {
        warn("spec.securityContext.seccompProfile is being utilized")
      }
    }

This should create a loop where running something like kubectl get validatingwebhookconfiguration --watch will result in an endless scroll like this:

deny-seccompprofile-haruh             1          1s
deny-seccompprofile-haruh             1          1s
deny-seccompprofile-haruh             1          1s
deny-seccompprofile-haruh             1          1s
deny-seccompprofile-haruh             1          0s
deny-seccompprofile-haruh             1          0s
deny-seccompprofile-haruh             1          0s

The Workaround

After much investigation, the current workaround is to add an item to the matchExpressions map.

    matchExpressions:
    - key:      control-plane
      operator: DoesNotExist

So, the whole kind: JsPolicy manifest looks like this:

apiVersion: policy.jspolicy.com/v1beta1
kind: JsPolicy
metadata:
  name: "deny-seccompprofile"
spec:
  namespaceSelector:
    matchExpressions:
    - key: jspolicy-skip-webhooks
      operator: DoesNotExist
    - key:      control-plane
      operator: DoesNotExist
  operations: ["CREATE", "UPDATE"]
  resources: ["pods"]
  javascript: |
    const securityContext = request.object.spec.securityContext || [];

    const allowed_namespaces = [
      "jspolicy",
      "kube-node-lease",
      "kube-public",
      "kube-system"
    ]

    if (!allowed_namespaces.includes(request.namespace)) {
      if (securityContext.seccompProfile) {
        deny("spec.securityContext.seccompProfile not allowed")
      }
    } else {
      if (securityContext.seccompProfile) {
        warn("spec.securityContext.seccompProfile is being utilized")
      }
    }

Summary

It appears like the controller (jspolicy pod) is fighting with the kind: JsPolicy object. The jspolicy object defines a single namespaceSelector, yet, the controller wants to add an additional one, which appears to create this infinite loop, unless the kind: JsPolicy object also defines the additional namespaceSelector that the operator is expecting.

benmoss commented 1 year ago

I think the extra wrinkle might be that it's an Azure controller that is adding the control-plane matchExpression, not the jspolicy controller. Seems like this issue on cert-manager is more or less about the same problem: https://github.com/cert-manager/cert-manager/issues/4114

spkane commented 1 year ago

The fix here might simply be to clearly document that this is a known issue on Azure and provide people with the workaround of adding the Azure namespaceSelector in addition to any others that they may want to be defined.