Open jordojordo opened 1 year ago
Thanks Jordon for documenting this!
Yes, indeed this is a problem.
A policy could be incorrectly created if it has incorrect Rules (for example, it doesn't target anything) or incorrect Settings (its configuration fails to parse).
Checking for incorrect rules can be done on creation or edit of ClusterAdmissionPolicies and AdmissionPolicies. This happens in kubewarden-controller, and is tackled by https://github.com/kubewarden/kubewarden-controller/issues/361.
Checking for incorrect settings means trying to instantiate the policy in the policy-server and running its validate_settings callback. With the current policy-server, this means that the policy-server will error and stop instantiating the rest of the policies assigned to it, cascading the error upwards. This is intentional, as we want to fail closed so that no failing policy is silently bypassed. I'm not sure we should change this default.
Note also that if policies were already correctly instantiated prior to failing, the previous version of the policy-server deployment will still be active and the cluster will be secure.
I think it would be cool to have a behavior similar to that of the Kubernetes scheduler:
When a policy is in a crash state, the policy server will not host it, and the "low level" (Validating|Mutating)WebhookConfiguration should be removed to prevent a denial of service inside the cluster.
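To make the two options concrete, here is a minimal sketch contrasting the fail-closed startup described earlier in the thread with the "skip the crashing policy" behavior proposed above. It is not the policy-server's actual code: `Policy`, `PolicyLoadError`, `load_fail_closed`, `load_skip_failing` and the `require_signatures` validator are hypothetical names invented for illustration, and the settings check is a stand-in for a real validate_settings callback.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Policy:
    name: str
    settings: dict
    # Hypothetical stand-in for the policy's validate_settings callback:
    # returns an error message, or None when the settings are acceptable.
    validate_settings: Callable[[dict], Optional[str]]


class PolicyLoadError(Exception):
    """Raised when a policy cannot be instantiated (hypothetical)."""


def load_fail_closed(policies: List[Policy]) -> List[str]:
    # Fail closed: the first invalid policy aborts the whole startup,
    # so none of the remaining policies are instantiated.
    loaded = []
    for policy in policies:
        error = policy.validate_settings(policy.settings)
        if error is not None:
            raise PolicyLoadError(f"{policy.name}: {error}")
        loaded.append(policy.name)
    return loaded


def load_skip_failing(policies: List[Policy]) -> Tuple[List[str], Dict[str, str]]:
    # Proposed alternative: host every valid policy and report the invalid
    # ones, so their webhook configurations can be withheld or removed.
    loaded, failed = [], {}
    for policy in policies:
        error = policy.validate_settings(policy.settings)
        if error is None:
            loaded.append(policy.name)
        else:
            failed[policy.name] = error
    return loaded, failed


def require_signatures(settings: dict) -> Optional[str]:
    # Assumed rule for illustration only: settings must not be empty.
    return None if settings.get("signatures") else "no settings provided"


def accept_any(settings: dict) -> Optional[str]:
    return None


policies = [
    Policy("bad-policy", {}, require_signatures),
    Policy("allow-privilege-escalation-psp", {}, accept_any),
]
```

With these two policies, `load_fail_closed(policies)` raises on `bad-policy` before the second policy is considered, which mirrors the crash loop reported in this issue, while `load_skip_failing(policies)` returns the valid policy as loaded and reports `bad-policy` separately. The trade-off is the one raised above: skipping means a failing policy no longer blocks the others, but it is also no longer enforced.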
Sounds like something worth an RFC
Is there an existing issue for this?
Current Behavior
If a policy has been created with incorrect settings/rules, this policy will stay in a pending status, which is normal behavior. If you attempt to create another policy with acceptable settings that targets the same Policy Server as the original, it will never become active, as the policy server deployment is stuck in a crash loop.
Expected Behavior
I would expect to see the new policy with acceptable settings to become active while the original policy with incorrect settings would remain pending.
Steps To Reproduce
Create a verify-image-signatures policy with a rule set targeting addons for the resources, and provide no settings
Environment
Anything else?
Policy Server logs and both related policies (note that the bad-policy policy was created before the allow-privilege-escalation-psp policy)
Policy Server deployment with related pods
Policy Server pod which is failing