aws / eks-charts

Amazon EKS Helm chart repository
Apache License 2.0

[aws-sigv4-proxy-admission-controller] Admission controller fails to recover if no pods are available to service requests #1088

Open paulbraham-ds opened 2 months ago

paulbraham-ds commented 2 months ago

Describe the bug

By default, the aws-sigv4-proxy-admission-controller mutating webhook applies to all namespaces, including the namespace in which the webhook deployment itself runs. As a result, if all of the webhook pods fail, replacement pods cannot be started, because nothing is available to service the webhook requests. This ultimately prevents any new pods from being created, and therefore scheduled, on the cluster.

Steps to reproduce

Expected outcome

Failure of all pods in the aws-sigv4-proxy-admission-controller deployment should be recoverable. If the controller's own namespace is excluded from the webhook, the deployment recovers as soon as its pods can be rescheduled.
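For illustration, the exclusion could look roughly like the sketch below, which uses the standard `namespaceSelector` field of a `MutatingWebhookConfiguration` together with the `kubernetes.io/metadata.name` label that Kubernetes sets automatically on every namespace (1.21 and later). All names, the namespace, and the path are hypothetical placeholders rather than what the chart actually renders, and I have not checked whether the chart exposes a value for this.

```yaml
# Sketch only: every name, namespace, and path below is a hypothetical placeholder.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: example-sigv4-proxy-webhook            # hypothetical
webhooks:
  - name: example.sigv4-proxy.local            # hypothetical webhook name
    admissionReviewVersions: ["v1"]
    sideEffects: None
    clientConfig:
      # caBundle omitted for brevity; a real configuration needs one
      service:
        name: example-webhook-service          # hypothetical
        namespace: example-controller-namespace
        path: /mutate                          # hypothetical
    rules:
      - operations: ["CREATE"]
        apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
    # Skip the webhook for pods created in the controller's own namespace,
    # so the webhook pods can be recreated even when none are running.
    namespaceSelector:
      matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values:
            - example-controller-namespace
```

With a selector like this, pod creation in the excluded namespace never calls the webhook, so the controller can come back up without depending on itself.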

Environment

Additional Context:

paulbraham-ds commented 1 month ago

Anyone able to review this?