Open fischerman opened 4 years ago
Another approach would be to use object selectors. The failure policy could be set to `Fail`, but it would only prevent pods which need the agent from being created during a failure. We could even differentiate between optional and non-optional injections.
This would require K8s 1.15 (maybe as an opt-in switch) and at least one label (instead of an annotation). On the upside, no code changes are required.
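
To make the idea a bit more concrete, here is a rough sketch of how two webhook entries with object selectors might behave during an injector outage. The label key `vault-agent-injection`, its values `required` and `optional`, and the webhook names are made up for illustration. Fields a real `MutatingWebhookConfiguration` entry needs (`clientConfig`, `rules`, and so on) are left out, and `selector_matches` only emulates `matchLabels`.

```python
# Hypothetical label key; the real name would be up to the injector project.
LABEL_KEY = "vault-agent-injection"


def webhook_entry(failure_policy, label_value):
    """Build a partial entry for the `webhooks` list (clientConfig, rules omitted)."""
    return {
        "name": f"{label_value}.vault-injector.example.com",  # hypothetical name
        "failurePolicy": failure_policy,
        "objectSelector": {"matchLabels": {LABEL_KEY: label_value}},
    }


def selector_matches(entry, pod_labels):
    """Emulate matchLabels: every key/value pair must be present on the pod."""
    wanted = entry["objectSelector"]["matchLabels"]
    return all(pod_labels.get(key) == value for key, value in wanted.items())


# Non-optional injections fail closed, optional ones fail open.
WEBHOOKS = [
    webhook_entry("Fail", "required"),
    webhook_entry("Ignore", "optional"),
]


def blocked_during_outage(pod_labels):
    """A pod is rejected during an outage only if a `Fail` webhook selects it."""
    return any(
        entry["failurePolicy"] == "Fail" and selector_matches(entry, pod_labels)
        for entry in WEBHOOKS
    )
```

With this split, `blocked_during_outage({"vault-agent-injection": "required"})` is true, while pods without the label (or labelled `optional`) would still be admitted while the injector is down.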
The injector is important for Pods with `vault.hashicorp.com/...` annotations. To make the injector a less critical component for the cluster, the `FailurePolicy` for the webhook should be set to `Ignore` (which is the case in the Helm deployment).

If the injector is unavailable, pods which need the agent will still be created but will probably fail to run properly. Liveness and readiness probes will not help in this case, because they do not recreate pods. Without looking closely at the resulting pod spec, the only indication of the cause is a log line from the api-server. Metrics are only available for webhooks with a `FailurePolicy` set to `Fail`.

This issue is to discuss approaches to monitor and/or reconcile unavailability of the webhook. Here is one approach:
- Find pods which have `vault.hashicorp.com/agent-inject: "true"` but are missing `vault.hashicorp.com/agent-inject-status: injected`, and expose a metric for them (e.g. to trigger alerts)

I'm sure there are many other approaches. Would be interested to hear them!
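
For what it's worth, the annotation check described above could look roughly like this. It is only a sketch: pods are plain dicts shaped like the Kubernetes API's JSON, the function names are hypothetical, and only the exact values quoted in this issue (`"true"` and `injected`) are matched. In a real controller the pod list would come from the Kubernetes API and the count would be exported as a gauge.

```python
INJECT_ANNOTATION = "vault.hashicorp.com/agent-inject"
STATUS_ANNOTATION = "vault.hashicorp.com/agent-inject-status"


def missed_injection(pod):
    """Return True if the pod asked for injection but was never injected."""
    annotations = (pod.get("metadata") or {}).get("annotations") or {}
    # Assumption: only the exact values quoted in the issue are recognised.
    requested = annotations.get(INJECT_ANNOTATION) == "true"
    injected = annotations.get(STATUS_ANNOTATION) == "injected"
    return requested and not injected


def count_missed(pods):
    """Number of pods an alert could fire on (the value of the metric)."""
    return sum(1 for pod in pods if missed_injection(pod))
```

An alert on `count_missed(...) > 0` would then point directly at pods that were created while the webhook was unavailable.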