ricoberger / vault-secrets-operator

Create Kubernetes secrets from Vault for a secure GitOps based workflow.
MIT License

error: leader election lost #244

Open jascsch opened 8 months ago

jascsch commented 8 months ago

The vault-secrets-operator container is frequently restarting with the following error messages:

{"level":"error","ts":"2024-01-15T09:46:26Z","logger":"setup","msg":"problem running manager","error":"leader election lost","stacktrace":"main.main\n\t/workspace/main.go:135\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:267"}
E0115 09:46:26.613160 1 leaderelection.go:332] error retrieving resource lock vault-secrets-operator/vaultsecretsoperator.ricoberger.de: Get "https://192.168.64.1:443/apis/coordination.k8s.io/v1/namespaces/vault-secrets-operator/leases/vaultsecretsoperator.ricoberger.de": context deadline exceeded
{"level":"error","ts":"2024-01-15T08:36:58Z","logger":"setup","msg":"problem running manager","error":"leader election lost","stacktrace":"main.main\n\t/workspace/main.go:135\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:267"}
E0115 08:36:58.095320 1 leaderelection.go:332] error retrieving resource lock vault-secrets-operator/vaultsecretsoperator.ricoberger.de: Get "https://192.168.64.1:443/apis/coordination.k8s.io/v1/namespaces/vault-secrets-operator/leases/vaultsecretsoperator.ricoberger.de": context deadline exceeded

Can you please check and advise how to fix this issue?

ricoberger commented 8 months ago

Hi @jascsch, most of the time this indicates a problem with your Kubernetes API server. There is nothing special about how leader election is handled within the VaultSecrets operator, so there is nothing we can really do here.

We had the same issues with our old Kubernetes provider and decided to run the operator with 1 replica, since it was acceptable for us if the operator is unavailable for a short period of time. Maybe this is also a solution for you.

jascsch commented 8 months ago

Hi @ricoberger, thanks for the quick feedback. There is nothing we can do about the Kubernetes API, which is fully managed. We already use 1 replica and the error still occurs. Is there any way to disable leader election? It should not be needed if only one replica is running.

jascsch commented 8 months ago

Is there a way to add proxy environment variables? This would be needed for corporate proxy servers if the vault operator communicates with an external Vault service.

ricoberger commented 8 months ago

Hi, we are using the following values in the Helm chart:

deploymentStrategy:
  type: Recreate

args:
  - -leader-elect=false
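
For readers wondering why the argument is written as `-leader-elect=false`, here is a minimal sketch of how a boolean flag of that name could be parsed. It assumes the operator reads the flag with Go's standard `flag` package, which the thread does not confirm. The helper `parseLeaderElect`, the flag set name `manager`, and the default value of `true` are all hypothetical and chosen only for illustration.

```go
package main

import (
	"flag"
	"fmt"
)

// parseLeaderElect is a hypothetical helper (not taken from the operator's
// source) that parses a boolean "leader-elect" flag from the given arguments.
// defaultValue is an assumption; the operator's real default is not stated
// in the thread.
func parseLeaderElect(args []string, defaultValue bool) (bool, error) {
	fs := flag.NewFlagSet("manager", flag.ContinueOnError)
	enabled := fs.Bool("leader-elect", defaultValue,
		"enable leader election for the controller manager")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *enabled, nil
}

func main() {
	// Same argument as in the Helm values above.
	enabled, err := parseLeaderElect([]string{"-leader-elect=false"}, true)
	if err != nil {
		panic(err)
	}
	fmt.Println("leader election enabled:", enabled)
}
```

With Go's `flag` package, a boolean flag can only be set to false using the `=` form. A space-separated `-leader-elect false` would leave the flag at true and stop parsing at `false`, treating it as a positional argument. The `Recreate` strategy complements this setting: the old pod is terminated before the new one starts, so two instances do not run at the same time during a rollout, even without leader election.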