alauda / captain

A Helm 3 Controller
Apache License 2.0
185 stars 46 forks

Have to periodically restart captain pod to refresh Webhook TLS #62

Open lewismarshall opened 4 years ago

lewismarshall commented 4 years ago

This issue seems related to #39, but restarting the captain pod fixes it temporarily:

2020/05/06 09:37:27 http: TLS handshake error from 10.154.0.39:42920: remote error: tls: bad certificate
2020/05/06 09:37:27 http: TLS handshake error from 10.154.0.39:42922: remote error: tls: bad certificate
E0506 09:37:27.292953       1 controller.go:275] error syncing 'default/nginx-ingress': Internal error occurred: failed calling webhook "mutate-helmrequest.app.alauda.io": Post https://captain-webhook.captain-system.svc:443/mutate?timeout=30s: x509: certificate signed by unknown authority, requeuing
2020-05-06T09:37:27.293Z    DEBUG   controller-runtime.manager.events   Warning {"object": {"kind":"HelmRequest","namespace":"default","name":"nginx-ingress","uid":"913adf6e-3f7c-4224-9cd6-58b442798a46","apiVersion":"app.alauda.io/v1alpha1","resourceVersion":"485785"}, "reason": "FailedDelete", "message": "Delete HelmRequest nginx-ingress error : Internal error occurred: failed calling webhook \"mutate-helmrequest.app.alauda.io\": Post https://captain-webhook.captain-system.svc:443/mutate?timeout=30s: x509: certificate signed by unknown authority"}

Deployed from manifest https://github.com/alauda/captain/blob/v1.0.1/artifacts/all/deploy.yaml after editing the captain image to use v1.0.1.
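
The `x509: certificate signed by unknown authority` message in the log above means the API server could not chain the webhook's serving certificate to any CA it was told to trust for that webhook (the `clientConfig.caBundle` field of the webhook configuration). A minimal Go sketch for checking this offline is shown here. It is not part of captain; `verifyAgainstBundle` is a hypothetical helper name, and it assumes you have already extracted the base64-decoded `caBundle` and the PEM serving certificate by some other means.

```go
package main

import (
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
)

// verifyAgainstBundle (hypothetical helper) reports whether the PEM-encoded
// serving certificate chains to one of the CAs in caBundlePEM and is valid
// as a server certificate for dnsName.
func verifyAgainstBundle(caBundlePEM, certPEM []byte, dnsName string) error {
	roots := x509.NewCertPool()
	if !roots.AppendCertsFromPEM(caBundlePEM) {
		return errors.New("no usable CA certificate found in bundle")
	}

	block, _ := pem.Decode(certPEM)
	if block == nil || block.Type != "CERTIFICATE" {
		return errors.New("serving certificate is not a PEM CERTIFICATE block")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return fmt.Errorf("parse serving certificate: %w", err)
	}

	// Verify also checks the validity period and that the certificate
	// covers dnsName, so an expired or misnamed certificate fails here too.
	_, err = cert.Verify(x509.VerifyOptions{Roots: roots, DNSName: dnsName})
	return err
}

func main() {
	// Placeholder inputs: replace the nil slices with the decoded caBundle
	// and the serving certificate PEM. The DNS name is taken from the
	// webhook URL in the log output.
	err := verifyAgainstBundle(nil, nil, "captain-webhook.captain-system.svc")
	fmt.Println("verification result:", err)
}
```

If the check fails before a restart and passes afterwards, that would suggest the `caBundle` and the serving certificate drift apart over time, though this thread does not confirm the cause.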

hangyan commented 4 years ago

I will look into it. I have tested this before but failed to reproduce the error. Captain-generated certs are approved by the Kubernetes cluster using the CSR resource.

hangyan commented 4 years ago

You can remove all the captain webhooks for now; this does not affect the core function of captain, and it should avoid any use of the cert.