emissary-ingress / emissary

open source Kubernetes-native API gateway for microservices built on the Envoy Proxy
https://www.getambassador.io
Apache License 2.0

apiext: CA certificate expiration #4442

Open marianafranco opened 2 years ago

marianafranco commented 2 years ago

Describe the bug

Looking at the apiext CA code, I noticed that the CA certificate is configured to expire in one year: https://github.com/emissary-ingress/emissary/blob/462c43b3a2e1dda57f0f190733bd0a1e39becbd2/cmd/apiext/ca.go#L33

The code never tries to recreate it once the certificate has expired. Will this cause problems when updating emissary's resources after this one-year period?
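To make the concern concrete, here is a minimal Go sketch. It is not the actual `cmd/apiext/ca.go` code: it builds a throwaway self-signed CA with a 365-day validity and checks whether that CA counts as expired shortly before and shortly after the one-year mark. The helper names (`newSelfSignedCA`, `expiredAfterDays`) and the `example-webhook-ca` common name are made up for illustration.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newSelfSignedCA creates an in-memory self-signed CA certificate that is
// valid for validDays days starting at notBefore. (Hypothetical helper.)
func newSelfSignedCA(notBefore time.Time, validDays int) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "example-webhook-ca"},
		NotBefore:             notBefore,
		NotAfter:              notBefore.AddDate(0, 0, validDays),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	// Template and parent are the same, so the certificate signs itself.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// expiredAfterDays reports whether a CA issued now with a lifetime of
// validDays would be past its NotAfter time elapsedDays after issuance.
func expiredAfterDays(validDays, elapsedDays int) (bool, error) {
	issued := time.Now()
	cert, err := newSelfSignedCA(issued, validDays)
	if err != nil {
		return false, err
	}
	return issued.AddDate(0, 0, elapsedDays).After(cert.NotAfter), nil
}

func main() {
	for _, days := range []int{364, 366} {
		expired, err := expiredAfterDays(365, days)
		if err != nil {
			panic(err)
		}
		fmt.Printf("day %d: expired=%v\n", days, expired)
	}
}
```

If nothing regenerates the CA, any TLS client that validates against it starts rejecting the connection once the `NotAfter` time has passed.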

To Reproduce: N/A

Expected behavior: The CA certificate is automatically rotated before expiration.


cindymullins-dw commented 1 year ago

@marianafranco, thanks for raising this issue. That's possible, I assume. Storage is in v2, so apiext still has to function properly to support resources such as Listeners that are set to v3alpha1.

ajiteb commented 1 year ago

Is there any solution for this?

cindymullins-dw commented 1 year ago

@ajiteb, @marianafranco, users should proactively renew their certificate as soon as practical. This will create a new certificate with a one year expiration. We will issue a software patch to address this issue well before the one year expiration. Note that certificate renewal will not cause any downtime.

To renew your certificate:

  1. Run the following command to delete the secret:
     kubectl delete --all secrets --namespace=emissary-system
  2. Restart apiext by running the following command:
     kubectl rollout restart deploy/emissary-apiext -n emissary-system

We are adding this to our docs shortly.

marianafranco commented 1 year ago

Thanks @cindymullins-dw for the update. We have more than 100 clusters, so we want to avoid having to manually delete secrets on each of them. Do you have an estimate for when this will be fixed in software?

pie-r commented 1 year ago

@cindymullins-dw do you have any estimate on that? I strongly advise against recommending a command that deletes all the secrets in the emissary-system namespace. Instead, share something safer, like: kubectl delete secret emissary-ingress-webhook-ca -n emissary-system

marianafranco commented 1 year ago

@cindymullins-dw Any update on the software patch?

isaac88 commented 1 year ago

Hello @cindymullins-dw, I hit this issue today, and it caused downtime on the emissary-ingress pods due to: "https://emissary-apiext.kube-system.svc:443/webhooks/crd-convert?timeout=30s": x509: certificate has expired or is not yet valid: current time

Is there a way to install emissary-ingress without the emissary-apiext dependency?

Thanks.

cindymullins-dw commented 1 year ago

Hi @isaac88, there isn't a way to do that currently, because apiext translates the API resources and stores them as v2, since Kubernetes allows only one storage version.

iomarmochtar commented 1 year ago

Our emissary also hit this issue. These are our proposed solutions:

  1. Create our own self-signed certificate with a much longer lifetime (>1y) to replace the built-in contents (emissary-ingress-webhook-ca).
  2. Make the expiry time (counted from its creation) adjustable.

For the moment we might go with solution number 1, but in the long run number 2 seems promising. wdyt @cindymullins-dw ?
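As a rough illustration of proposal 2, the following Go sketch reads a CA lifetime from configuration and falls back to one year when nothing is set. It is only a sketch under stated assumptions: `APIEXT_CA_VALIDITY` is a hypothetical variable name, not an existing emissary setting, and `parseValidity` is an invented helper.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// defaultValidity mirrors the one-year lifetime discussed in this issue.
const defaultValidity = 365 * 24 * time.Hour

// parseValidity turns a Go duration string such as "17520h" into a CA
// lifetime. An empty string falls back to the one-year default.
func parseValidity(raw string) (time.Duration, error) {
	if raw == "" {
		return defaultValidity, nil
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid CA validity %q: %w", raw, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("CA validity must be positive, got %s", d)
	}
	return d, nil
}

func main() {
	// APIEXT_CA_VALIDITY is a hypothetical name used only for this example.
	validity, err := parseValidity(os.Getenv("APIEXT_CA_VALIDITY"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("CA NotAfter would be:", time.Now().Add(validity).Format(time.RFC3339))
}
```

Note that `time.ParseDuration` has no day or year unit, so a two-year lifetime would be written as `17520h` in this sketch.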

isaac88 commented 1 year ago

Thanks for the clarification @cindymullins-dw

prudhvireddy123 commented 1 year ago
E0811 06:18:57.158619       1 reflector.go:138] pkg/kates/client.go:452: Failed to watch *unstructured.Unstructured: failed to list *unstructured.Unstructured: conversion webhook for getambassador.io/v2, Kind=Host failed: Post "https://emissary-apiext.emissary-system.svc:443/webhooks/crd-convert?timeout=30s": x509: certificate has expired or is not yet valid: current time 2023-08-11T06:18:57Z is after 2023-07-29T16:13:54Z

We observed the same issue today. Is the patch ready?
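The two timestamps at the end of that log line say how long the certificate had already been expired when the webhook call failed. A small Go sketch of the arithmetic, where `expiredFor` is a helper invented for this example:

```go
package main

import (
	"fmt"
	"time"
)

// expiredFor returns how long a certificate has been expired, given its
// NotAfter time and the current time as RFC 3339 strings.
func expiredFor(notAfter, now string) (time.Duration, error) {
	end, err := time.Parse(time.RFC3339, notAfter)
	if err != nil {
		return 0, err
	}
	cur, err := time.Parse(time.RFC3339, now)
	if err != nil {
		return 0, err
	}
	return cur.Sub(end), nil
}

func main() {
	// Timestamps copied from the x509 error in the log line above.
	d, err := expiredFor("2023-07-29T16:13:54Z", "2023-08-11T06:18:57Z")
	if err != nil {
		panic(err)
	}
	fmt.Printf("expired for %s (about %.1f days)\n", d, d.Hours()/24)
}
```

With these inputs the difference is a little over twelve and a half days, so the certificate expired well before the failure was noticed.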

teejaded commented 11 months ago

I just made a cronjob to rotate the certs for us.

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: emissary-apiext-ca-rotate
  namespace: emissary-system
  labels:
    app.kubernetes.io/instance: emissary-apiext
    app.kubernetes.io/managed-by: kubectl_apply_-f_emissary-apiext.yaml
    app.kubernetes.io/name: emissary-apiext
    app.kubernetes.io/part-of: emissary-apiext
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: emissary-apiext-ca-rotate
  namespace: emissary-system
  labels:
    app.kubernetes.io/instance: emissary-apiext
    app.kubernetes.io/managed-by: kubectl_apply_-f_emissary-apiext.yaml
    app.kubernetes.io/name: emissary-apiext
    app.kubernetes.io/part-of: emissary-apiext
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["emissary-ingress-webhook-ca"]
    verbs: ["delete"]
  # allow kubectl rollout restart deployment
  - apiGroups: ["apps", "extensions"]
    resources: ["deployments"]
    resourceNames: ["emissary-apiext"]
    verbs: ["get", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: emissary-apiext-ca-rotate
  namespace: emissary-system
  labels:
    app.kubernetes.io/instance: emissary-apiext
    app.kubernetes.io/managed-by: kubectl_apply_-f_emissary-apiext.yaml
    app.kubernetes.io/name: emissary-apiext
    app.kubernetes.io/part-of: emissary-apiext
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: emissary-apiext-ca-rotate
subjects:
- kind: ServiceAccount
  namespace: emissary-system
  name: emissary-apiext-ca-rotate
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: emissary-apiext-ca-rotate
  namespace: emissary-system
  labels:
    app.kubernetes.io/instance: emissary-apiext
    app.kubernetes.io/managed-by: kubectl_apply_-f_emissary-apiext.yaml
    app.kubernetes.io/name: emissary-apiext
    app.kubernetes.io/part-of: emissary-apiext
spec:
  schedule: "15 0 1 */3 *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: emissary-apiext-ca-rotate
          containers:
          - name: cert-rotate
            image: alpine:latest
            imagePullPolicy: Always
            command: ["sh", "-c"]
            args:
              - |
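                # Stop at the first failing command so a failed download
                # does not fall through to the kubectl calls.
                set -eu
                # Download the latest stable kubectl (linux/amd64 only).
                stable=$(wget -qO- https://dl.k8s.io/release/stable.txt)
                wget -O /usr/local/bin/kubectl "https://dl.k8s.io/release/$stable/bin/linux/amd64/kubectl"
                chmod +x /usr/local/bin/kubectl

                # Delete the CA secret, then restart apiext so it creates a new one.
                kubectl delete secret -n emissary-system emissary-ingress-webhook-ca
                kubectl rollout restart deploy -n emissary-system emissary-apiext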
          restartPolicy: OnFailure

ppeble commented 11 months ago

Adding on here. We recently ran into this exact issue and I was forced to manually destroy the secret and redeploy. I would like to ➕ the ask of a software patch to address this in apiext itself.

superfrink commented 8 months ago

It looks like the certificate and key are generated automatically when the certificate does not yet exist. How about something like this change to regenerate the certificate automatically when it has expired?

https://github.com/superfrink/emissary/commit/be8a3f7a5ab12632d28e03dc662d40102774bfb0 (patch drafted, not tested)

Also, is there a desire to keep the old key? Generating a new key means we can remove the certExistsAndNeedsRenew variable and if !certExistsAndNeedsRenew condition.
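The idea of regenerating the certificate once it has expired comes down to a time comparison. The sketch below is not the code from the linked commit. It is a minimal Go illustration, and the `renewBefore` safety margin is an added assumption, since the comment above only mentions regenerating after expiry. `needsRenewal` and `demoScenarios` are invented names.

```go
package main

import (
	"fmt"
	"time"
)

// needsRenewal reports whether a certificate that expires at notAfter should
// be regenerated at time now. A positive renewBefore renews ahead of expiry;
// a zero renewBefore renews only once the certificate has expired.
func needsRenewal(notAfter, now time.Time, renewBefore time.Duration) bool {
	return !now.Before(notAfter.Add(-renewBefore))
}

// demoScenarios evaluates three example certificates against a fixed clock.
func demoScenarios() map[string]bool {
	now := time.Date(2023, 8, 11, 6, 18, 57, 0, time.UTC)
	margin := 30 * 24 * time.Hour // assumed 30-day renewal window
	return map[string]bool{
		"expired":       needsRenewal(time.Date(2023, 7, 29, 16, 13, 54, 0, time.UTC), now, margin),
		"expiring-soon": needsRenewal(now.Add(10*24*time.Hour), now, margin),
		"fresh":         needsRenewal(now.Add(300*24*time.Hour), now, margin),
	}
}

func main() {
	results := demoScenarios()
	for _, name := range []string{"expired", "expiring-soon", "fresh"} {
		fmt.Printf("%-14s needsRenewal=%v\n", name, results[name])
	}
}
```

Renewing ahead of expiry rather than exactly at it leaves time for the new CA to be distributed before the old one stops validating.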

Chad-Chata commented 8 months ago

PR https://github.com/emissary-ingress/emissary/pull/5489

LanceEa commented 8 months ago

I have gone ahead and landed a fix for this here: https://github.com/emissary-ingress/emissary/pull/5494, please try it out and let us know if you have any issues.

iomarmochtar commented 2 months ago

> I have gone ahead and landed a fix for this here: #5494, please try it out and let us know if you have any issues.

@LanceEa thank you. Just curious: as of today the latest release is 3.9.1 (released on Nov 20, 2023), which does not ship with your changes. Is there a step-by-step guide on how to try it?