knative-extensions / net-istio

A Knative ingress controller for Istio.

Installing KCert into cluster fails with tls: failed to verify certificate: x509: certificate is not valid for any names, but wanted to match webhook.knative-serving.svc #1363

Open tylerhyang opened 2 months ago

tylerhyang commented 2 months ago

I am attempting to install the knative certificate with the following spec:

apiVersion: networking.internal.knative.dev/v1alpha1
kind: Certificate
metadata:
  name: routing-serving-certs
  namespace: knative-serving
  labels:
    networking.knative.dev/ingress-provider: istio
    knative.dev/install-knative-certificate: "true"
    networking.knative.dev/certificate-type: system-internal
    helm.sh/chart: knative-istio-controller-v1.13.1
    app.kubernetes.io/name: knative-istio-controller
    app.kubernetes.io/instance: knative-istio-controller
    app.kubernetes.io/version: "v1.13.1"
    app.kubernetes.io/managed-by: Helm
  annotations:
    networking.knative.dev/certificate.class: cert-manager.certificate.networking.knative.dev
spec:
  dnsNames:
  - kn-routing
  secretName: routing-serving-certs

but, upon triggering an install, I am seeing:

Internal error occurred: failed calling webhook "webhook.serving.knative.dev": failed to call webhook: Post "https://webhook.knative-serving.svc:443/defaulting?timeout=10s": tls: failed to verify certificate: x509: certificate is not valid for any names, but wanted to match webhook.knative-serving.svc

When I look at the knative webhook logs, I see:

{"severity":"ERROR","timestamp":"2024-09-06T00:19:20.074277744Z","logger":"webhook","caller":"webhook/webhook.go:245","message":"http: TLS handshake error from <POD_IP>: EOF\n","commit":"59626f8","knative.dev/pod":"webhook-57b5b4754f-5mv6f","stacktrace":"knative.dev/pkg/webhook.(*zapWrapper).Write\n\tknative.dev/pkg@v0.0.0-20240116073220-b488e7be5902/webhook/webhook.go:245\nlog.(*Logger).output\n\tlog/log.go:245\nlog.(*Logger).Printf\n\tlog/log.go:268\nnet/http.(*Server).logf\n\tnet/http/server.go:3411\nnet/http.(*conn).serve\n\tnet/http/server.go:1930"}

This leads me to a couple of questions:

1a. What is the purpose of this Certificate? In later releases such as 1.15 it has been removed, but in 1.13 and 1.14 it is still present. Is it safe to remove? I see the comment "# The data is populated when system-internal-tls is enabled." at the bottom of the spec, but system-internal-tls is disabled by default in 1.13.

1b. Are there any components that interact with this Certificate when system-internal-tls is disabled?

skonto commented 3 weeks ago

@ReToCode could you help?

ReToCode commented 3 weeks ago

I don't think these two things are related. The webhook has its own certificate to ensure TLS between Kubernetes and the webhook pod. Something seems to be wrong there (I'm not sure what, from the info given).

Creating the Certificate just triggers a validation call to the webhook, and that call is what fails, so the error is unrelated to the Certificate itself. To answer the questions: yes, it is no longer needed. If you don't use internal-encryption (old) or system-internal-tls, you can safely remove it.
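
For reference, the x509 message quoted in the issue is the wording Go's crypto/x509 package uses when a certificate carries no DNS subject alternative names at all. The sketch below is only an illustration of that behaviour, not a diagnosis of this cluster: it builds two throwaway self-signed certificates (the helper names `selfSigned` and `hostnameError` are hypothetical, and the certificates have nothing to do with the ones Knative generates) and checks each against the webhook's service name.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSigned builds a throwaway self-signed certificate with the given DNS SANs.
// It is a hypothetical helper for this illustration only.
func selfSigned(dnsNames []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{Organization: []string{"example"}},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames, // nil means the certificate has no SANs
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// hostnameError returns the error text from VerifyHostname,
// or an empty string when the certificate is valid for host.
func hostnameError(dnsNames []string, host string) string {
	cert, err := selfSigned(dnsNames)
	if err != nil {
		return err.Error()
	}
	if err := cert.VerifyHostname(host); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	const host = "webhook.knative-serving.svc"
	// Certificate without any SANs: hostname verification fails.
	fmt.Printf("no SANs:      %q\n", hostnameError(nil, host))
	// Certificate whose SAN list contains the service name: verification passes.
	fmt.Printf("matching SAN: %q\n", hostnameError([]string{host}, host))
}
```

The first case reproduces the "certificate is not valid for any names" wording, which suggests that the certificate presented for the webhook endpoint had an empty SAN list. Why that happened in this install cannot be determined from the information in the thread.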