knative / serving

Kubernetes-based, scale-to-zero, request-driven compute
https://knative.dev/docs/serving/
Apache License 2.0

Certificate is valid for net-istio-webhook, ... not webhook.knative-serving.svc #14786

Closed gabbler97 closed 5 months ago

gabbler97 commented 10 months ago

I have installed Knative on AWS EKS v1.26.8.

helm history knative-operator
REVISION        UPDATED                         STATUS          CHART                   APP VERSION     DESCRIPTION
1               Wed Jan 10 14:56:29 2024        deployed        knative-operator-1.12.0                 Install complete

This is my KnativeServing resource:

apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
spec:
  config:
    istio:
      local-gateway.knative-serving.knative-local-gateway: knative-local-gateway.istio-system.svc.cluster.local
  registry:
    override:
      activator: my.artifactory/gcr-remote/knative-releases/knative.dev/serving/cmd/activator
      autoscaler: my.artifactory/gcr-remote/knative-releases/knative.dev/serving/cmd/autoscaler
      controller: my.artifactory/gcr-remote/knative-releases/knative.dev/serving/cmd/controller
      webhook: my.artifactory/gcr-remote/knative-releases/knative.dev/net-istio/cmd/webhook
      autoscaler-hpa: my.artifactory/gcr-remote/knative-releases/knative.dev/serving/cmd/autoscaler-hpa
      net-istio-controller/controller: my.artifactory/gcr-remote/knative-releases/knative.dev/net-istio/cmd/controller
      net-istio-webhook/webhook: my.artifactory/gcr-remote/knative-releases/knative.dev/net-istio/cmd/webhook
      queue-proxy: my.artifactory/gcr-remote/knative-releases/knative.dev/serving/cmd/queue
      migrate: my.artifactory/gcr-remote/knative-releases/knative.dev/pkg/apiextensions/storageversion/cmd/migrate
    imagePullSecrets:
    - name: mypullsecret

This is my KnativeEventing resource:

apiVersion: operator.knative.dev/v1beta1
kind: KnativeEventing
metadata:
  name: knative-eventing
  namespace: knative-eventing
spec:
  registry:
    override:
      eventing-controller/eventing-controller: my.artifactory/gcr-remote/knative-releases/knative.dev/eventing/cmd/controller
      eventing-webhook/eventing-webhook: my.artifactory/gcr-remote/knative-releases/knative.dev/eventing/cmd/webhook
      imc-controller/controller: my.artifactory/gcr-remote/knative-releases/knative.dev/eventing/cmd/in_memory/channel_controller
      imc-dispatcher/dispatcher: my.artifactory/gcr-remote/knative-releases/knative.dev/eventing/cmd/in_memory/channel_dispatcher
      mt-broker-controller/mt-broker-controller: my.artifactory/gcr-remote/knative-releases/knative.dev/eventing/cmd/mtchannel_broker
      mt-broker-filter/filter: my.artifactory/gcr-remote/knative-releases/knative.dev/eventing/cmd/broker/filter
      mt-broker-ingress/ingress: my.artifactory/gcr-remote/knative-releases/knative.dev/eventing/cmd/broker/ingress
      pingsource-mt-adapter/dispatcher: my.artifactory/gcr-remote/knative-releases/knative.dev/eventing/cmd/mtping
    imagePullSecrets:
    - name: mypullsecret

Everything looks good

kubectl get knativeserving -n knative-serving && kubectl get knativeeventing -n knative-eventing
NAME              VERSION   READY   REASON
knative-serving   1.12.0    True
NAME               VERSION   READY   REASON
knative-eventing   1.12.0    True

kubectl get pod -n knative-serving && kubectl get pod -n knative-eventing
NAME                                    READY   STATUS    RESTARTS   AGE
activator-f65f4c95d-z7gmq               1/1     Running   0          114m
autoscaler-65665d448-x5lps              1/1     Running   0          114m
autoscaler-hpa-677fb74dbf-xbxmc         1/1     Running   0          113m
controller-75dcb55dcd-s9jp7             1/1     Running   0          109m
net-istio-controller-6659d45cb6-q2jmj   1/1     Running   0          113m
net-istio-webhook-8df4594bf-vdnfl       1/1     Running   0          113m
webhook-764984f6b9-pqdng                1/1     Running   0          114m
NAME                                    READY   STATUS    RESTARTS   AGE
eventing-controller-748b9bff5-7v96w     1/1     Running   0          29m
eventing-webhook-648c8db866-fshrm       1/1     Running   0          29m
imc-controller-64447476bb-x7t8d         1/1     Running   0          29m
imc-dispatcher-77d9f969bf-mh5zp         1/1     Running   0          29m
mt-broker-controller-7f5585658b-jz65z   1/1     Running   0          28m
mt-broker-filter-58447d6f9b-pmlpc       1/1     Running   0          29m
mt-broker-ingress-78bb975fc-hhfhz       1/1     Running   0          28m

Istio installed and running

helm list -n istio-system && kubectl get pod -n istio-system
NAME                    NAMESPACE       REVISION        UPDATED                                 STATUS          CHART           APP VERSION
istio-base              istio-system    1               2024-01-10 14:56:32.960778122 +0000 UTC deployed        base-1.20.0     1.20.0
istio-ingressgateway    istio-system    1               2024-01-10 16:11:35.782924971 +0000 UTC deployed        gateway-1.20.0  1.20.0
istiod                  istio-system    1               2024-01-10 14:56:29.815459709 +0000 UTC deployed        istiod-1.20.0   1.20.0
NAME                                    READY   STATUS    RESTARTS   AGE
istio-ingressgateway-5c89f67dbc-6dmrw   1/1     Running   0          43h
istiod-867848875c-76bjp                 1/1     Running   0          44h

When I try to deploy a function, I get this:

func --namespace my-ns deploy --registry my.artifactory/my-docker-dev-local/knative-test-go-test
Warning: function has current image 'my.artifactory/my-docker-dev-local/knative-test-go' which has a different registry than the currently configured registry 'my.artifactory/my-docker-dev-local/knative-test-go-test'. The new image tag will be 'my.artifactory/my-docker-dev-local/knative-test-go-test/knative-test-go'.  To use an explicit image, use --image.
Warning: namespace chosen is 'my-ns', but currently active namespace is 'default'. Continuing with deployment to 'my-ns'.
function up-to-date. Force rebuild with --build
Pushing function image to the registry "my.artifactory" using the "my-ci-user" user credentials
⬆️  Deploying function to the cluster
deploy error: knative deployer failed to deploy the Knative Service: Internal error occurred: failed calling webhook "webhook.serving.knative.dev": failed to call webhook: Post "https://webhook.knative-serving.svc:443/?timeout=10s": tls: failed to verify certificate: x509: certificate is valid for net-istio-webhook, net-istio-webhook.knative-serving, net-istio-webhook.knative-serving.svc, net-istio-webhook.knative-serving.svc.cluster.local, not webhook.knative-serving.svc
Error: knative deployer failed to deploy the Knative Service: Internal error occurred: failed calling webhook "webhook.serving.knative.dev": failed to call webhook: Post "https://webhook.knative-serving.svc:443/?timeout=10s": tls: failed to verify certificate: x509: certificate is valid for net-istio-webhook, net-istio-webhook.knative-serving, net-istio-webhook.knative-serving.svc, net-istio-webhook.knative-serving.svc.cluster.local, not webhook.knative-serving.svc

When I switch to the public images:

kubectl get pod -n knative-eventing -o yaml | grep image: && kubectl get pod -n knative-serving -o yaml | grep image:
image: gcr.io/knative-releases/knative.dev/eventing/cmd/controller@sha256:bda5dd2e4a3b67cbbee4170919a6e34d92dd35d5bc232b907f88069c845db74d
image: sha256:36d85806f565c86fe2e8db989cba2555a0047e038e346e3faeb16830c93e1204
image: gcr.io/knative-releases/knative.dev/eventing/cmd/webhook@sha256:f8269a953534c4549b74305e53a1d5b02aa14fb05754d65b051f79426439cb60
image: sha256:6720f10533ab150a5452b749e7d2b149a21e8422f999d54d5a247dcdd14323ce
image: gcr.io/knative-releases/knative.dev/eventing/cmd/in_memory/channel_controller@sha256:22b94a6356b270bf43e79816c06a82b209f44f67ec7e6d7e05d5ba4e20a9c134
image: sha256:33e37b7ce54d951728328ae746c5419ef7abb232c6566ea091835e18bb7f5d35
image: gcr.io/knative-releases/knative.dev/eventing/cmd/in_memory/channel_dispatcher@sha256:7f3e9b74fed3ec1d8d5ed5e8dee852873eb19684bab9121a13317834ab5646da
image: sha256:4a450446462ae2b57e1458ec4f7337a6f2e1749cc9aad307439ca8fbb7486b9b
image: gcr.io/knative-releases/knative.dev/eventing/cmd/mtchannel_broker@sha256:c0a6c410b37b9551b36be700c6a85d2419fb10341a6b1d0e78ece75aca5313d0
image: sha256:7df7161a47a8b6309b7a66abee64ae9c82495c5ce8b64b8a51c6d7ee8694fe19
image: gcr.io/knative-releases/knative.dev/eventing/cmd/broker/filter@sha256:27a6bfe21fe0b304fd56ceb2e6f9d0ac506a98afe587bfdd1a2993b8d0461132
image: sha256:dcbaf8c102e807267b25199e47b6f425a76f53f9ecac77beaa815386c7ad68af
image: gcr.io/knative-releases/knative.dev/eventing/cmd/broker/ingress@sha256:10ef656d45c59304189294a0f60c8d0648b5722ca2c503fa03b1499e0a727758
image: sha256:61ff279e7b247237a917a5ebf963a0bfc063911d21579aadd6107c1022051ec6
image: gcr.io/knative-releases/knative.dev/pkg/apiextensions/storageversion/cmd/migrate@sha256:4728878ecefec5b7f1e7745c36b3e092923a896f743365ab8a77d9790d108ffa
image: sha256:5787002e29cbe7676975d474a3923084ccbc4862881f9831a56e7aede5799a98
image: gcr.io/knative-releases/knative.dev/serving/cmd/activator@sha256:d54c43aeac6a22464bc9e3e9e131397daace467278965e94e3651b56c5fab71b
image: sha256:acf194b26b8f00404d9a64684e10b10a85f196cb5d6441d71bdf8e51f232d138
image: gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler@sha256:348d8d1bf9684150815265ac335d5ae137f2de0b3e1a5313ea08f80c4a6841f4
image: sha256:41eb93c41861b3ac4978dc3632fda0cc89a6c5cefd10d443b985b7cbdeac4fd8
image: gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler-hpa@sha256:03466e3b78f2ae45ed905b7eaaac73c6365967462d27d0098a3b12b18ab11bd9
image: sha256:024e6082396b6efc04efe7bc819ac0102dfac545ba974da69116974e337203a2
image: gcr.io/knative-releases/knative.dev/serving/cmd/controller@sha256:858415fc0b769fc450407a806036051cbb0f3dcbf19062572bcc0c5a454d0de4
image: sha256:6f1bcb562fc24e40d491dfe4d64ee2bcc390a2efd5e85e529dbfa607a681a864
image: gcr.io/knative-releases/knative.dev/net-istio/cmd/controller@sha256:db922edc67f4c9ff7f821354ad2f39968733c00c3d2ef9b2b8289630c9508fb9
image: sha256:dab6009ec33aa8375abf6dd909ecf539ffe5d6e419a18b6c09d0b05785543b64
image: gcr.io/knative-releases/knative.dev/net-istio/cmd/webhook@sha256:d1c7111853283218afb0b4fd14ad69087f609a956b431387b48da4c7c0430711
image: sha256:55c251e7fef0a7317bc4a0a61b0ec22f0ac827f05b3c4817efd8fbd7c74e8270
image: gcr.io/knative-releases/knative.dev/pkg/apiextensions/storageversion/cmd/migrate@sha256:5b8e6519a1a7f14555bbdf33f7621883624bf9aaad940c88319258a949274d18
image: sha256:8e0f84c5b80b4fe8b7d56931e1dcc7c71bc053b7d2a1f5d55dbe82a3f4be6b6d
image: gcr.io/knative-releases/knative.dev/serving/cmd/webhook@sha256:1a87934978e46ac11508d561458d25bec82a80b27b87e7114922defb189b5588
image: sha256:3a558b7418125c97fc4babd4542689d01d38c162ad6934cee2331052d6fbd290

The function can then be deployed without an issue:

func --namespace my-ns deploy --registry my.artifactory/my-docker-dev-local/knative-test-go-test
Warning: function has current image 'my.artifactory/my-docker-dev-local/knative-test-go' which has a different registry than the currently configured registry 'my.artifactory/my-docker-dev-local/knative-test-go-test'. The new image tag will be 'my.artifactory/my-docker-dev-local/knative-test-go-test/knative-test-go'. To use an explicit image, use --image.
Warning: namespace chosen is 'my-ns', but currently active namespace is 'default'. Continuing with deployment to 'my-ns'.
Building function image
Still building
Still building
Yes, still building
🙌 Function built: my.artifactory/my-docker-dev-local/knative-test-go-test/knative-test-go
Pushing function image to the registry "my.artifactory" using the "my-ci-user" user credentials
⬆️ Deploying function to the cluster
✅ Function deployed in namespace "my-ns" and exposed at URL: http://hello.my-ns.svc.cluster.local


What is missing here?
Thank you very much in advance!
ReToCode commented 10 months ago

Hm, something is definitely off with your setup:

webhook.serving.knative.dev should point to the webhook pod in the knative-serving namespace. The response looks like it comes from net-istio-webhook, which is the wrong pod.

Can you check your services in knative-serving and make sure you have the ones from the 1.12 install manifests?
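
The mismatch can be read straight from the error text in the report. The following is a minimal sketch, not part of Knative or func: the helper names `parse_x509_mismatch` and `first_label` are hypothetical, and the only input is the x509 message copied from the failing deploy. It splits the message into the names on the served certificate and the host the API server actually called.

```python
import re

# x509 error text copied from the failing `func deploy` output in this issue.
ERROR = (
    "tls: failed to verify certificate: x509: certificate is valid for "
    "net-istio-webhook, net-istio-webhook.knative-serving, "
    "net-istio-webhook.knative-serving.svc, "
    "net-istio-webhook.knative-serving.svc.cluster.local, "
    "not webhook.knative-serving.svc"
)


def parse_x509_mismatch(message):
    """Return (names_on_certificate, requested_host), or None if no match.

    Hypothetical helper: it only understands the
    "certificate is valid for A, B, not C" wording seen in this issue.
    """
    match = re.search(r"certificate is valid for (.+), not (\S+)", message)
    if match is None:
        return None
    names = [name.strip() for name in match.group(1).split(",")]
    return names, match.group(2)


def first_label(hostname):
    """Return the first DNS label, which is the Service name here."""
    return hostname.split(".", 1)[0]


names, requested = parse_x509_mismatch(ERROR)
served_for = sorted({first_label(name) for name in names})

print("API server called Service:", first_label(requested))
print("certificate was issued for:", served_for)
```

Every name on the certificate starts with net-istio-webhook, while the call went to the webhook Service, so whatever answers on the webhook Service is presenting the net-istio-webhook certificate.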

dprotaso commented 10 months ago

> The response looks like you get it from net-istio-webhook which is the wrong pod.

I think the net-istio webhook doesn't add any filtering when the webhook rules are constructed, so any ConfigMap change in knative-serving will invoke that webhook. The same applies in reverse: the serving webhook will try to validate config-istio. Both should be no-ops, though, since the names are not recognized.

rules:
  - apiGroups:
    - ""
    apiVersions:
    - v1
    operations:
    - CREATE
    - UPDATE
    resources:
    - configmaps/*
    scope: Namespaced

I'm unsure whether that's a factor here. I'm assuming not, because this works fine upstream.

nguyenthai0107 commented 9 months ago

Hello @gabbler97, where did you get the Helm chart for the Knative operator? Did it come from knative-operator? Thanks.

nguyenthai0107 commented 9 months ago

Hello @gabbler97, can you tell me exactly how you installed knative-operator with Helm? I mean, is there any way to install it from a public Helm repository such as Artifact Hub (using helm pull, helm install, ...), or does it have to be installed through OLM? Thank you.

gabbler97 commented 9 months ago

Dear @nguyenthai0107! Sorry for not responding for a while. I installed this version: https://artifacthub.io/packages/olm/community-operators/knative-operator/1.12.0 Thank you for your response!

nguyenthai0107 commented 9 months ago

Hello @gabbler97, thank you for your reply. What I meant was: is there any way to install the Knative operator without OLM? The link you sent requires installing OLM first and then deploying a manifest to install the Knative operator. [image] Regards.

skonto commented 9 months ago

@gabbler97 Hi, any update on this one? Is this still an issue?

gabbler97 commented 9 months ago

Hi @skonto! Yes, it is still an issue.

Hi @nguyenthai0107! We use Terraform and install every operator and feature with Helm charts, so it would be great to keep following this methodology rather than break it. Sorry about my previous answer: I actually installed knative-operator from here: https://github.com/knative/operator/tree/main/config/charts/knative-operator This is the exact version: https://github.com/knative/operator/releases/tag/knative-v1.12.0 Sorry for the confusion, my bad.

github-actions[bot] commented 6 months ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

dprotaso commented 6 months ago

This was reported again today, also by someone using the operator. I'm a bit stumped here.

https://cloud-native.slack.com/archives/C04LGHDR9K7/p1716399266262429

dprotaso commented 5 months ago

If you dig into the thread, you'll see that for some reason the Knative Serving webhook is serving the certificate for the net-istio-webhook.

I can't reproduce this, and I also don't see how it would be possible, since each webhook watches a different secret. We also validated that the Knative webhook secret has the correct host names.

If anyone has a reliable way of reproducing this, that would help greatly.

dprotaso commented 5 months ago

OK @gabbler97, after circling back with someone in Slack who had the same issue, we figured it out. You've configured your registry overrides incorrectly:

-webhook: my.artifactory/gcr-remote/knative-releases/knative.dev/net-istio/cmd/webhook
+webhook: my.artifactory/gcr-remote/knative-releases/knative.dev/serving/cmd/webhook

You're using the net-istio-webhook image in place of the serving webhook image, so you'll need to update that entry.

Everything else looks good.
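
A mix-up like this is easy to miss by eye, because the wrong path differs from the right one by a single segment. Below is a minimal sketch of a pre-apply check, assuming the overrides have already been loaded into a plain dict. The function `duplicate_images` is hypothetical and not part of the Knative operator, and the rule it applies (no image should appear under two override keys) is only a heuristic that happens to catch this case.

```python
from collections import defaultdict

# Common prefix of the mirrored images used in this issue.
PREFIX = "my.artifactory/gcr-remote/knative-releases/knative.dev/"

# Subset of the registry overrides from the KnativeServing spec in the report.
OVERRIDES = {
    "controller": PREFIX + "serving/cmd/controller",
    "webhook": PREFIX + "net-istio/cmd/webhook",  # the misconfigured entry
    "net-istio-controller/controller": PREFIX + "net-istio/cmd/controller",
    "net-istio-webhook/webhook": PREFIX + "net-istio/cmd/webhook",
}


def duplicate_images(overrides):
    """Map each image used by more than one override key to those keys.

    Heuristic only (an assumption of this sketch): each component is
    expected to have its own image, so a shared image is suspicious.
    """
    keys_by_image = defaultdict(list)
    for key, image in overrides.items():
        keys_by_image[image].append(key)
    return {
        image: keys for image, keys in keys_by_image.items() if len(keys) > 1
    }


for image, keys in duplicate_images(OVERRIDES).items():
    print("image", image, "is used by:", ", ".join(keys))

# With the corrected entry from the diff above, nothing is reported.
FIXED = dict(OVERRIDES, webhook=PREFIX + "serving/cmd/webhook")
print("duplicates after the fix:", duplicate_images(FIXED))
```

Run against the reported spec, this flags that `webhook` and `net-istio-webhook/webhook` point at the same image, which is the condition that made the serving webhook present the net-istio-webhook certificate.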