kedacore / charts

Helm charts for KEDA
Apache License 2.0
Handle custom cluster-domain values doesn't work without certmanager #627

Open marandalucas opened 5 months ago

marandalucas commented 5 months ago

@lucchmielowski Hi! Thank you so much for this fix. https://github.com/kedacore/charts/pull/399

Unfortunately, it doesn't work for us.

HELM CONFIG: clusterDomain: gcp-prod-pv-na1-a.company.cluster.local

ERROR: W0314 15:03:14.706154 1 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local:9666", ServerName: "keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local:9666", }. Err: connection error: desc = "transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate is valid for keda-operator, keda-operator, keda-operator.keda, keda-operator.keda.svc, keda-operator.keda.svc.cluster.local, keda-admission-webhooks, keda-admission-webhooks.keda, keda-admission-webhooks.keda.svc, keda-admission-webhooks.keda.svc.cluster.local, keda-operator-metrics-apiserver, keda-operator-metrics-apiserver.keda, keda-operator-metrics-apiserver.keda.svc, keda-operator-metrics-apiserver.keda.svc.cluster.local, not keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local"

We wonder if you could fix it. We don't need cert-manager in our clusters.

Thanks in advance

lucchmielowski commented 5 months ago

Hello @marandalucas 👋

I just had a look and tried to reproduce the issue, and it seems to be working fine on my side.

Just so I understand: what version of the chart are you using, and are you using the certificates created by the chart? (certificates.certManager.enabled: true)

Also, could you share the content of your keda-operator-tls-certificates certificate?

marandalucas commented 5 months ago

Hello @lucchmielowski 👍

If you want to recreate the issue you have to:

  1. Create a GKE cluster without the cert-manager tool.
  2. Install KEDA (2.13.0) with a custom clusterDomain.
  3. Check the metric-apiserver pod.
ERROR:
W0314 15:03:14.706154       1 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local:9666", ServerName: "keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local:9666", }. 

We'd like to avoid the cert-manager tool installation because of the following reasons:

Is there another way to fix this through parametrizing metrics-service-address or something like that?

Thank you so much for this project

lucchmielowski commented 5 months ago

Hi @marandalucas, sorry, but I won't really have time to test on GKE in the next few days. Both issues you shared look to be linked to a mismatch between the cluster domain of your cluster and your configuration, not to an issue with the chart itself (I might have misunderstood something, though).

What makes me think that is this part of the log you shared earlier:

certificate is valid for keda-operator, keda-operator, keda-operator.keda, keda-operator.keda.svc, keda-operator.keda.svc.cluster.local ... not keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local

as well as this part:

addrConn.createTransport failed to connect to {Addr: "keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local:9666...

That does not seem related to a cert issue, but rather to an addressing issue.

Could it be that your GKE cluster is using the default svc.cluster.local FQDN? (In that case you wouldn't need to set a clusterDomain.) One way to check the correct value to use is to run the following command, which creates a pod and does an nslookup:

kubectl run -it --image=ubuntu --restart=Never shell -- \
sh -c 'apt-get update > /dev/null && apt-get install -y dnsutils > /dev/null && \
nslookup kubernetes.default | grep Name | sed "s/Name:\skubernetes.default//"'
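The lookup above prints the suffix together with the leading .svc part, while the chart's clusterDomain value takes only what comes after it. A minimal sketch of that last step, assuming you have the full name of the kubernetes.default Service at hand (cluster_domain_from_fqdn is a hypothetical helper, not part of KEDA or kubectl):

```shell
# Hypothetical helper: derive the clusterDomain value from the fully
# qualified name of the kubernetes.default Service.
cluster_domain_from_fqdn() {
  fqdn=${1%.}                                       # drop a trailing root dot, if any
  printf '%s\n' "${fqdn#kubernetes.default.svc.}"   # strip the Service prefix
}

cluster_domain_from_fqdn kubernetes.default.svc.cluster.local
# prints "cluster.local"
```

If the input does not start with kubernetes.default.svc., the helper returns it unchanged, so an unexpected name is easy to spot.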

Also, I understand that you don't want to set up cert-manager. By default, the chart lets the operator create a kedaorg-certs secret, which is used for TLS communication between KEDA's components.

lucchmielowski commented 5 months ago

Also, feel free to message me on the Kubernetes slack directly if you find it easier to have a "live" discussion about the issue.

JorTurFer commented 5 months ago

Hello @marandalucas, you don't need cert-manager, but you do need to update the internal certificate system as well (you can use either cert-manager or the self-generated certs). Add an extra argument to the operator, k8s-cluster-domain: your-domain, so that your domain is taken into account during certificate generation.

extraArgs:
  # -- Additional KEDA Operator container arguments
  keda:
    k8s-cluster-domain: your-domain
clusterDomain: your-domain
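The same two values can presumably be passed on the command line instead of in a values file. A sketch, assuming a release named keda in the keda namespace and the chart repository added under the alias kedacore (all three names are assumptions, as is the install_keda_with_domain wrapper):

```shell
# Sketch: install or upgrade KEDA with a custom cluster domain.
# Assumptions: release "keda", namespace "keda", repo alias "kedacore".
install_keda_with_domain() {
  domain=$1
  helm upgrade --install keda kedacore/keda \
    --namespace keda --create-namespace \
    --set clusterDomain="$domain" \
    --set extraArgs.keda.k8s-cluster-domain="$domain"
}

# Example (requires helm and access to the cluster):
#   helm repo add kedacore https://kedacore.github.io/charts
#   install_keda_with_domain your-domain
```

Passing one domain argument to both settings keeps them from drifting apart, which is the failure mode described in this issue.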

I guess we could set the arg automatically from the clusterDomain value? 🤔 @lucchmielowski WDYT?

In any case, with both set you will be able to use KEDA without cert-manager.
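To see why both settings matter, here is a rough sketch of the name pattern visible in the x509 error at the top of this thread: each service appears as a short name, with the namespace, with .svc, and with the cluster domain appended. This is not KEDA's certificate code; expected_sans and covers are hypothetical helpers that only reproduce that pattern.

```shell
# Sketch: list the DNS names a certificate would cover for a given
# namespace and cluster domain, following the pattern in the error log.
expected_sans() {
  ns=$1
  domain=$2
  for svc in keda-operator keda-admission-webhooks keda-operator-metrics-apiserver; do
    printf '%s\n' "$svc" "$svc.$ns" "$svc.$ns.svc" "$svc.$ns.svc.$domain"
  done
}

# covers <name> <namespace> <domain>: succeeds if <name> is in the list.
covers() {
  expected_sans "$2" "$3" | grep -Fx "$1" >/dev/null
}

wanted=keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local

# Certificate generated for the default domain: the dialed name is missing.
covers "$wanted" keda cluster.local || echo "default domain: not covered"

# Certificate generated for the custom domain: the dialed name is present.
covers "$wanted" keda gcp-prod-pv-na1-a.company.cluster.local && echo "custom domain: covered"
```

The first check corresponds to the reported setup, where clusterDomain changed the address being dialed but the certificate was still generated for cluster.local.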