kedacore / keda

KEDA is a Kubernetes-based Event Driven Autoscaling component. It provides event driven scale for any container running in Kubernetes
https://keda.sh
Apache License 2.0

Keda installation in another namespace with helm chart version >2.12 gives an error #5943

Open rupertgti opened 3 weeks ago

rupertgti commented 3 weeks ago

Report

We install the KEDA Helm chart in a separate namespace (called keda), and with recent chart versions we get these errors in the keda-operator-metrics-apiserver pod:

1 requestheader_controller.go:193] Unable to get configmap/extension-apiserver-authentication in kube-system. Usually fixed by 'kubectl create rolebinding -n kube-system ROLEBINDING_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'

1 main.go:254] "msg"="unable to run external metrics adapter" "error"="unable to load configmap based request-header-client-ca-file: configmaps \"extension-apiserver-authentication\" is forbidden: User \"system:serviceaccount:keda:keda-metrics-server\" cannot get resource \"configmaps\" in API group \"\" in the namespace \"kube-system\"" "logger"="keda_metrics_adapter"

Steps to Reproduce the Problem:

Use helm chart version of keda >2.12 in another namespace

KEDA Version 2.13.0 or 2.14.0

Kubernetes Version 1.30

Platform Amazon Web Services

Expected Behavior

Pods up ;)

Actual Behavior

The keda-operator-metrics-apiserver pod goes into back-off and shows these logs:

1 requestheader_controller.go:193] Unable to get configmap/extension-apiserver-authentication in kube-system. Usually fixed by 'kubectl create rolebinding -n kube-system ROLEBINDING_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'

1 main.go:254] "msg"="unable to run external metrics adapter" "error"="unable to load configmap based request-header-client-ca-file: configmaps \"extension-apiserver-authentication\" is forbidden: User \"system:serviceaccount:keda:keda-metrics-server\" cannot get resource \"configmaps\" in API group \"\" in the namespace \"kube-system\"" "logger"="keda_metrics_adapter"
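
The hint in the first log line can be filled in with the names reported in the second one. A minimal sketch, assuming the service account really is keda-metrics-server in the keda namespace, as the error says; the binding name keda-metrics-auth-reader is made up for illustration. The script only prints the command so it can be reviewed before it is run against a cluster:

```shell
#!/usr/bin/env bash
# Namespace and service account taken from the "forbidden" error above.
NS="keda"
SA="keda-metrics-server"
# Hypothetical binding name; any unused name should work.
BINDING="keda-metrics-auth-reader"

# Print the kubectl command from the log hint with the placeholders filled in.
rolebinding_cmd() {
  echo "kubectl create rolebinding -n kube-system ${BINDING}" \
       "--role=extension-apiserver-authentication-reader" \
       "--serviceaccount=${NS}:${SA}"
}

rolebinding_cmd
```

This would only work around the symptom for the metrics server; it does not explain why the RBAC installed by the chart is not effective, and it does not touch the keda-operator errors.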

Steps to Reproduce the Problem

1. Use a KEDA Helm chart version >2.12 in another namespace in a Kubernetes cluster

Logs from KEDA operator

2024-07-03T12:30:12Z    ERROR   cert-rotation   Error updating webhook with certificate {"name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration", "error": "validatingwebhookconfigurations.admissionregistration.k8s.io \"keda-admission\" is forbidden: User \"system:serviceaccount:keda:keda-operator\" cannot update resource \"validatingwebhookconfigurations\" in API group \"admissionregistration.k8s.io\" at the cluster scope"}
github.com/open-policy-agent/cert-controller/pkg/rotator.(*ReconcileWH).ensureCerts
    /workspace/vendor/github.com/open-policy-agent/cert-controller/pkg/rotator/rotator.go:839
github.com/open-policy-agent/cert-controller/pkg/rotator.(*ReconcileWH).Reconcile
    /workspace/vendor/github.com/open-policy-agent/cert-controller/pkg/rotator/rotator.go:785
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
2024-07-03T12:30:12Z    INFO    cert-rotation   Ensuring CA cert    {"name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService", "name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService"}
2024-07-03T12:30:12Z    ERROR   cert-rotation   Error updating webhook with certificate {"name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService", "error": "apiservices.apiregistration.k8s.io \"v1beta1.external.metrics.k8s.io\" is forbidden: User \"system:serviceaccount:keda:keda-operator\" cannot update resource \"apiservices\" in API group \"apiregistration.k8s.io\" at the cluster scope"}
github.com/open-policy-agent/cert-controller/pkg/rotator.(*ReconcileWH).ensureCerts
    /workspace/vendor/github.com/open-policy-agent/cert-controller/pkg/rotator/rotator.go:839
github.com/open-policy-agent/cert-controller/pkg/rotator.(*ReconcileWH).Reconcile
    /workspace/vendor/github.com/open-policy-agent/cert-controller/pkg/rotator/rotator.go:785
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
2024-07-03T12:30:12Z    ERROR   Reconciler error    {"controller": "cert-rotator", "object": {"name":"kedaorg-certs","namespace":"keda"}, "namespace": "keda", "name": "kedaorg-certs", "reconcileID": "d6a936ac-8633-4a49-b43d-1dc00a2341e1", "error": "apiservices.apiregistration.k8s.io \"v1beta1.external.metrics.k8s.io\" is forbidden: User \"system:serviceaccount:keda:keda-operator\" cannot update resource \"apiservices\" in API group \"apiregistration.k8s.io\" at the cluster scope"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
2024-07-03T12:30:12Z    INFO    cert-rotation   no cert refresh needed
2024-07-03T12:30:12Z    INFO    cert-rotation   Ensuring CA cert    {"name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration", "name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration"}
2024-07-03T12:30:12Z    ERROR   cert-rotation   Error updating webhook with certificate {"name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration", "error": "validatingwebhookconfigurations.admissionregistration.k8s.io \"keda-admission\" is forbidden: User \"system:serviceaccount:keda:keda-operator\" cannot update resource \"validatingwebhookconfigurations\" in API group \"admissionregistration.k8s.io\" at the cluster scope"}
github.com/open-policy-agent/cert-controller/pkg/rotator.(*ReconcileWH).ensureCerts
    /workspace/vendor/github.com/open-policy-agent/cert-controller/pkg/rotator/rotator.go:839
github.com/open-policy-agent/cert-controller/pkg/rotator.(*ReconcileWH).Reconcile
    /workspace/vendor/github.com/open-policy-agent/cert-controller/pkg/rotator/rotator.go:785
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
2024-07-03T12:30:12Z    INFO    cert-rotation   Ensuring CA cert    {"name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService", "name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService"}
2024-07-03T12:30:12Z    ERROR   cert-rotation   Error updating webhook with certificate {"name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService", "error": "apiservices.apiregistration.k8s.io \"v1beta1.external.metrics.k8s.io\" is forbidden: User \"system:serviceaccount:keda:keda-operator\" cannot update resource \"apiservices\" in API group \"apiregistration.k8s.io\" at the cluster scope"}
github.com/open-policy-agent/cert-controller/pkg/rotator.(*ReconcileWH).ensureCerts
    /workspace/vendor/github.com/open-policy-agent/cert-controller/pkg/rotator/rotator.go:839
github.com/open-policy-agent/cert-controller/pkg/rotator.(*ReconcileWH).Reconcile
    /workspace/vendor/github.com/open-policy-agent/cert-controller/pkg/rotator/rotator.go:785
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
2024-07-03T12:41:58Z    ERROR   Reconciler error    {"controller": "cert-rotator", "object": {"name":"kedaorg-certs","namespace":"keda"}, "namespace": "keda", "name": "kedaorg-certs", "reconcileID": "be4cccfd-3a39-4e1f-9aa6-1a6eba1afb58", "error": "apiservices.apiregistration.k8s.io \"v1beta1.external.metrics.k8s.io\" is forbidden: User \"system:serviceaccount:keda:keda-operator\" cannot update resource \"apiservices\" in API group \"apiregistration.k8s.io\" at the cluster scope"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
W0703 12:42:02.602748       1 reflector.go:535] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: failed to list *v1alpha1.ClusterTriggerAuthentication: clustertriggerauthentications.keda.sh is forbidden: User "system:serviceaccount:keda:keda-operator" cannot list resource "clustertriggerauthentications" in API group "keda.sh" at the cluster scope
E0703 12:42:02.602830       1 reflector.go:147] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: Failed to watch *v1alpha1.ClusterTriggerAuthentication: failed to list *v1alpha1.ClusterTriggerAuthentication: clustertriggerauthentications.keda.sh is forbidden: User "system:serviceaccount:keda:keda-operator" cannot list resource "clustertriggerauthentications" in API group "keda.sh" at the cluster scope
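
The three permissions the operator is denied in these logs can be checked directly with kubectl auth can-i. A rough sketch, assuming the caller is allowed to impersonate service accounts (needed for --as); the verbs, resources and user name are copied from the errors above. The script prints the checks rather than running them:

```shell
#!/usr/bin/env bash
# User name exactly as it appears in the "forbidden" errors.
SA_USER="system:serviceaccount:keda:keda-operator"

# Verb and resource pairs copied from the operator logs above.
CHECKS=(
  "update validatingwebhookconfigurations.admissionregistration.k8s.io"
  "update apiservices.apiregistration.k8s.io"
  "list clustertriggerauthentications.keda.sh"
)

# Print one "kubectl auth can-i" command per denied permission.
print_checks() {
  local check
  for check in "${CHECKS[@]}"; do
    echo "kubectl auth can-i ${check} --as=${SA_USER}"
  done
}

print_checks
```

If all three answer "no", the operator's cluster-scoped role or its binding is probably missing or bound to a different subject, which would point at the RBAC part of the installation rather than at KEDA itself.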

KEDA Version

2.13.0

Kubernetes Version

1.29

Platform

Amazon Web Services

Scaler Details

No response

Anything else?

No response

rupertgti commented 3 weeks ago

The problem might be fixed if the namespace name were not hardcoded and {{ .Release.Namespace }} were used here instead:

https://github.com/kedacore/charts/blob/v2.14.2/keda/templates/metrics-server/clusterrolebinding.yaml#L34 and https://github.com/kedacore/charts/blob/v2.14.2/keda/templates/metrics-server/clusterrolebinding.yaml#L62
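
One way to test this guess without installing anything would be to render only that template for the keda namespace and read the namespaces in the output. A sketch, assuming the template path inside the chart matches the linked file and that helm can reach the chart repository; the release name keda is arbitrary. The script prints the command instead of running it:

```shell
#!/usr/bin/env bash
# Chart repository URL, version and template path as they appear in this thread.
CHART_REPO="https://kedacore.github.io/charts"
CHART_VERSION="2.14.2"
NAMESPACE="keda"
TEMPLATE="templates/metrics-server/clusterrolebinding.yaml"

# Print a "helm template" command that renders just the binding template.
render_cmd() {
  echo "helm template keda keda --repo ${CHART_REPO}" \
       "--version ${CHART_VERSION} --namespace ${NAMESPACE}" \
       "--show-only ${TEMPLATE}"
}

render_cmd
```

If the rendered subjects already point at the keda namespace, the hardcoded namespace is probably not the cause and the values used for the install would be the next thing to compare.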

aflorvexcel commented 3 weeks ago

Same issue for us. We used a namespace other than kube-system.

imo-ininder commented 3 weeks ago

Same issue here. We used a different namespace, and this error was shown.

tgmatt commented 2 weeks ago

For what it's worth, while we're having a separate problem, probably because we're using EKS 1.30, deploying chart version 2.14.2 in the keda namespace works fine. We used the terraform helm provider's helm_release to deploy it and it gets past this stage for sure. I won't hijack this thread with our issue, just thought I'd help :)

rupertgti commented 2 weeks ago

Hi @tgmatt, thank you for your comment, did you use some special values for this? could you paste it?

tgmatt commented 2 weeks ago

Of course, this is what we did:

resource "helm_release" "keda" {
  name             = "keda"
  chart            = "keda"
  repository       = "https://kedacore.github.io/charts"
  namespace        = "keda"
  version          = "2.14.2"
  create_namespace = true

  values = [
    file("${path.module}/cluster_trigger_authentication.yml")
  ]

  set {
    name  = "podIdentity.aws.irsa.enabled"
    value = "true"
  }
  set {
    name  = "podIdentity.aws.irsa.roleArn"
    value = module.keda-irsa.iam_role_arn
  }
}

The included values file just includes a definition for the ClusterTriggerAuthentication as it doesn't appear to get created automatically for some reason. I can share that if you want the keda operator to monitor queues instead of workload roles.

rupertgti commented 2 weeks ago

In our case we don't need a specific AWS role, because we don't need authentication between namespaces. It's curious, though, that it works for you with Terraform, while a classic helm deployment fails for us with chart versions above 2.12. Maybe Terraform handles the values for the installed ClusterRole differently, I don't know :(