Open chamindac opened 4 weeks ago
Is this due to a compatibility issue between the AKS Kubernetes version and the KEDA version? According to the documentation here, the KEDA add-on uses KEDA 2.14 on AKS Kubernetes 1.30, and KEDA 2.15 is to be used on AKS Kubernetes 1.31.
So, when we deploy KEDA to AKS without using the AKS add-on for KEDA, should we use the same KEDA version that the add-on uses for the given AKS Kubernetes version?
For now, as a workaround, I am going to stay on KEDA 2.14 until I upgrade my AKS cluster to Kubernetes 1.31, and then retry KEDA 2.15.
Hello, I can't reproduce the issue. I've added a specific e2e test case to cover it, but it passes. This is the trigger configuration: https://github.com/kedacore/keda/blob/fc002f0739b2b7d41160c5192abd6a7fcb1db28c/tests/scalers/azure/azure_event_hub_blob_metadata_wi/azure_event_hub_blob_metadata_wi_test.go#L135-L143
Could you share the blob metadata?
Hi, below is the checkpoint blob metadata.
I've found that the error is wrongly handled and that's why you see ghcr.io/kedacore/keda-test:pr-6096-4776d09c8fd761814c1eb9ba7e964ceace651152.
It's built from main so it's almost v2.15.1. This is the change to improve the info:
I think that with this change we will see extra info about the error
@JorTurFer thank you for the response. I have moved on to the managed KEDA add-on for AKS, so I am currently on AKS with Kubernetes 1.30.3 and KEDA 2.14.
However, I will try to create a test environment, test the fixed version of KEDA, and get back to you.
@JorTurFer I tried deploying with ghcr.io/kedacore/keda-test:pr-6096-4776d09c8fd761814c1eb9ba7e964ceace651152 using keda-2.15.1.yaml (changing the keda operator image as shown below):
image: ghcr.io/kedacore/keda-test:pr-6096-4776d09c8fd761814c1eb9ba7e964ceace651152 # ghcr.io/kedacore/keda:2.15.1 # chaminda
imagePullPolicy: Always
The keda operator goes into CrashLoopBackOff with the following in the logs of the keda-operator pod:
2024/09/02 09:07:25 maxprocs: Updating GOMAXPROCS=1: determined from CPU quota
2024-09-02T09:07:25Z INFO setup Starting manager
2024-09-02T09:07:25Z INFO setup KEDA Version: pr-6096-4776d09c8fd761814c1eb9ba7e964ceace651152
2024-09-02T09:07:25Z INFO setup Git Commit: 4776d09c8fd761814c1eb9ba7e964ceace651152
2024-09-02T09:07:25Z INFO setup Go Version: go1.22.5
2024-09-02T09:07:25Z INFO setup Go OS/Arch: linux/amd64
2024-09-02T09:07:25Z INFO setup Running on Kubernetes 1.30 {"version": "v1.30.3"}
2024-09-02T09:07:26Z INFO controller-runtime.metrics Starting metrics server
2024-09-02T09:07:26Z INFO controller-runtime.metrics Serving metrics server {"bindAddress": ":8080", "secure": false}
2024-09-02T09:07:26Z INFO starting server {"kind": "health probe", "addr": "[::]:8081"}
I0902 09:07:26.032037 1 leaderelection.go:250] attempting to acquire leader lease keda/operator.keda.sh...
I0902 09:07:41.268027 1 leaderelection.go:260] successfully acquired lease keda/operator.keda.sh
2024-09-02T09:07:41Z INFO Starting EventSource {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "source": "kind source: *v1alpha1.ScaledObject"}
2024-09-02T09:07:41Z INFO Starting EventSource {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "source": "kind source: *v2.HorizontalPodAutoscaler"}
2024-09-02T09:07:41Z INFO Starting Controller {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject"}
2024-09-02T09:07:41Z INFO Starting EventSource {"controller": "triggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "TriggerAuthentication", "source": "kind source: *v1alpha1.TriggerAuthentication"}
2024-09-02T09:07:41Z INFO Starting Controller {"controller": "triggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "TriggerAuthentication"}
2024-09-02T09:07:41Z INFO Starting EventSource {"controller": "scaledjob", "controllerGroup": "keda.sh", "controllerKind": "ScaledJob", "source": "kind source: *v1alpha1.ScaledJob"}
2024-09-02T09:07:41Z INFO Starting Controller {"controller": "scaledjob", "controllerGroup": "keda.sh", "controllerKind": "ScaledJob"}
2024-09-02T09:07:41Z INFO Starting EventSource {"controller": "cloudeventsource", "controllerGroup": "eventing.keda.sh", "controllerKind": "CloudEventSource", "source": "kind source: *v1alpha1.CloudEventSource"}
2024-09-02T09:07:41Z INFO Starting Controller {"controller": "cloudeventsource", "controllerGroup": "eventing.keda.sh", "controllerKind": "CloudEventSource"}
2024-09-02T09:07:41Z INFO Starting EventSource {"controller": "clustertriggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "ClusterTriggerAuthentication", "source": "kind source: *v1alpha1.ClusterTriggerAuthentication"}
2024-09-02T09:07:41Z INFO Starting Controller {"controller": "clustertriggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "ClusterTriggerAuthentication"}
2024-09-02T09:07:41Z INFO Starting EventSource {"controller": "clustercloudeventsource", "controllerGroup": "eventing.keda.sh", "controllerKind": "ClusterCloudEventSource", "source": "kind source: *v1alpha1.ClusterCloudEventSource"}
2024-09-02T09:07:41Z INFO Starting Controller {"controller": "clustercloudeventsource", "controllerGroup": "eventing.keda.sh", "controllerKind": "ClusterCloudEventSource"}
2024-09-02T09:07:41Z INFO Starting EventSource {"controller": "cert-rotator", "source": "kind source: *v1.Secret"}
2024-09-02T09:07:41Z INFO Starting EventSource {"controller": "cert-rotator", "source": "kind source: *unstructured.Unstructured"}
2024-09-02T09:07:41Z INFO Starting EventSource {"controller": "cert-rotator", "source": "kind source: *unstructured.Unstructured"}
2024-09-02T09:07:41Z INFO Starting Controller {"controller": "cert-rotator"}
2024-09-02T09:07:41Z INFO cert-rotation starting cert rotator controller
2024-09-02T09:07:41Z ERROR controller-runtime.source.EventHandler if kind is a CRD, it should be installed before calling Start {"kind": "ClusterCloudEventSource.eventing.keda.sh", "error": "no matches for kind \"ClusterCloudEventSource\" in version \"eventing.keda.sh/v1alpha1\""}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:63
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func1
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:53
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:54
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:56
2024-09-02T09:07:41Z INFO cert-rotation no cert refresh needed
2024-09-02T09:07:41Z INFO cert-rotation certs are ready in /certs
2024-09-02T09:07:41Z INFO Starting workers {"controller": "cert-rotator", "worker count": 1}
2024-09-02T09:07:41Z INFO cert-rotation no cert refresh needed
2024-09-02T09:07:41Z INFO cert-rotation Ensuring CA cert {"name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration", "name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration"}
2024-09-02T09:07:41Z INFO cert-rotation Ensuring CA cert {"name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService", "name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService"}
2024-09-02T09:07:41Z INFO cert-rotation no cert refresh needed
2024-09-02T09:07:41Z INFO cert-rotation Ensuring CA cert {"name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration", "name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration"}
2024-09-02T09:07:41Z INFO cert-rotation Ensuring CA cert {"name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService", "name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService"}
2024-09-02T09:07:41Z INFO Starting workers {"controller": "triggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "TriggerAuthentication", "worker count": 1}
2024-09-02T09:07:41Z INFO Starting workers {"controller": "scaledjob", "controllerGroup": "keda.sh", "controllerKind": "ScaledJob", "worker count": 1}
2024-09-02T09:07:41Z INFO Starting workers {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "worker count": 5}
2024-09-02T09:07:41Z INFO Starting workers {"controller": "cloudeventsource", "controllerGroup": "eventing.keda.sh", "controllerKind": "CloudEventSource", "worker count": 1}
2024-09-02T09:07:41Z INFO Starting workers {"controller": "clustertriggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "ClusterTriggerAuthentication", "worker count": 1}
2024-09-02T09:07:42Z INFO cert-rotation CA certs are injected to webhooks
2024-09-02T09:07:42Z INFO grpc_server Starting Metrics Service gRPC Server {"address": ":9666"}
2024-09-02T09:07:51Z ERROR controller-runtime.source.EventHandler if kind is a CRD, it should be installed before calling Start {"kind": "ClusterCloudEventSource.eventing.keda.sh", "error": "no matches for kind \"ClusterCloudEventSource\" in version \"eventing.keda.sh/v1alpha1\""}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:63
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:87
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:88
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:56
2024-09-02T09:08:01Z ERROR controller-runtime.source.EventHandler if kind is a CRD, it should be installed before calling Start {"kind": "ClusterCloudEventSource.eventing.keda.sh", "error": "no matches for kind \"ClusterCloudEventSource\" in version \"eventing.keda.sh/v1alpha1\""}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:63
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:87
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:88
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:56
2024-09-02T09:08:11Z ERROR controller-runtime.source.EventHandler if kind is a CRD, it should be installed before calling Start {"kind": "ClusterCloudEventSource.eventing.keda.sh", "error": "no matches for kind \"ClusterCloudEventSource\" in version \"eventing.keda.sh/v1alpha1\""}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:63
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:87
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:88
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:56
2024-09-02T09:08:21Z ERROR controller-runtime.source.EventHandler if kind is a CRD, it should be installed before calling Start {"kind": "ClusterCloudEventSource.eventing.keda.sh", "error": "no matches for kind \"ClusterCloudEventSource\" in version \"eventing.keda.sh/v1alpha1\""}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:63
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:87
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:88
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:56
2024-09-02T09:08:31Z ERROR controller-runtime.source.EventHandler if kind is a CRD, it should be installed before calling Start {"kind": "ClusterCloudEventSource.eventing.keda.sh", "error": "no matches for kind \"ClusterCloudEventSource\" in version \"eventing.keda.sh/v1alpha1\""}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:63
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:87
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:88
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:56
2024-09-02T09:08:41Z ERROR controller-runtime.source.EventHandler if kind is a CRD, it should be installed before calling Start {"kind": "ClusterCloudEventSource.eventing.keda.sh", "error": "no matches for kind \"ClusterCloudEventSource\" in version \"eventing.keda.sh/v1alpha1\""}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:63
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:87
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:88
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:56
2024-09-02T09:08:51Z ERROR controller-runtime.source.EventHandler if kind is a CRD, it should be installed before calling Start {"kind": "ClusterCloudEventSource.eventing.keda.sh", "error": "no matches for kind \"ClusterCloudEventSource\" in version \"eventing.keda.sh/v1alpha1\""}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:63
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:87
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:88
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:56
2024-09-02T09:09:01Z ERROR controller-runtime.source.EventHandler if kind is a CRD, it should be installed before calling Start {"kind": "ClusterCloudEventSource.eventing.keda.sh", "error": "no matches for kind \"ClusterCloudEventSource\" in version \"eventing.keda.sh/v1alpha1\""}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:63
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:87
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:88
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:56
2024-09-02T09:09:11Z ERROR controller-runtime.source.EventHandler if kind is a CRD, it should be installed before calling Start {"kind": "ClusterCloudEventSource.eventing.keda.sh", "error": "no matches for kind \"ClusterCloudEventSource\" in version \"eventing.keda.sh/v1alpha1\""}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:63
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:87
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:88
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:56
2024-09-02T09:09:21Z ERROR controller-runtime.source.EventHandler if kind is a CRD, it should be installed before calling Start {"kind": "ClusterCloudEventSource.eventing.keda.sh", "error": "no matches for kind \"ClusterCloudEventSource\" in version \"eventing.keda.sh/v1alpha1\""}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:63
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:87
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:88
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:56
2024-09-02T09:09:31Z ERROR controller-runtime.source.EventHandler if kind is a CRD, it should be installed before calling Start {"kind": "ClusterCloudEventSource.eventing.keda.sh", "error": "no matches for kind \"ClusterCloudEventSource\" in version \"eventing.keda.sh/v1alpha1\""}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:63
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:87
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:88
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:56
2024-09-02T09:09:41Z ERROR Could not wait for Cache to sync {"controller": "clustercloudeventsource", "controllerGroup": "eventing.keda.sh", "controllerKind": "ClusterCloudEventSource", "error": "failed to wait for clustercloudeventsource caches to sync: timed out waiting for cache to be synced for Kind *v1alpha1.ClusterCloudEventSource"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:203
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:208
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234
sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/manager/runnable_group.go:223
2024-09-02T09:09:41Z INFO Stopping and waiting for non leader election runnables
2024-09-02T09:09:41Z INFO Stopping and waiting for leader election runnables
2024-09-02T09:09:41Z INFO Shutdown signal received, waiting for all workers to finish {"controller": "clustertriggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "ClusterTriggerAuthentication"}
2024-09-02T09:09:41Z INFO Shutdown signal received, waiting for all workers to finish {"controller": "cloudeventsource", "controllerGroup": "eventing.keda.sh", "controllerKind": "CloudEventSource"}
2024-09-02T09:09:41Z INFO Shutdown signal received, waiting for all workers to finish {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject"}
2024-09-02T09:09:41Z INFO Shutdown signal received, waiting for all workers to finish {"controller": "scaledjob", "controllerGroup": "keda.sh", "controllerKind": "ScaledJob"}
2024-09-02T09:09:41Z INFO Shutdown signal received, waiting for all workers to finish {"controller": "triggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "TriggerAuthentication"}
2024-09-02T09:09:41Z INFO Shutdown signal received, waiting for all workers to finish {"controller": "cert-rotator"}
2024-09-02T09:09:41Z INFO cert-rotation stopping cert rotator controller
W0902 09:09:41.269909 1 reflector.go:462] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: watch of admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
W0902 09:09:41.269969 1 reflector.go:462] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: watch of apiregistration.k8s.io/v1, Kind=APIService ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
W0902 09:09:41.270024 1 reflector.go:462] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: watch of *v1.Secret ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
2024-09-02T09:09:41Z INFO All workers finished {"controller": "cert-rotator"}
2024-09-02T09:09:41Z INFO All workers finished {"controller": "clustertriggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "ClusterTriggerAuthentication"}
2024-09-02T09:09:41Z INFO All workers finished {"controller": "cloudeventsource", "controllerGroup": "eventing.keda.sh", "controllerKind": "CloudEventSource"}
2024-09-02T09:09:41Z INFO All workers finished {"controller": "scaledjob", "controllerGroup": "keda.sh", "controllerKind": "ScaledJob"}
2024-09-02T09:09:41Z INFO All workers finished {"controller": "triggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "TriggerAuthentication"}
2024-09-02T09:09:41Z INFO All workers finished {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject"}
2024-09-02T09:09:41Z INFO Stopping and waiting for caches
W0902 09:09:41.270185 1 reflector.go:462] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: watch of *v1alpha1.ClusterTriggerAuthentication ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
W0902 09:09:41.270224 1 reflector.go:462] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: watch of *v1alpha1.CloudEventSource ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
W0902 09:09:41.270262 1 reflector.go:462] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: watch of *v1alpha1.ScaledJob ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
W0902 09:09:41.270297 1 reflector.go:462] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: watch of *v1alpha1.ScaledObject ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
W0902 09:09:41.270339 1 reflector.go:462] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: watch of *v1alpha1.TriggerAuthentication ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
W0902 09:09:41.270400 1 reflector.go:462] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: watch of *v2.HorizontalPodAutoscaler ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
2024-09-02T09:09:41Z INFO Stopping and waiting for webhooks
2024-09-02T09:09:41Z INFO Stopping and waiting for HTTP servers
2024-09-02T09:09:41Z INFO shutting down server {"kind": "health probe", "addr": "[::]:8081"}
2024-09-02T09:09:41Z INFO controller-runtime.metrics Shutting down metrics server with timeout of 1 minute
2024-09-02T09:09:41Z INFO Wait completed, proceeding to shutdown the manager
2024-09-02T09:09:41Z ERROR setup problem running manager {"error": "failed to wait for clustercloudeventsource caches to sync: timed out waiting for cache to be synced for Kind *v1alpha1.ClusterCloudEventSource"}
main.main
/workspace/cmd/operator/main.go:329
runtime.main
/usr/local/go/src/runtime/proc.go:271
Oh, sorry, we introduced a new CRD (that'll be shipped with v2.16). This is the CRD that you need to deploy into the cluster too -> https://github.com/kedacore/keda/blob/main/config/crd/bases/eventing.keda.sh_clustercloudeventsources.yaml It's for the CloudEvent integration, so it probably doesn't matter in your case xD
@JorTurFer with the CRD deployed, the keda operator now seems to need some additional permissions.
"system:serviceaccount:keda:keda-operator" is the service account I am using for enabling workload identity. With this CRD, does the workload identity require any additional permissions on Azure resources or on the AKS cluster?
2024-09-03T09:51:19Z INFO cert-rotation Ensuring CA cert {"name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration", "name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration"}
2024-09-03T09:51:19Z INFO cert-rotation Ensuring CA cert {"name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService", "name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService"}
2024-09-03T09:51:19Z INFO cert-rotation no cert refresh needed
2024-09-03T09:51:19Z INFO cert-rotation Ensuring CA cert {"name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration", "name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration"}
2024-09-03T09:51:19Z INFO cert-rotation Ensuring CA cert {"name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService", "name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService"}
2024-09-03T09:51:20Z INFO cert-rotation CA certs are injected to webhooks
2024-09-03T09:51:20Z INFO grpc_server Starting Metrics Service gRPC Server {"address": ":9666"}
W0903 09:51:20.604804 1 reflector.go:539] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: failed to list *v1alpha1.ClusterCloudEventSource: clustercloudeventsources.eventing.keda.sh is forbidden: User "system:serviceaccount:keda:keda-operator" cannot list resource "clustercloudeventsources" in API group "eventing.keda.sh" at the cluster scope
E0903 09:51:20.604845 1 reflector.go:147] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: Failed to watch *v1alpha1.ClusterCloudEventSource: failed to list *v1alpha1.ClusterCloudEventSource: clustercloudeventsources.eventing.keda.sh is forbidden: User "system:serviceaccount:keda:keda-operator" cannot list resource "clustercloudeventsources" in API group "eventing.keda.sh" at the cluster scope
W0903 09:51:22.747786 1 reflector.go:539] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: failed to list *v1alpha1.ClusterCloudEventSource: clustercloudeventsources.eventing.keda.sh is forbidden: User "system:serviceaccount:keda:keda-operator" cannot list resource "clustercloudeventsources" in API group "eventing.keda.sh" at the cluster scope
E0903 09:51:22.747830 1 reflector.go:147] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: Failed to watch *v1alpha1.ClusterCloudEventSource: failed to list *v1alpha1.ClusterCloudEventSource: clustercloudeventsources.eventing.keda.sh is forbidden: User "system:serviceaccount:keda:keda-operator" cannot list resource "clustercloudeventsources" in API group "eventing.keda.sh" at the cluster scope
W0903 09:51:26.531065 1 reflector.go:539] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: failed to list *v1alpha1.ClusterCloudEventSource: clustercloudeventsources.eventing.keda.sh is forbidden: User "system:serviceaccount:keda:keda-operator" cannot list resource "clustercloudeventsources" in API group "eventing.keda.sh" at the cluster scope
E0903 09:51:26.531111 1 reflector.go:147] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:105: Failed to watch *v1alpha1.ClusterCloudEventSource: failed to list *v1alpha1.ClusterCloudEventSource: clustercloudeventsources.eventing.keda.sh is forbidden: User "system:serviceaccount:keda:keda-operator" cannot list resource "clustercloudeventsources" in API group "eventing.keda.sh" at the cluster scope
Yes, I forgot it, sorry. These permissions have to be added to KEDA's ClusterRole, as it needs to read the CRD.
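The "forbidden" messages in the log above say that "system:serviceaccount:keda:keda-operator" cannot list "clustercloudeventsources" in the "eventing.keda.sh" API group at cluster scope, which points to a missing Kubernetes RBAC rule rather than a missing Azure permission. A minimal sketch of such a rule, assuming it is appended to the `rules:` list of the ClusterRole that keda-2.15.1.yaml binds to the keda-operator service account (the verb list is an assumption; the role that ships with v2.16 may grant more):

```yaml
# Hypothetical extra entry for the operator ClusterRole's "rules:" list.
# The informer that fails in the log needs "list" and "watch";
# "get" is added for direct reads of single objects.
- apiGroups:
    - eventing.keda.sh
  resources:
    - clustercloudeventsources
    - clustercloudeventsources/status
  verbs:
    - get
    - list
    - watch
```

Read-only verbs should be enough to let the cache sync when no ClusterCloudEventSource objects exist. If such objects are actually created, the controller would presumably also need `update` and `patch` on the status subresource.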
@JorTurFer With the changes you mentioned above, I managed to run the keda operator with your tag ghcr.io/kedacore/keda-test:pr-6096-4776d09c8fd761814c1eb9ba7e964ceace651152 using keda-2.15.1.yaml.
The issue seems to be with 2.15.1:
the keda-operator and the event hub trigger are looking for a non-existent checkpoint blob.
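For orientation, the blob URLs in the events below follow the layout `<container>/<namespace>.servicebus.windows.net/<event hub>/<consumer group>/checkpoint/<partition id>`, which appears to match what the azure-eventhub scaler reads when `checkpointStrategy` is `blobMetadata`. A trigger roughly like the following would lead to those lookups. The metadata values are read off the URL in the first event; the threshold and the TriggerAuthentication name are placeholders, and the actual ScaledJob spec is not shown in this thread:

```yaml
# Sketch only: reconstructed from the failing GET URL, not the real spec.
triggers:
  - type: azure-eventhub
    metadata:
      storageAccountName: myehnstoragename
      blobContainer: largepreviewgenerator-largepreviewrequired
      eventHubNamespace: ch-eh-dev-euw-001-2-green
      eventHubName: largepreviewrequired
      consumerGroup: largepreviewgenerator
      checkpointStrategy: blobMetadata
      unprocessedEventThreshold: "64"   # placeholder value
    authenticationRef:
      name: workload-identity-auth      # hypothetical TriggerAuthentication
```

With this layout the scaler requests one checkpoint blob per partition ID, so a partition that has never been checkpointed has no blob to return.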
For example, here are the current checkpoint blobs of my two scaled jobs:
largepreview-scaledjob
There is no checkpoint/7 blob, but the scaled job trigger and the keda operator are looking for such a blob.
As per the scaled job log, it is looking for checkpoint blob 7:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal KEDAScalersStarted 25m (x4 over 25m) scale-handler Scaler azure-eventhub is built.
Normal KEDAScalersStarted 25m scale-handler Started scalers watch
Normal ScaledJobReady 25m keda-operator ScaledJob is ready for scaling
Warning KEDAScalerFailed 25m (x2 over 25m) scale-handler unable to get runtimeInfo for metrics: context canceled
Warning KEDAScalerFailed 19m scale-handler unable to get unprocessedEventCount for metrics: unable to get checkpoint from storage: GET https://myehnstoragename.blob.core.windows.net/largepreviewgenerator-largepreviewrequired/ch-eh-dev-euw-001-2-green.servicebus.windows.net/largepreviewrequired/largepreviewgenerator/checkpoint/7
--------------------------------------------------------------------------------
RESPONSE 404: 404 The specified blob does not exist.
ERROR CODE: BlobNotFound
--------------------------------------------------------------------------------
<?xml version="1.0" encoding="utf-8"?><Error><Code>BlobNotFound</Code><Message>The specified blob does not exist.
RequestId:56c49771-901e-0026-5c78-ff2f40000000
Time:2024-09-05T09:44:10.8104812Z</Message></Error>
--------------------------------------------------------------------------------
Warning KEDAScalerFailed 19m scale-handler unable to get unprocessedEventCount for metrics: unable to get checkpoint from storage: GET https://myehnstoragename.blob.core.windows.net/largepreviewgenerator-largepreviewrequired/ch-eh-dev-euw-001-2-green.servicebus.windows.net/largepreviewrequired/largepreviewgenerator/checkpoint/7
largevideo-scaledjob
There is no checkpoint/0 blob, but the scaled job trigger and the KEDA operator are looking for such a blob.
According to the scaled job events, it is looking for checkpoint blob 0:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal KEDAScalersStarted 34m (x4 over 34m) scale-handler Scaler azure-eventhub is built.
Normal KEDAScalersStarted 34m scale-handler Started scalers watch
Normal ScaledJobReady 34m keda-operator ScaledJob is ready for scaling
Warning KEDAScalerFailed 34m (x2 over 34m) scale-handler unable to get runtimeInfo for metrics: context canceled
Warning KEDAScalerFailed 27m scale-handler unable to get unprocessedEventCount for metrics: unable to get checkpoint from storage: GET https://myehnstoragename.blob.core.windows.net/largevideogenerator-largevideogenerationrequired/ch-eh-dev-euw-001-2-green.servicebus.windows.net/largevideogenerationrequired/largevideogenerator/checkpoint/0
--------------------------------------------------------------------------------
RESPONSE 404: 404 The specified blob does not exist.
ERROR CODE: BlobNotFound
--------------------------------------------------------------------------------
<?xml version="1.0" encoding="utf-8"?><Error><Code>BlobNotFound</Code><Message>The specified blob does not exist.
RequestId:36a080bb-701e-0063-5178-fffaa3000000
Time:2024-09-05T09:44:47.6370308Z</Message></Error>
--------------------------------------------------------------------------------
Warning KEDAScalerFailed 26m scale-handler unable to get unprocessedEventCount for metrics: unable to get checkpoint from storage: GET https://myehnstoragename.blob.core.windows.net/largevideogenerator-largevideogenerationrequired/ch-eh-dev-euw-001-2-green.servicebus.windows.net/largevideogenerationrequired/largevideogenerator/checkpoint/0
--------------------------------------------------------------------------------
RESPONSE 404: 404 The specified blob does not exist.
ERROR CODE: BlobNotFound
Both scaled jobs show the same symptoms, only with 2.15.1: they fail while looking for a non-existent checkpoint blob name. The KEDA operator (with tag ghcr.io/kedacore/keda-test:pr-6096-4776d09c8fd761814c1eb9ba7e964ceace651152) shows the logs below for the scaled jobs, again looking for non-existent blobs:
2024-09-05T09:50:22Z ERROR scale_handler Error getting scaler metrics and activity, but continue {"scaledJob.Name": "largevideo-scaledjob", "Scaler": "*scalers.azureEventHubScaler:", "error": "unable to get unprocessedEventCount for metrics: unable to get checkpoint from storage: GET https://myehnstoragename.blob.core.windows.net/largevideogenerator-largevideogenerationrequired/ch-eh-dev-euw-001-2-green.servicebus.windows.net/largevideogenerationrequired/largevideogenerator/checkpoint/0\n--------------------------------------------------------------------------------\nRESPONSE 404: 404 The specified blob does not exist.\nERROR CODE: BlobNotFound\n--------------------------------------------------------------------------------\n<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>BlobNotFound</Code><Message>The specified blob does not exist.\nRequestId:dfd7075d-101e-0007-1779-ff0b3b000000\nTime:2024-09-05T09:50:22.6388386Z</Message></Error>\n--------------------------------------------------------------------------------\n"}
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).getScaledJobMetrics
/workspace/pkg/scaling/scale_handler.go:853
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).isScaledJobActive
/workspace/pkg/scaling/scale_handler.go:897
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).checkScalers
/workspace/pkg/scaling/scale_handler.go:262
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).startScaleLoop
/workspace/pkg/scaling/scale_handler.go:182
2024-09-05T09:50:22Z INFO scaleexecutor Scaling Jobs {"scaledJob.Name": "largevideo-scaledjob", "scaledJob.Namespace": "mynamespace", "Number of running Jobs": 0}
2024-09-05T09:50:22Z INFO scaleexecutor Scaling Jobs {"scaledJob.Name": "largevideo-scaledjob", "scaledJob.Namespace": "mynamespace", "Number of pending Jobs": 0}
2024-09-05T09:50:25Z ERROR scale_handler Error getting scaler metrics and activity, but continue {"scaledJob.Name": "largepreview-scaledjob", "Scaler": "*scalers.azureEventHubScaler:", "error": "unable to get unprocessedEventCount for metrics: unable to get checkpoint from storage: GET https://myehnstoragename.blob.core.windows.net/largepreviewgenerator-largepreviewrequired/ch-eh-dev-euw-001-2-green.servicebus.windows.net/largepreviewrequired/largepreviewgenerator/checkpoint/7\n--------------------------------------------------------------------------------\nRESPONSE 404: 404 The specified blob does not exist.\nERROR CODE: BlobNotFound\n--------------------------------------------------------------------------------\n<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>BlobNotFound</Code><Message>The specified blob does not exist.\nRequestId:7f71945f-f01e-0052-6279-ff1bb0000000\nTime:2024-09-05T09:50:25.6750071Z</Message></Error>\n--------------------------------------------------------------------------------\n"}
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).getScaledJobMetrics
/workspace/pkg/scaling/scale_handler.go:853
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).isScaledJobActive
/workspace/pkg/scaling/scale_handler.go:897
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).checkScalers
/workspace/pkg/scaling/scale_handler.go:262
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).startScaleLoop
/workspace/pkg/scaling/scale_handler.go:182
2024-09-05T09:50:25Z INFO scaleexecutor Scaling Jobs {"scaledJob.Name": "largepreview-scaledjob", "scaledJob.Namespace": "mynamespace", "Number of running Jobs": 0}
2024-09-05T09:50:25Z INFO scaleexecutor Scaling Jobs {"scaledJob.Name": "largepreview-scaledjob", "scaledJob.Namespace": "mynamespace", "Number of pending Jobs": 0}
2024-09-05T09:50:27Z ERROR scale_handler Error getting scaler metrics and activity, but continue {"scaledJob.Name": "largevideo-scaledjob", "Scaler": "*scalers.azureEventHubScaler:", "error": "unable to get unprocessedEventCount for metrics: unable to get checkpoint from storage: GET https://myehnstoragename.blob.core.windows.net/largevideogenerator-largevideogenerationrequired/ch-eh-dev-euw-001-2-green.servicebus.windows.net/largevideogenerationrequired/largevideogenerator/checkpoint/0\n--------------------------------------------------------------------------------\nRESPONSE 404: 404 The specified blob does not exist.\nERROR CODE: BlobNotFound\n--------------------------------------------------------------------------------\n<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>BlobNotFound</Code><Message>The specified blob does not exist.\nRequestId:ba2246b8-801e-0005-6179-ffb583000000\nTime:2024-09-05T09:50:27.6339979Z</Message></Error>\n--------------------------------------------------------------------------------\n"}
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).getScaledJobMetrics
/workspace/pkg/scaling/scale_handler.go:853
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).isScaledJobActive
/workspace/pkg/scaling/scale_handler.go:897
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).checkScalers
/workspace/pkg/scaling/scale_handler.go:262
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).startScaleLoop
/workspace/pkg/scaling/scale_handler.go:182
When I deploy KEDA with keda-2.14.1.yaml (or the 2.14.2 or 2.14.3 Helm chart) and configure everything else (triggers, scaled job settings) exactly the same way, there are no such log entries about missing checkpoint blobs. My scaled jobs work as expected, without any issues, with the exact same setup on KEDA 2.14.x.
The issue above only happens with 2.15.1.
I suspect KEDA 2.15.1 is not refreshing the storage checkpoint blob list correctly before checking the checkpoint blob metadata, whereas KEDA 2.14.x does not have this problem.
We introduced a bug when we upgraded the SDK, but I think that this PR will solve the issue -> https://github.com/kedacore/keda/pull/6096
Are you willing to test the fix? This is the tag with the fix -> ghcr.io/kedacore/keda-test:pr-6096-9b8be4868a27c304646cf8cb0735357eb272bd38
@JorTurFer PR #6096 seems to have fixed the issue. I deployed ghcr.io/kedacore/keda-test:pr-6096-9b8be4868a27c304646cf8cb0735357eb272bd38 to my environment with keda-2.15.1.yaml, and for the last 24 hours the Event Hub scaler has worked as expected.
Will you be releasing a fixed version for 2.15.x, or is this issue going to be fixed only in 2.16.x? I just want to know whether we will have to skip 2.15.x (it is impossible to use with this issue) and wait for 2.16.x.
I have KEDA v2.15.1 deployed on AKS using workload identity. The AKS Kubernetes version is 1.29.7. My scaled job triggers based on Azure Event Hub. The KEDA operator shows the error "unable to get unprocessedEventCount for metrics: unable to get checkpoint from storage: %!w()"
The setup was working fine with KEDA v2.14.2 on AKS using workload identity, with the same AKS Kubernetes version, 1.29.7.
The scaled job shows the issues below
The KEDA operator pod log shows the following
If I deploy KEDA v2.14.2 or v2.14.3 on top of v2.15.1 without changing anything else in my setup, everything starts to work fine, and the status of my scaled job returns to normal, as the log below shows.
Below is more information on my setup.
I deployed KEDA using the following
KEDA trigger auth setup
My scaled job triggers
I can provide more information and logs if required.
In summary, this is what happens