Closed bpotaczek closed 3 years ago
Thanks for reporting. By any chance, are you able to post the stack trace here?
I didn't see a stacktrace. The logs just end right there when the pod restarts. If you have steps to get one though I'd be happy to.
Does kubectl logs <POD_NAME> --previous work for you?
I got the same thing (in a slightly different order)
I0823 19:11:17.192874 1 request.go:655] Throttling request took 1.03937764s, request: GET:https://10.100.0.1:443/apis/ratelimit.solo.io/v1alpha1?timeout=32s
2021-08-23T19:11:20.346Z INFO controller-runtime.metrics metrics server is starting to listen {"addr": ":8080"}
2021-08-23T19:11:20.348Z INFO controllers.ScaledObject Running on Kubernetes 1.20+ {"version": "v1.20.7-eks-d88609"}
2021-08-23T19:11:20.349Z INFO setup Starting manager
2021-08-23T19:11:20.349Z INFO setup KEDA Version: 2.4.0
2021-08-23T19:11:20.349Z INFO setup Git Commit:
2021-08-23T19:11:20.349Z INFO setup Go Version: go1.15.13
2021-08-23T19:11:20.349Z INFO setup Go OS/Arch: linux/amd64
I0823 19:11:20.349267 1 leaderelection.go:243] attempting to acquire leader lease keda/operator.keda.sh...
2021-08-23T19:11:20.349Z INFO controller-runtime.manager starting metrics server {"path": "/metrics"}
I0823 19:11:37.765990 1 leaderelection.go:253] successfully acquired lease keda/operator.keda.sh
2021-08-23T19:11:37.766Z INFO controller Starting EventSource {"reconcilerGroup": "keda.sh", "reconcilerKind": "ClusterTriggerAuthentication", "controller": "clustertriggerauthentication", "source": "kind source: /, Kind="}
2021-08-23T19:11:37.766Z DEBUG controller-runtime.manager.events Normal {"object": {"kind":"ConfigMap","namespace":"keda","name":"operator.keda.sh","uid":"1798d9da-8a44-4108-9233-02ad5ed62121","apiVersion":"v1","resourceVersion":"120418249"}, "reason": "LeaderElection", "message": "keda-operator-846b56df59-s257h_6c93590d-0422-4bea-b919-0cdddc268e18 became leader"}
2021-08-23T19:11:37.766Z INFO controller Starting EventSource {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject", "source": "kind source: /, Kind="}
2021-08-23T19:11:37.766Z INFO controller Starting EventSource {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledJob", "controller": "scaledjob", "source": "kind source: /, Kind="}
2021-08-23T19:11:37.766Z INFO controller Starting EventSource {"reconcilerGroup": "keda.sh", "reconcilerKind": "TriggerAuthentication", "controller": "triggerauthentication", "source": "kind source: /, Kind="}
2021-08-23T19:11:37.866Z INFO controller Starting Controller {"reconcilerGroup": "keda.sh", "reconcilerKind": "ClusterTriggerAuthentication", "controller": "clustertriggerauthentication"}
2021-08-23T19:11:37.866Z INFO controller Starting workers {"reconcilerGroup": "keda.sh", "reconcilerKind": "ClusterTriggerAuthentication", "controller": "clustertriggerauthentication", "worker count": 1}
2021-08-23T19:11:37.866Z INFO controller Starting EventSource {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject", "source": "kind source: /, Kind="}
2021-08-23T19:11:37.867Z INFO controller Starting Controller {"reconcilerGroup": "keda.sh", "reconcilerKind": "TriggerAuthentication", "controller": "triggerauthentication"}
2021-08-23T19:11:37.867Z INFO controller Starting Controller {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledJob", "controller": "scaledjob"}
2021-08-23T19:11:37.967Z INFO controller Starting Controller {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject"}
2021-08-23T19:11:37.967Z INFO controller Starting workers {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject", "worker count": 1}
2021-08-23T19:11:37.967Z INFO controllers.ScaledObject Reconciling ScaledObject {"ScaledObject.Namespace": "echo", "ScaledObject.Name": "kafka-kbxtechogrpc-sample"}
2021-08-23T19:11:37.967Z DEBUG controllers.ScaledObject Parsed Group, Version, Kind, Resource {"ScaledObject.Namespace": "echo", "ScaledObject.Name": "kafka-kbxtechogrpc-sample", "GVK": "apps/v1.Deployment", "Resource": "deployments"}
2021-08-23T19:11:37.967Z INFO controllers.ScaledObject Creating a new HPA {"ScaledObject.Namespace": "echo", "ScaledObject.Name": "kafka-kbxtechogrpc-sample", "HPA.Namespace": "echo", "HPA.Name": "keda-hpa-kafka-kbxtechogrpc-sample"}
2021-08-23T19:11:37.967Z INFO controller Starting workers {"reconcilerGroup": "keda.sh", "reconcilerKind": "TriggerAuthentication", "controller": "triggerauthentication", "worker count": 1}
2021-08-23T19:11:37.967Z INFO controller Starting workers {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledJob", "controller": "scaledjob", "worker count": 1}
2021-08-23T19:11:37.967Z DEBUG controller Successfully Reconciled {"reconcilerGroup": "keda.sh", "reconcilerKind": "TriggerAuthentication", "controller": "triggerauthentication", "name": "keda-trigger-auth-kafka-credential", "namespace": "echo"}
2021-08-23T19:11:37.967Z DEBUG controller-runtime.manager.events Normal {"object": {"kind":"TriggerAuthentication","namespace":"echo","name":"keda-trigger-auth-kafka-credential","uid":"32581dc7-4757-4aa5-b752-52da6ebfb170","apiVersion":"keda.sh/v1alpha1","resourceVersion":"120415389"}, "reason": "TriggerAuthenticationAdded", "message": "New TriggerAuthentication configured"}
Hmm, are you able to build KEDA locally? You can follow this guide, but skip step 1 because you already have KEDA deployed: https://github.com/kedacore/keda/blob/main/BUILD.md#custom-keda-locally-outside-cluster
Basically, you scale the KEDA Operator deployment in the cluster down to 0 and then run KEDA locally from your laptop as a standard Go program.
kubectl scale deployment/keda-operator --replicas=0 -n keda
git clone https://github.com/kedacore/keda.git -b v2.4.0
cd keda
make run ARGS="--zap-log-level=debug"
This will start the operator and you should be able to see the full log on your laptop. Your kubeconfig should point to the cluster.
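Once local debugging is done, the in-cluster operator has to be scaled back up, since the steps above leave it at 0 replicas. A minimal sketch, assuming the Deployment originally ran with a single replica in the keda namespace; the function name is only illustrative:

```shell
# Scale the in-cluster KEDA operator back up after running it locally.
# Assumption: the Deployment is called keda-operator, lives in the "keda"
# namespace, and originally ran with 1 replica (pass another count if not).
restore_keda_operator() {
  kubectl scale deployment/keda-operator --replicas="${1:-1}" -n keda
}
```

Stop the local `make run` process first, then call `restore_keda_operator` so that only one operator instance is active against the cluster.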
Thanks
I was able to run it locally, and I can see in the logs that it's connecting to Kafka and receiving data, but it doesn't update the HPA for some reason, so I get <unknown>/5.
But at least it's running.
{"level":"debug","ts":1630090024.798146,"logger":"kafka_scaler","msg":"Group connect-kbxt-dl-sink-integrations-edi990request-raw-s3 has a lag of 14 for topic integrations.edirouting.cdc.tenderack.v1 and partition 0\n"}
When I try to run it on my cluster it just crashes at that same spot, no error log.
If I run KEDA without any ScaledObjects on the cluster, it runs and waits like it should. It's only after I create the Secret/TriggerAuth/ScaledObject above that it crashes. Still no additional information is being logged.
> I was able to run it locally and I see in the logs it's connecting to kafka and I'm receiving data but it doesn't update the HPA for some reason so I get <unknown>/5 but at least it's running.
Is the KEDA Metrics Adapter running correctly (in the cluster)?
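One way to answer that question is to check the external metrics API that the HPA reads from. This is a rough sketch, not an official KEDA procedure; it assumes KEDA is installed in the keda namespace and registers the standard v1beta1.external.metrics.k8s.io APIService, and the function name is made up for illustration:

```shell
# Quick health check for the KEDA Metrics Adapter.
# Assumption: KEDA runs in the "keda" namespace (pass another one as $1).
check_keda_metrics_adapter() {
  # The APIService should be listed as available
  kubectl get apiservice v1beta1.external.metrics.k8s.io
  # Raw discovery call against the external metrics API; an error here
  # suggests the HPA cannot fetch metrics and will show <unknown>
  kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
  # The metrics adapter pod should be Running alongside the operator
  kubectl get pods -n "${1:-keda}"
}
```

If the APIService is not available or the raw call fails, the <unknown> target on the HPA is more likely an adapter problem than a scaler problem.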
It looks like the pod was being OOM killed. Apparently the default was not enough. It was using about 500MB so I increased the limit to 1Gi and it ran fine. Thanks for the help.
That explains the missing stacktrace, thanks for letting us know.
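For anyone hitting the same silent restart, the OOM kill can be confirmed from the pod status, and the limit raised on the Deployment. A minimal sketch under the assumption that the operator is the first container in the pod and that the Deployment is named keda-operator in the keda namespace; both function names are illustrative:

```shell
# Print why the first container of a pod was last terminated.
# For a pod killed for exceeding its memory limit this is "OOMKilled".
last_termination_reason() {
  kubectl get pod "$1" -n "${2:-keda}" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
}

# Raise the memory limit on the operator Deployment (default 1Gi, the
# value that worked in this thread).
raise_operator_memory_limit() {
  kubectl set resources deployment keda-operator -n keda \
    --limits=memory="${1:-1Gi}"
}
```

For example, run `last_termination_reason <POD_NAME>` and then `raise_operator_memory_limit 1Gi`. If KEDA was installed through Helm or another manifest-managed workflow, the limit probably needs to be changed there as well, otherwise a later upgrade may reset it.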
Report
The KEDA operator crashes as soon as I apply the file that creates my ScaledObject, and it never creates the HPA.
Expected Behavior
I expected the HPA to be created.
Actual Behavior
The operator crashes before the HPA is created.
Steps to Reproduce the Problem
Creating a new HPA
then it crashes and restarts
Logs from KEDA operator
KEDA Version
2.4.0
Kubernetes Version
1.20
Platform
Amazon Web Services
Scaler Details
Kafka
Anything else?
Here is the YAML I was using to test with. We only have TLS on the server, so I only include the CA data.