gocrane / crane

Crane is a FinOps Platform for Cloud Resource Analytics and Economics in Kubernetes clusters. The goal is not only to help users to manage cloud cost easier but also ensure the quality of applications.
https://gocrane.io
Apache License 2.0

craned - Leader election lost #896

Open thomasklosinsky opened 5 months ago

thomasklosinsky commented 5 months ago

Describe the bug

kubectl logs -n crane-system craned-id-xx

I0212 18:49:02.120733 1 nodelocal.go:25] Registering node local metrics collector cpu
I0212 18:49:02.120904 1 nodelocal.go:25] Registering node local metrics collector cpuLoad
I0212 18:49:02.120918 1 nodelocal.go:25] Registering node local metrics collector diskio
I0212 18:49:02.120928 1 nodelocal.go:25] Registering node local metrics collector memory
I0212 18:49:02.120938 1 nodelocal.go:25] Registering node local metrics collector netio
I0212 18:49:02.169577 1 predictor.go:121] predictors map[dsp:0xc00027f110 percentile:0xc0004d7200]
I0212 18:49:02.175321 1 webhook.go:103] Succeed to setup autoscaling webhook
I0212 18:49:02.176252 1 manager.go:128] Recommendation Config updated.
I0212 18:49:02.176547 1 manager.go:84] Start watching /tmp/recommendation-config/config.yaml for update.
I0212 18:49:02.181380 1 predictor.go:141] predictor manager started, all predictors started
I0212 18:49:02.181428 1 prediction.go:151] predictor Periodic started
I0212 18:49:02.181466 1 prediction.go:302] predictor Percentile started
I0212 18:49:02.181502 1 leaderelection.go:248] attempting to acquire leader lease crane-system/craned...
I0212 18:49:02.189362 1 server.go:94] install crane api server middleware: log
I0212 18:49:02.189397 1 server.go:94] install crane api server middleware: cors
I0212 18:49:02.189408 1 server.go:94] install crane api server middleware: recovery
I0212 18:49:02.190170 1 server.go:149] Start to listening on http address: 0.0.0.0:8082
I0212 18:49:19.272285 1 leaderelection.go:258] successfully acquired lease crane-system/craned
E0212 18:49:52.480248 1 leaderelection.go:330] error retrieving resource lock crane-system/craned: Get "https://10.96.0.1:443/api/v1/namespaces/crane-system/configmaps/craned": context deadline exceeded
I0212 18:49:52.480435 1 leaderelection.go:283] failed to renew lease crane-system/craned: timed out waiting for the condition
E0212 18:49:52.480533 1 manager.go:444] "problem running crane manager" err="leader election lost"
F0212 18:49:52.480570 1 manager.go:445] leader election lost
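To see how long craned held the lease before the renewal failed, the timestamps in the log above can be compared with a small script. This is only a rough sketch: the regular expression is an assumption based on the header layout visible in these lines, the names `parse` and `lease_held_seconds` are made up for illustration, and the year is a placeholder because the log lines do not carry one.

```python
import re
from datetime import datetime

# Assumed header layout, inferred from the log lines in this report:
# <severity><MMDD> <HH:MM:SS.micros> <pid> <file>:<line>] <message>
KLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<pid>\d+) (?P<src>[^\]]+)\] (?P<msg>.*)$"
)

# Four lines copied from the craned output in this issue.
LOG = """\
I0212 18:49:19.272285 1 leaderelection.go:258] successfully acquired lease crane-system/craned
E0212 18:49:52.480248 1 leaderelection.go:330] error retrieving resource lock crane-system/craned: Get "https://10.96.0.1:443/api/v1/namespaces/crane-system/configmaps/craned": context deadline exceeded
I0212 18:49:52.480435 1 leaderelection.go:283] failed to renew lease crane-system/craned: timed out waiting for the condition
F0212 18:49:52.480570 1 manager.go:445] leader election lost
"""


def parse(line, year=2000):
    """Split one log line into (severity, timestamp, source, message).

    The year is a placeholder; the log header does not include one.
    Returns None for lines that do not match the assumed layout.
    """
    m = KLOG_RE.match(line.strip())
    if m is None:
        return None
    stamp = f"{year}{m['date']} {m['time']}"
    ts = datetime.strptime(stamp, "%Y%m%d %H:%M:%S.%f")
    return m["sev"], ts, m["src"], m["msg"]


def lease_held_seconds(text):
    """Seconds between acquiring the lease and the failed renewal, or None."""
    acquired = lost = None
    for line in text.splitlines():
        rec = parse(line)
        if rec is None:
            continue
        _, ts, _, msg = rec
        if msg.startswith("successfully acquired lease"):
            acquired = ts
        elif msg.startswith("failed to renew lease"):
            lost = ts
    if acquired is None or lost is None:
        return None
    return (lost - acquired).total_seconds()


if __name__ == "__main__":
    print(lease_held_seconds(LOG))
```

For the lines above this reports a gap of roughly 33.2 seconds between acquiring the lease and giving up on the renewal. The only error in between is the timed-out GET against the API server at 10.96.0.1:443.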

The dashboard is accessible, but no endpoint can be selected. Neither the Service IP nor the local IP can be accessed. craned logs nothing when I try to connect to the cluster.

craned is always restarting...

Reproduce steps

Install Crane in a running Kubernetes cluster (v1.28.2).

Expected behavior

craned elects a leader without problems.

Environment (please complete the following information):