ovn-org / ovn-kubernetes

A robust Kubernetes networking platform
https://ovn-kubernetes.io/
Apache License 2.0

No new leader election occurs after failing to renew lease #3070

Closed chobostar closed 3 months ago

chobostar commented 2 years ago

k8s 1.21

The current leader failed to renew its lock and exited with code 1:

E0715 09:16:20.412178       1 leaderelection.go:325] error retrieving resource lock openshift-ovn-kubernetes/ovn-kubernetes-master: Get "https://api-server.example.com:6443/api/v1/namespaces/openshift-ovn-kubernetes/configmaps/ovn-kubernetes-master": context deadline exceeded
I0715 09:16:20.412219       1 leaderelection.go:278] failed to renew lease openshift-ovn-kubernetes/ovn-kubernetes-master: timed out waiting for the condition
I0715 09:16:20.412262       1 request.go:844] Error in request: resource name may not be empty
E0715 09:16:20.412275       1 leaderelection.go:301] Failed to release lock: resource name may not be empty
I0715 09:16:20.412281       1 master.go:108] No longer leader; exiting

The other ovn-masters are still unable to acquire the lock:

I0715 09:17:33.714123       1 leaderelection.go:346] lock is held by node-123.example.com and has not yet expired
I0715 09:17:33.714145       1 leaderelection.go:248] failed to acquire lease openshift-ovn-kubernetes/ovn-kubernetes-master
I0715 09:17:54.243757       1 leaderelection.go:346] lock is held by node-123.example.com and has not yet expired

The same messages appear on both containers.

After 7 minutes, the old leader started again and continued processing:

I0715 09:23:32.774174       1 reflector.go:530] k8s.io/client-go/informers/factory.go:134: Watch close - *v1.Endpoints total 331 items received
I0715 09:23:34.173046       1 services_controller.go:219] Processing sync for service my-service-ha on namespace my-namespace 
I0715 09:23:34.173076       1 services_controller.go:254] Creating service service my-service-ha on namespace my-namespace on OVN
I0715 09:23:34.178845       1 services_controller.go:341] Updating service my-service-ha/my-namespace with VIP 172.30.167.253:5432 TCP

This doesn't look like expected behavior. Could someone clarify the possible reasons why the other available instances are unable to become leader?

Thanks for your attention.

github-actions[bot] commented 3 months ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 3 months ago

This issue was closed because it has been stalled for 5 days with no activity.