volcano-sh / volcano

A Cloud Native Batch System (Project under CNCF)
https://volcano.sh
Apache License 2.0

Why doesn't the leader election logic run in a loop to allow the container to reattempt leadership without exiting? #3835

Open haoywang48 opened 1 day ago

haoywang48 commented 1 day ago

Please describe your problem in detail

Hi Team,

We've recently encountered an issue related to leader election in the Volcano scheduler and controller. When lease renewal fails, the current leader logs leaderelection lost and exits the process, as shown in the code referenced here: https://github.com/volcano-sh/volcano/blob/da761e2c0ad4c0b722f45321104241e6f2022918/cmd/scheduler/app/server.go#L139
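For context, this is the standard client-go leader election pattern. A minimal sketch of it is below; this is not the actual Volcano code, and the lease name, namespace, and runScheduler are placeholders:

package main

import (
    "context"
    "os"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
    "k8s.io/client-go/tools/leaderelection"
    "k8s.io/client-go/tools/leaderelection/resourcelock"
    "k8s.io/klog/v2"
)

// runScheduler stands in for the real scheduling loop.
func runScheduler(ctx context.Context) {
    <-ctx.Done()
}

func main() {
    cfg, err := rest.InClusterConfig()
    if err != nil {
        klog.Fatalf("failed to build config: %v", err)
    }
    client := kubernetes.NewForConfigOrDie(cfg)

    hostname, _ := os.Hostname()
    lock := &resourcelock.LeaseLock{
        LeaseMeta:  metav1.ObjectMeta{Name: "example-scheduler", Namespace: "volcano-system"},
        Client:     client.CoordinationV1(),
        LockConfig: resourcelock.ResourceLockConfig{Identity: hostname},
    }

    leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
        Lock:          lock,
        LeaseDuration: 15 * time.Second,
        RenewDeadline: 10 * time.Second,
        RetryPeriod:   2 * time.Second,
        Callbacks: leaderelection.LeaderCallbacks{
            OnStartedLeading: runScheduler,
            OnStoppedLeading: func() {
                // This is the behavior in question: the process exits as soon as
                // the lease cannot be renewed, so the container restarts.
                klog.Errorf("leaderelection lost")
                os.Exit(1)
            },
        },
    })
}

With this setup, losing the lease always ends the process, and Kubernetes restarts the container.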

This behavior raises a couple of questions:

  1. Is this design intentional? Specifically, was it a deliberate decision to exit the process when the leader can no longer renew its lease?
  2. Can this behavior be adjusted? For example, could the leader election logic run in a loop, allowing the process to rejoin the election without restarting the container?

We'd like to understand the rationale behind this approach and whether there are recommended workarounds (or potential fixes) to avoid container restarts when a lease renewal fails.
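To make the second question concrete, here is a rough sketch of what re-entering the election in a loop could look like with client-go. It is purely illustrative, not a proposed patch; it assumes the lock is built as in the sketch above and that run is fully restartable:

package main

import (
    "context"
    "time"

    "k8s.io/client-go/tools/leaderelection"
    "k8s.io/client-go/tools/leaderelection/resourcelock"
    "k8s.io/klog/v2"
)

// runWithReelection keeps competing for the lease instead of exiting when it is lost.
// run must be fully restartable: any caches, informers, or goroutines started during a
// leadership term have to be rebuilt on the next term.
func runWithReelection(ctx context.Context, lock resourcelock.Interface, run func(context.Context)) {
    for {
        // A per-term context so that everything started in OnStartedLeading stops
        // as soon as this instance loses the lease.
        termCtx, cancel := context.WithCancel(ctx)

        leaderelection.RunOrDie(termCtx, leaderelection.LeaderElectionConfig{
            Lock:          lock,
            LeaseDuration: 15 * time.Second,
            RenewDeadline: 10 * time.Second,
            RetryPeriod:   2 * time.Second,
            Callbacks: leaderelection.LeaderCallbacks{
                OnStartedLeading: run,
                OnStoppedLeading: func() {
                    klog.Error("leaderelection lost; re-entering the election instead of exiting")
                    cancel()
                },
            },
        })
        cancel()

        if ctx.Err() != nil {
            return // overall shutdown was requested
        }
        // Back off briefly before trying to acquire the lease again.
        time.Sleep(5 * time.Second)
    }
}

The obvious caveat is that any state built up during a leadership term has to be torn down cleanly before re-entering the election.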

If you'd like to reproduce this issue locally, here’s how you can simulate a failing leader election by blocking traffic to the API server:

Steps to Reproduce

1. Set up a Kubernetes cluster.
2. Verify that leader election is active.
3. Identify the API server endpoint.
4. Simulate blocking traffic to the API server.
5. Observe the behavior: the leader logs leaderelection lost and the process exits.
6. Restore traffic to the API server.
7. Confirm the outcome: the container restarts and rejoins the election.

Looking forward to your insights—thanks for your hard work on Volcano!

Any other relevant information

No response

JesseStutler commented 17 hours ago

Restarting the container is the more common approach; kube-controller-manager, for example, does the same:

leaderelection.LeaderCallbacks{
    OnStartedLeading: func(ctx context.Context) {
        controllerDescriptors := NewControllerDescriptors()
        if leaderMigrator != nil {
            // If leader migration is enabled, we should start only non-migrated controllers
            // for the main lock.
            controllerDescriptors = filteredControllerDescriptors(controllerDescriptors, leaderMigrator.FilterFunc, leadermigration.ControllerNonMigrated)
            logger.Info("leader migration: starting main controllers.")
        }
        controllerDescriptors[names.ServiceAccountTokenController] = saTokenControllerDescriptor
        run(ctx, controllerDescriptors)
    },
    OnStoppedLeading: func() {
        logger.Error(nil, "leaderelection lost")
        klog.FlushAndExit(klog.ExitFlushTimeout, 1)
    },
})

If the old leader keeps trying to re-acquire the lease after a standby instance has already been elected leader, conflicts can occur, especially if state left over in the old leader's process has not been cleaned up.