karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0

[feature]support rescheduling when deleting a cluster #1383

Closed · huone1 closed this 2 years ago

huone1 commented 2 years ago

Signed-off-by: huone1 huwanxing@huawei.com

What type of PR is this?

/kind feature

What this PR does / why we need it:
Support rescheduling when deleting a cluster.

Which issue(s) this PR fixes:
Fixes #829
Fixes #1411

Special notes for your reviewer:
This PR does not consider the spread constraints; they will be handled in the scheduler refactoring.

Does this PR introduce a user-facing change?:

`karmada-scheduler`: Workloads can now be rescheduled after a cluster is unregistered.
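
For context, a minimal Go sketch of the condition this feature reacts to: a workload whose scheduled target cluster has been unregistered needs a fresh scheduling pass. The types here are simplified stand-ins, not karmada's actual API.

```go
package main

import "fmt"

// Simplified stand-in for karmada's notion of a target cluster (cluster
// name plus the replicas assigned to it).
type TargetCluster struct {
	Name     string
	Replicas int32
}

// needsReschedule reports whether any scheduled target cluster is no
// longer registered with the control plane — the situation this PR
// handles by triggering a new scheduling pass.
func needsReschedule(scheduled []TargetCluster, registered map[string]bool) bool {
	for _, tc := range scheduled {
		if !registered[tc.Name] {
			return true
		}
	}
	return false
}

func main() {
	scheduled := []TargetCluster{{Name: "member1", Replicas: 3}, {Name: "member2", Replicas: 2}}
	registered := map[string]bool{"member1": true} // member2 was unregistered
	fmt.Println(needsReschedule(scheduled, registered)) // true: reschedule member2's replicas
}
```
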
huone1 commented 2 years ago

/cc @dddddai

huone1 commented 2 years ago

/cc @mrlihanbo

mrlihanbo commented 2 years ago

Note: this PR does not consider the spread constraints.

/lgtm

RainbowMango commented 2 years ago

Please solve the conflict. @huone1

huone1 commented 2 years ago

Issue #1411 is also fixed in this PR.

huone1 commented 2 years ago

> Please solve the conflict. @huone1

OK, it is done.

RainbowMango commented 2 years ago

cc @dddddai @Garrybest Can you take a look?

karmada-bot commented 2 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/karmada-io/karmada/blob/master/OWNERS)~~ [RainbowMango]

Approvers can indicate their approval by writing `/approve` in a comment. Approvers can cancel approval by writing `/approve cancel` in a comment.
Garrybest commented 2 years ago

/assign

Garrybest commented 2 years ago

/lgtm

Garrybest commented 2 years ago

I think there is a bug here. Imagine:

  1. Failover is disabled.
  2. A cluster is not ready.

We should not erase the scheduled replicas of the not-ready cluster in the ResourceBinding, because Failover is disabled. However, if a scale-up happens at the same time, e.g. the user scales up the desired replicas, this PR will delete all replicas in that cluster, which is dangerous. PTAL, @huone1
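
A hypothetical sketch of the behaviour being flagged — `naiveReassign` is illustrative, not the PR's code, and the types are simplified: any scheduling pass that unconditionally drops not-ready clusters will evict their replicas even when Failover is disabled.

```go
package main

import "fmt"

type TargetCluster struct {
	Name     string
	Replicas int32
}

// naiveReassign keeps only ready clusters on every (re)scheduling pass.
// If a scale-up arrives while a cluster is briefly not ready, all of that
// cluster's replicas are silently dropped even though Failover is disabled.
func naiveReassign(current []TargetCluster, ready map[string]bool, desired int32) []TargetCluster {
	var out []TargetCluster
	var kept int32
	for _, tc := range current {
		if ready[tc.Name] {
			out = append(out, tc)
			kept += tc.Replicas
		} // replicas on not-ready clusters are erased unconditionally
	}
	if len(out) > 0 {
		out[0].Replicas += desired - kept // dump the difference onto a ready cluster
	}
	return out
}

func main() {
	current := []TargetCluster{{Name: "A", Replicas: 5}}
	ready := map[string]bool{"A": false} // A's apiserver is briefly unreachable
	// User scales up to 10: instead of holding A's 5 replicas, they vanish.
	fmt.Printf("%+v\n", naiveReassign(current, ready, 10)) // prints []
}
```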

huone1 commented 2 years ago

When Failover is disabled, rescheduling is not triggered for a not-ready cluster. And if a cluster is not ready and unhealthy, I think it is normal to delete all replicas in that cluster.

Garrybest commented 2 years ago

I'm afraid not. When scaling up, rescheduling will be triggered.

huone1 commented 2 years ago

> I'm afraid not. When scaling up, rescheduling will be triggered.

What is wrong with that?

RainbowMango commented 2 years ago

> However, if a scale-up happens at the same time, e.g. the user scales up the desired replicas, this PR will delete all replicas in that cluster, which is dangerous.

Why is it dangerous?

What I'm thinking about is how to postpone the deletion operation until the desired replicas are all in the available state. That guarantees there are always sufficient replicas running at any time.
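
A rough illustration of that idea, under the assumption that the scheduler can observe available replicas per cluster (`placement` and `readyToRemove` are hypothetical names, not karmada's API):

```go
package main

import "fmt"

// placement is a hypothetical view of one cluster's share of a workload.
type placement struct {
	cluster   string
	replicas  int32 // replicas assigned by the scheduler
	available int32 // replicas observed available in the member cluster
}

// readyToRemove postpones dropping the old cluster's replicas until the
// replacement placements report enough available replicas, so the total
// never dips below the desired count during the migration.
func readyToRemove(replacements []placement, desired int32) bool {
	var avail int32
	for _, p := range replacements {
		avail += p.available
	}
	return avail >= desired
}

func main() {
	replacements := []placement{{cluster: "member2", replicas: 5, available: 3}}
	fmt.Println(readyToRemove(replacements, 5)) // false: keep the old replicas for now
}
```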

Garrybest commented 2 years ago

Failover is disabled, yet under this circumstance all replicas will still be removed. That does not match the user's expectation.

Garrybest commented 2 years ago

Failover is disabled because the user does not want all replicas removed when a cluster is not ready. If the apiserver of a member cluster is temporarily down and Failover is disabled, we probably do not want the replicas in that member cluster to be evicted when the user triggers a scale-up. However, that is not what happens now.

RainbowMango commented 2 years ago

Agreed with @Garrybest. We should take false-positive cluster failures seriously. Can we hold the replicas (for the unhealthy cluster) unchanged in the case of scaling up, and decrease them in the case of scaling down?
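
One way to picture that policy: freeze the unhealthy cluster's share during a scale-up and redistribute only the remainder. A minimal sketch with simplified types; the single-healthy-cluster distribution and the scale-down path are deliberately crude.

```go
package main

import "fmt"

type TargetCluster struct {
	Name     string
	Replicas int32
}

// holdUnhealthy freezes replicas already placed on unhealthy clusters and
// gives only the remaining desired replicas to healthy clusters. Scaling
// down below the healthy share would need extra handling (omitted here).
func holdUnhealthy(desired int32, current []TargetCluster, healthy map[string]bool) []TargetCluster {
	var out []TargetCluster
	var held int32
	for _, tc := range current {
		if !healthy[tc.Name] {
			held += tc.Replicas
			out = append(out, tc) // unhealthy placement stays unchanged
		}
	}
	remaining := desired - held
	if remaining < 0 {
		remaining = 0
	}
	for _, tc := range current {
		if healthy[tc.Name] {
			tc.Replicas = remaining // first healthy cluster absorbs the delta
			out = append(out, tc)
			remaining = 0
		}
	}
	return out
}

func main() {
	current := []TargetCluster{{Name: "A", Replicas: 5}, {Name: "B", Replicas: 0}}
	healthy := map[string]bool{"B": true} // A is temporarily unhealthy
	// Scale up 5 -> 10: A keeps its 5 replicas, B receives the new 5.
	fmt.Printf("%+v\n", holdUnhealthy(10, current, healthy))
}
```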

huone1 commented 2 years ago

Deleting a cluster is different from a cluster changing from healthy to unhealthy. It is reasonable to migrate the replicas to other clusters when deleting a cluster.

RainbowMango commented 2 years ago

Yeah, deleting a cluster is another story.

Garrybest commented 2 years ago

Let me describe an example.

  1. We have a deployment with 5 desired replicas which are propagated to member cluster A.
  2. The cluster is a little busy, so the connection between karmada-agent and the kube-apiserver of member A is lost for about 1 minute, and the cluster status becomes unhealthy.
  3. It happens that the deployment is scaled up to 10 replicas at this time.
  4. Even if we disable Failover, the scale-up triggers the scheduling procedure, and the karmada-scheduler will remove the replicas in cluster A, which may be dangerous. It is possible that there is nothing wrong with the 5 replicas in member cluster A.

RainbowMango commented 2 years ago

Thanks @Garrybest for the details. Can you help file an issue to track this?

Garrybest commented 2 years ago

Sure.