karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0

NoSchedule taint for Cluster object not works #4952

Open chaosi-zju opened 4 months ago

chaosi-zju commented 4 months ago

What happened:

I added a NoSchedule taint to the member1 Cluster:

taints:
- effect: NoSchedule
  key: workload-rebalancer-test
  timeAdded: "2024-05-16T12:21:31Z"

Then I created a new Deployment and propagated it to the member1 and member2 clusters using a dynamic-weight policy whose clusterTolerations is defined as:

clusterTolerations:
- effect: NoSchedule
  key: workload-rebalancer-test
  operator: Exists
  tolerationSeconds: 0

Since the member1 cluster has a NoSchedule taint, all replicas should be propagated to the member2 cluster, but in practice replicas were propagated to both member1 and member2.

What you expected to happen:

All replicas should be propagated to the member2 cluster.

How to reproduce it (as minimally and precisely as possible):

1) Add a NoSchedule taint to the member1 cluster

kubectl --context karmada-apiserver patch cluster member1 --type='json' -p '[{"op": "replace", "path": "/spec/taints", "value": [{"key": "workload-rebalancer-test", "effect": "NoSchedule"}]}]'

2) Write the following YAML to a local file, resource.yaml

resource.yaml

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-deploy-1
  labels:
    app: test
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-deploy-1
  template:
    metadata:
      labels:
        app: demo-deploy-1
    spec:
      terminationGracePeriodSeconds: 0
      containers:
      - image: nginx
        name: demo-deploy-1
        resources:
          limits:
            cpu: 10m
            memory: 10Mi
---
apiVersion: policy.karmada.io/v1alpha1
kind: ClusterPropagationPolicy
metadata:
  name: default-pp
spec:
  placement:
    clusterTolerations:
    - effect: NoSchedule
      key: workload-rebalancer-test
      operator: Exists
      tolerationSeconds: 0
    clusterAffinity:
      clusterNames:
      - member1
      - member2
    replicaScheduling:
      replicaDivisionPreference: Weighted
      replicaSchedulingType: Divided
      weightPreference:
        dynamicWeight: AvailableReplicas
  resourceSelectors:
  - apiVersion: apps/v1
    kind: Deployment
    name: demo-deploy-1
    namespace: default
```

3) Check the scheduling result in the binding.

I0516 13:04:56.030578       1 event.go:376] "Event occurred" object="default/demo-deploy-1" fieldPath="" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScheduleBindingSucceed" message="Binding has been scheduled successfully. Result: {member2:2, member1:1}"
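To check a result like this programmatically, the per-cluster replica counts can be pulled out of the event message. The following is a minimal sketch; `parse_schedule_result` is a hypothetical helper, and it assumes the `Result: {cluster:replicas, ...}` shape seen in the single log line above, which is not a documented or stable format.

```python
import re
from typing import Dict


def parse_schedule_result(message: str) -> Dict[str, int]:
    """Extract {cluster: replicas} from a 'Result: {a:1, b:2}' fragment.

    Hypothetical helper: the message shape is assumed from one log line.
    """
    match = re.search(r"Result: \{([^}]*)\}", message)
    if match is None:
        return {}
    result: Dict[str, int] = {}
    for item in match.group(1).split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the last colon so the replica count is always the tail.
        cluster, replicas = item.rsplit(":", 1)
        result[cluster.strip()] = int(replicas)
    return result


message = (
    "Binding has been scheduled successfully. "
    "Result: {member2:2, member1:1}"
)
result = parse_schedule_result(message)
# The reporter's expectation is that member1 receives no replicas.
member1_got_replicas = result.get("member1", 0) > 0
print(result, member1_got_replicas)
```

With the message from this report, member1 appears in the result with one replica, which is the behavior being questioned.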

Anything else we need to know?:

Environment:

chaosi-zju commented 4 months ago

We can take a look together

CC @XiShanYongYe-Chang

XiShanYongYe-Chang commented 4 months ago

As I understand it, this is not the expected behavior.

dominicqi commented 4 months ago

I don't quite understand. Isn't this saying that the taint is tolerated? Scheduling it should be a normal situation, right? If we don't declare tolerance, everything will be scheduled to member2.

chaosi-zju commented 4 months ago

Hi @dominicqi

> I don't quite understand. Isn't this saying that the taint is tolerated? Scheduling it should be a normal situation, right? If we don't declare tolerance, everything will be scheduled to member2.

No, the policy is:

clusterTolerations:
- effect: NoSchedule
  key: workload-rebalancer-test
  operator: Exists
  tolerationSeconds: 0

Here, tolerationSeconds: 0 means we do not tolerate the workload-rebalancer-test:NoSchedule taint.

Since the member1 cluster has the workload-rebalancer-test:NoSchedule taint and we do not tolerate it, all replicas should be scheduled to member2.

However, replicas are still scheduled to the member1 cluster, which means the taint is not taking effect. This is not expected.

dominicqi commented 4 months ago

Hi @chaosi-zju

I understand what you are saying. What confuses me is whether tolerationSeconds has a special meaning in Karmada. The Kubernetes API documentation describes the field as follows:

https://github.com/kubernetes/kubernetes/blob/8361522b40cc8b569efdd6ee2456fa514071cad1/pkg/apis/core/types.go#L3218

TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint.
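To make the question concrete, here is a small Python model of how upstream Kubernetes decides whether a toleration matches a taint (effect, key, and operator/value are compared; `tolerationSeconds` plays no part in the match). This is a sketch of the Kubernetes rule only. Whether Karmada's scheduler applies the same rule to `clusterTolerations` is an assumption that this thread does not confirm, and the class and function names below are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Taint:
    key: str
    effect: str
    value: str = ""


@dataclass
class Toleration:
    key: str = ""
    operator: str = "Equal"
    value: str = ""
    effect: str = ""
    # Carried along for completeness; the match below never reads it.
    toleration_seconds: Optional[int] = None


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Model of upstream Kubernetes toleration-to-taint matching."""
    # An empty effect on the toleration matches every effect.
    if toleration.effect and toleration.effect != taint.effect:
        return False
    # An empty key on the toleration matches every key.
    if toleration.key and toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    if toleration.operator in ("", "Equal"):
        return toleration.value == taint.value
    return False


# The taint and toleration from this issue.
taint = Taint(key="workload-rebalancer-test", effect="NoSchedule")
policy_toleration = Toleration(
    key="workload-rebalancer-test",
    operator="Exists",
    effect="NoSchedule",
    toleration_seconds=0,
)
tolerated = tolerates(policy_toleration, taint)
print(tolerated)  # True
```

Under this model the policy's toleration matches the member1 taint regardless of `tolerationSeconds: 0`, so scheduling replicas to member1 would be consistent with upstream Kubernetes semantics for NoSchedule.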