karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0
4.51k stars 892 forks

How can I use scheduling based on cluster resource model to schedule job resource types between clusters? How should this tool be used? #5161

Open Schwarao opened 4 months ago

Schwarao commented 4 months ago

How can I use scheduling based on cluster resource model to schedule job resource types between clusters? How should this tool be used?

chaosi-zju commented 4 months ago

Hi @Schwarao,

Here is a document about the resource model.

Schwarao commented 4 months ago

When using the following dynamic weight setting, weightPreference: dynamicWeight: AvailableReplicas, I found that when creating a Job, there is a node that always fails to create a Pod. I have two clusters, and the total maximum resources (the total capacity, not the remaining available resources) of all nodes in one cluster are greater than those in the other cluster. When I submit Jobs, is it because the Pod will always be scheduled only to the largest cluster?

Schwarao commented 4 months ago

Hi @chaosi-zju, the scheduling I want works like this: collect the remaining available resources of all nodes in each cluster in the federation, filter out the clusters that do not meet the requirements, and then rank the remaining clusters by other factors such as the number of replicas and disaster recovery. Is there a similar strategy in Karmada? Can you tell me which scheduling strategies Karmada includes?

chaosi-zju commented 4 months ago

When using the following dynamic weight setting, weightPreference: dynamicWeight: AvailableReplicas, I found that when creating a Job, there is a node that always fails to create a Pod. I have two clusters, and the total maximum resources (the total capacity, not the remaining available resources) of all nodes in one cluster are greater than those in the other cluster. When I submit Jobs, is it because the Pod will always be scheduled only to the largest cluster?

Suppose your Deployment has spec.replicas=6, you want to propagate it to the member1 and member2 clusters, and you declared a PropagationPolicy like this:

apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: nginx-propagation
spec:
  #...
  placement:
    replicaScheduling:
      replicaDivisionPreference: Weighted
      replicaSchedulingType: Divided
      weightPreference:
        dynamicWeight: AvailableReplicas 

When scheduling, the scheduler considers the remaining available resources of the two clusters. Suppose the remaining available resources of member1 can accommodate 20 such replicas, while those of member2 can accommodate 10. The scheduler then divides the expected spec.replicas=6 by the weight 20:10 (that is, 2:1), so member1 gets 4 replicas and member2 gets 2 replicas.
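
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of weighted division, not Karmada's actual scheduler code: the function name divide_by_weight and the rule for handing out leftover replicas (largest weight first, then cluster name) are assumptions made for this example.

```python
def divide_by_weight(replicas, weights):
    """Split `replicas` across clusters in proportion to `weights`.

    Hypothetical helper for illustration; not part of Karmada.
    """
    total = sum(weights.values())
    if total == 0:
        return {name: 0 for name in weights}
    # Floor of each cluster's proportional share.
    result = {name: replicas * w // total for name, w in weights.items()}
    # Hand out any leftover replicas one by one, largest weight first
    # (assumed tie-break: alphabetical cluster name).
    leftover = replicas - sum(result.values())
    for name in sorted(weights, key=lambda n: (-weights[n], n))[:leftover]:
        result[name] += 1
    return result


# Estimated available replicas from the example: member1=20, member2=10.
print(divide_by_weight(6, {"member1": 20, "member2": 10}))
# prints {'member1': 4, 'member2': 2}
```

Under the same assumptions, a workload with a single replica and weights 20:10 would land entirely on member1, because the floor of every share is 0 and the one leftover replica goes to the largest weight.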

chaosi-zju commented 4 months ago

The scheduling I want works like this: collect the remaining available resources of all nodes in each cluster in the federation, filter out the clusters that do not meet the requirements, and then rank the remaining clusters by other factors such as the number of replicas and disaster recovery.

Can you give a detailed example, described the same way as my AvailableReplicas example above?

Schwarao commented 4 months ago

If I use Job or Pod resources here, and only one replica is distributed at a time, will the Pods be distributed to different clusters? Additionally, you mentioned that the scheduler takes into account the remaining available resources of both clusters. Where did the remaining available resources come from? Is it the top?

This is my cluster: image

This is the describe output of tengxunyun: image
This is the describe output of teng: image

This is the policy YAML: image
This is the Job YAML: image

But I find that all Jobs were assigned to the teng cluster.

chaosi-zju commented 4 months ago

Hi @Schwarao, so you have many Jobs, and each Job has only one replica?

In this case, if you use AvailableReplicas, the replica will always be propagated to the cluster that has more remaining available resources. So I suggest you use the following policy:

apiVersion: policy.karmada.io/v1alpha1
kind: ClusterPropagationPolicy
metadata:
  name: default-cpp
spec:
  placement:
    clusterAffinity:
      clusterNames:
        - member1
        - member2
    replicaScheduling:
      replicaDivisionPreference: Weighted
      replicaSchedulingType: Divided
      weightPreference:
        staticWeightList:
          - targetCluster:
              clusterNames:
                - member1
            weight: 1
          - targetCluster:
              clusterNames:
                - member2
            weight: 1

This is a static-weight division strategy. When the two clusters have the same weight, the single replica is assigned to one of the two clusters with equal probability.
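
To see why the two policies behave differently for single-replica Jobs, here is a small Python sketch. It is not Karmada's code: the function pick_cluster and the random tie-break are assumptions that simply mirror the behaviour described above, and the weights 20 and 10 are the illustrative numbers from the earlier example.

```python
import random
from collections import Counter


def pick_cluster(weights, rng):
    """Place one replica on the highest-weight cluster.

    Hypothetical helper; ties are assumed to be broken at random.
    """
    top = max(weights.values())
    candidates = sorted(name for name, w in weights.items() if w == top)
    return rng.choice(candidates)


rng = random.Random(0)  # seeded only so the sketch is repeatable

# Dynamic weights (20 vs 10 estimated available replicas):
# the same cluster wins for every single-replica Job.
dynamic = Counter(
    pick_cluster({"member1": 20, "member2": 10}, rng) for _ in range(1000)
)

# Static weights 1:1: both clusters are candidates for every Job.
static = Counter(
    pick_cluster({"member1": 1, "member2": 1}, rng) for _ in range(1000)
)

print(dynamic)  # all 1000 Jobs go to member1
print(static)   # Jobs are split roughly evenly between the two clusters
```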

Schwarao commented 4 months ago

Thanks. Could you please tell me what the difference is between PropagationPolicy and ClusterPropagationPolicy?

chaosi-zju commented 4 months ago

PropagationPolicy works with namespace-scoped resources, such as a Deployment.

ClusterPropagationPolicy works with both namespace-scoped and cluster-scoped resources, so it can match not only a Deployment but also a ClusterRole.

One corner case: if a PropagationPolicy and a ClusterPropagationPolicy both exist and both match the same Deployment, the PropagationPolicy is chosen when that Deployment is created.

chaosi-zju commented 4 months ago

By the way, the scheduler has two ways to estimate remaining resources: one is the ResourceModel, the other is karmada-scheduler-estimator.

I recommend using karmada-scheduler-estimator to get more accurate cluster resource information.

As for the question "Where did the remaining available resources come from? Is it the top?"

RainbowMango commented 4 months ago

Thanks. Could you please tell me what the difference is between PropagationPolicy and ClusterPropagationPolicy?

Refer to this doc: https://karmada.io/docs/faq/#what-is-the-difference-between-propagationpolicy-and-clusterpropagationpolicy