volcano-sh / volcano

A Cloud Native Batch System (Project under CNCF)
https://volcano.sh
Apache License 2.0

Volcano Uneven scheduling #3773

Open kinoxyz1 opened 5 days ago

kinoxyz1 commented 5 days ago

Please describe your problem in detail

My Kubernetes nodes are as follows:

NAME                 STATUS   ROLES                  AGE    VERSION
prod.ds.03.idc   Ready    control-plane,master   350d   v1.21.9
prod.ds.04.idc   Ready    control-plane,master   350d   v1.21.9
prod.ds.05.idc   Ready    control-plane,master   350d   v1.21.9
prod.ds.06.idc   Ready    <none>                 350d   v1.21.9
prod.ds.07.idc   Ready    <none>                 350d   v1.21.9
prod.ds.08.idc   Ready    <none>                 350d   v1.21.9
prod.ds.12.idc   Ready    <none>                 221d   v1.21.9
prod.ds.13.idc   Ready    <none>                 221d   v1.21.9
prod.ds.14.idc   Ready    <none>                 221d   v1.21.9

Each node has 16 CPUs and 64 GiB of memory. The masters are marked unschedulable, and the label node=realtime is set on the three nodes prod.ds.12.idc, prod.ds.13.idc, and prod.ds.14.idc.
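
For reference, labels like these are typically applied with kubectl (node names taken from the listing below):

kubectl label node prod.ds.12.idc prod.ds.13.idc prod.ds.14.idc node=realtime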

kubectl get node --show-labels
NAME                 STATUS   ROLES                  AGE    VERSION   LABELS
prod.ds.03.idc   Ready    control-plane,master   350d   v1.21.9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=prod.ds.03.idc,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
prod.ds.04.idc   Ready    control-plane,master   350d   v1.21.9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=prod.ds.04.idc,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
prod.ds.05.idc   Ready    control-plane,master   350d   v1.21.9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=prod.ds.05.idc,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
prod.ds.06.idc   Ready    <none>                 350d   v1.21.9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=prod.ds.06.idc,kubernetes.io/os=linux,node=offline
prod.ds.07.idc   Ready    <none>                 350d   v1.21.9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=prod.ds.07.idc,kubernetes.io/os=linux,node=offline
prod.ds.08.idc   Ready    <none>                 350d   v1.21.9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=prod.ds.08.idc,kubernetes.io/os=linux,node=offline
prod.ds.12.idc   Ready    <none>                 221d   v1.21.9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=prod.ds.12.idc,kubernetes.io/os=linux,node=realtime
prod.ds.13.idc   Ready    <none>                 221d   v1.21.9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=prod.ds.13.idc,kubernetes.io/os=linux,node=realtime
prod.ds.14.idc   Ready    <none>                 221d   v1.21.9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=prod.ds.14.idc,kubernetes.io/os=linux,node=realtime

I created a queue:

apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: realtime
spec:
  capability:
    cpu: "48"
    memory: 192Gi
  weight: 1
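
Note that this capability is exactly the aggregate of the three node=realtime workers: 3 × 16 CPU = 48 CPU and 3 × 64 GiB = 192 GiB.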

I use scheduled tasks to trigger many Kubernetes Deployments at the same time, so that they are all deployed and run in the realtime queue. However, the pods turned out not to be evenly distributed across the three nodes:

$ kubectl get pod -o wide | grep prod.ds.12.idc | wc -l
27
$ kubectl get pod -o wide | grep prod.ds.13.idc | wc -l
22
$ kubectl get pod -o wide | grep prod.ds.14.idc | wc -l
5

This is the Deployment yaml:

apiVersion: apps/v1
kind: Deployment
...
spec:
  selector:
    matchLabels:
      app: flinksync-4pi1nof99gi3-63o0fd9ocmyr78ch
      component: jobmanager
      type: flink-native-kubernetes
  template:
    metadata:
      annotations:
        scheduling.k8s.io/group-name: pg-flinksync-4pi1nof99gi3-63o0fd9ocmyr78ch
      labels:
        app: flinksync-4pi1nof99gi3-63o0fd9ocmyr78ch
        component: jobmanager
        query-tag: flinksync-4pi1nof99gi3
        type: flink-native-kubernetes
      name: pod-template
    spec:
      ...
      containers:
        ...
      nodeSelector:
        node: realtime
      schedulerName: volcano
      ...
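
The scheduling.k8s.io/group-name annotation ties these pods to a Volcano PodGroup; for them to run in the realtime queue, that PodGroup presumably looks roughly like the following (a minimal sketch; minMember: 1 is an assumption):

apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  name: pg-flinksync-4pi1nof99gi3-63o0fd9ocmyr78ch
spec:
  minMember: 1
  queue: realtime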

This is the volcano scheduler conf:

  volcano-scheduler.conf: |
    actions: "enqueue, allocate, backfill"
    tiers:
    - plugins:
      - name: priority
      - name: gang
        enablePreemptable: false
      - name: conformance
    - plugins:
      - name: overcommit
      - name: drf
        enablePreemptable: false
      - name: predicates
      - name: proportion
      - name: nodeorder
      - name: binpack

This scheduling puts a lot of pressure on two of the nodes while the third sits nearly idle. How can we make the scheduling more balanced?

Any other relevant information

Kubernetes version: v1.21.9
OS: CentOS Linux release 7.9.2009 (Core)

hwdef commented 5 days ago

Because of the binpack algorithm: https://volcano.sh/en/docs/plugins/#binpack
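
For context: binpack scores nodes higher the fuller they already are, so pods are deliberately packed onto the most-utilized feasible node instead of being spread out. Per the linked docs, the node score is roughly (per-resource weights default to 1):

score(node) = binpack.weight * sum_r [ weight_r * (requested_r + used_r) / allocatable_r ] / sum_r weight_r

With binpack enabled, an uneven distribution is the intended behavior, not a bug.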

kinoxyz1 commented 5 days ago

Because of the binpack algorithm: https://volcano.sh/en/docs/plugins/#binpack

So if I delete the binpack entry from the volcano-scheduler conf, will scheduling become more uniform across the nodes?

  volcano-scheduler.conf: |
    actions: "enqueue, allocate, backfill"
    tiers:
    - plugins:
      - name: priority
      - name: gang
        enablePreemptable: false
      - name: conformance
    - plugins:
      - name: overcommit
      - name: drf
        enablePreemptable: false
      - name: predicates
      - name: proportion
      - name: nodeorder
      - name: binpack    ## delete this

hwdef commented 5 days ago

Yes, you are right.
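
For finer control than dropping binpack outright, the nodeorder plugin also accepts per-scorer weights; raising leastrequested.weight favors the least-loaded nodes and spreads pods out. A sketch (argument names from the nodeorder plugin; the weight values are illustrative and untested):

    - plugins:
      ...
      - name: nodeorder
        arguments:
          leastrequested.weight: 2
          balancedresource.weight: 1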

kinoxyz1 commented 5 days ago

Yes, you are right.

thank you

Monokaix commented 4 days ago

Has your problem been solved?