kubernetes-sigs / descheduler

Descheduler for Kubernetes
https://sigs.k8s.io/descheduler
Apache License 2.0
4.23k stars · 645 forks

RemoveDuplicates does not work with Azure AKS 1.26.6 #1315

Closed: edtroleis closed this issue 7 months ago

edtroleis commented 7 months ago

What version of descheduler are you using?

descheduler version: registry.k8s.io/descheduler/descheduler:v0.28.0

Does this issue reproduce with the latest release? Yes

Which descheduler CLI options are you using? RemoveDuplicates

Please provide a copy of your descheduler policy config file

configmap.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: descheduler-policy-configmap
  namespace: kube-system
data:
  policy.yaml: |
    apiVersion: "descheduler/v1alpha2"
    kind: "DeschedulerPolicy"
    nodeSelector: "workload=apps"
    profiles:
      - name: ProfileName
        pluginConfig:
        - name: "DefaultEvictor"
        - name: "RemoveDuplicates"
          enabled: true
        plugins:
          balance:
            enabled:
              - "RemoveDuplicates"
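
As a point of comparison (a sketch, not an official recommendation): in the v1alpha2 policy API, `pluginConfig` entries carry only a `name` and optional `args`; there is no `enabled` key there, since enabling happens under `plugins`. The same policy following that shape would look roughly like:

```yaml
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
nodeSelector: "workload=apps"
profiles:
  - name: ProfileName
    pluginConfig:
      - name: "DefaultEvictor"    # evictor settings would go under an args: key here
      - name: "RemoveDuplicates"
    plugins:
      balance:
        enabled:
          - "RemoveDuplicates"
```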

What k8s version are you using (kubectl version)?

$ kubectl version
v1.28.0

What did you do?

I configured the descheduler on Azure AKS 1.26.6 as follows:

In the deployment.yaml I set the nodeSelector and decreased the descheduling interval:

      nodeSelector:
        agentpool: systempool
      containers:
        - name: descheduler
          image: registry.k8s.io/descheduler/descheduler:v0.28.0
          imagePullPolicy: IfNotPresent
          command:
            - "/bin/descheduler"
          args:
            - "--policy-config-file"
            - "/policy-dir/policy.yaml"
            - "--descheduling-interval"
            - "1m" # "5m"
            - "--v"
            - "3"
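
While debugging a case like this, it can also help to raise verbosity and, if the flag is available in your descheduler version, run in dry-run mode so would-be evictions are logged without actually being performed. A hedged sketch of the same args block:

```yaml
args:
  - "--policy-config-file"
  - "/policy-dir/policy.yaml"
  - "--descheduling-interval"
  - "1m"
  - "--dry-run"   # assumption: flag supported by your descheduler build; logs evictions without evicting
  - "--v"
  - "4"
```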

What did you expect to see? The 20 pods balanced between the nodes.

What did you see instead? Nothing happened; balancing did not occur. The pods were not spread across the 3 nodes.

Note: I have already tried other descheduler plugins (LowNodeUtilization, PodLifeTime, etc.) and they did not work here either.

Pods

NAME                               READY   STATUS    RESTARTS         AGE     NODE
deployment-745d9b78bf-2gxb4        2/2     Running   0                4h54m    aks-spotnp1-vmss000006
deployment-745d9b78bf-2nk5x        2/2     Running   0                4h55m    aks-spotnp1-vmss000006
deployment-745d9b78bf-7tbdx        2/2     Running   0                4h54m    aks-spotnp1-vmss000006
deployment-745d9b78bf-8qpbr        2/2     Running   0                4h54m    aks-spotnp1-vmss000006
deployment-745d9b78bf-9r2p5        2/2     Running   0                4h54m    aks-spotnp1-vmss000006
deployment-745d9b78bf-9x2hx        2/2     Running   0                4h55m    aks-spotnp1-vmss000006
deployment-745d9b78bf-c5prp        2/2     Running   0                4h54m    aks-spotnp1-vmss000006
deployment-745d9b78bf-cvkrz        2/2     Running   0                4h54m    aks-spotnp1-vmss000006
deployment-745d9b78bf-cxtfb        2/2     Running   0                4h54m    aks-spotnp1-vmss000006
deployment-745d9b78bf-nhmv5        2/2     Running   0                4h54m    aks-spotnp1-vmss000006
deployment-745d9b78bf-p4zcq        2/2     Running   0                4h54m    aks-spotnp1-vmss000006
deployment-745d9b78bf-p9t55        2/2     Running   0                4h55m    aks-spotnp1-vmss000006
deployment-745d9b78bf-r2gjb        2/2     Running   0                4h54m    aks-spotnp1-vmss000006
deployment-745d9b78bf-r78kc        2/2     Running   0                4h55m    aks-spotnp1-vmss000006
deployment-745d9b78bf-s2wd6        2/2     Running   0                4h54m    aks-spotnp1-vmss000007
deployment-745d9b78bf-t56f9        2/2     Running   0                4h54m    aks-spotnp1-vmss000006
deployment-745d9b78bf-vgsnh        2/2     Running   0                4h54m    aks-spotnp1-vmss000007
deployment-745d9b78bf-vlrlg        2/2     Running   0                4h54m    aks-spotnp1-vmss000006
deployment-745d9b78bf-w4zr4        2/2     Running   0                4h54m    aks-spotnp1-vmss000006
deployment-745d9b78bf-xxzpd        2/2     Running   0                4h54m    aks-spotnp1-vmss000007

Descheduler logs

I1207 18:36:23.396211 1 secure_serving.go:210] Serving securely on [::]:10258
I1207 18:36:23.396313 1 tracing.go:87] Did not find a trace collector endpoint defined. Switching to NoopTraceProvider
I1207 18:36:23.396636 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I1207 18:36:23.447599 1 reflector.go:289] Starting reflector v1.Pod (0s) from k8s.io/client-go/informers/factory.go:150
I1207 18:36:23.447637 1 reflector.go:325] Listing and watching v1.Pod from k8s.io/client-go/informers/factory.go:150
I1207 18:36:23.447654 1 reflector.go:289] Starting reflector v1.PriorityClass (0s) from k8s.io/client-go/informers/factory.go:150
I1207 18:36:23.447704 1 reflector.go:325] Listing and watching v1.PriorityClass from k8s.io/client-go/informers/factory.go:150
I1207 18:36:23.447599 1 reflector.go:289] Starting reflector v1.Node (0s) from k8s.io/client-go/informers/factory.go:150
I1207 18:36:23.447839 1 reflector.go:325] Listing and watching v1.Node from k8s.io/client-go/informers/factory.go:150
I1207 18:36:23.447599 1 reflector.go:289] Starting reflector v1.Namespace (0s) from k8s.io/client-go/informers/factory.go:150
I1207 18:36:23.447942 1 reflector.go:325] Listing and watching v1.Namespace from k8s.io/client-go/informers/factory.go:150
I1207 18:36:23.949272 1 descheduler.go:156] Building a pod evictor
I1207 18:36:23.949376 1 removeduplicates.go:103] "Processing node" node="aks-spotnp1-vmss000007"
I1207 18:36:23.949521 1 removeduplicates.go:103] "Processing node" node="aks-spotnp1-vmss000006"
I1207 18:36:23.949716 1 removeduplicates.go:103] "Processing node" node="aks-spotnp1-vmss000008"
I1207 18:36:23.949880 1 profile.go:356] "Total number of pods evicted" extension point="Balance" evictedPods=0
I1207 18:36:23.949916 1 descheduler.go:170] "Number of evicted pods" totalEvicted=0
I1207 18:37:23.949565 1 descheduler.go:156] Building a pod evictor
I1207 18:37:23.949679 1 removeduplicates.go:103] "Processing node" node="aks-spotnp1-vmss000007"
I1207 18:37:23.949914 1 removeduplicates.go:103] "Processing node" node="aks-spotnp1-vmss000006"
I1207 18:37:23.950161 1 removeduplicates.go:103] "Processing node" node="aks-spotnp1-vmss000008"
I1207 18:37:23.950345 1 profile.go:356] "Total number of pods evicted" extension point="Balance" evictedPods=0
I1207 18:37:23.950381 1 descheduler.go:170] "Number of evicted pods" totalEvicted=0

AKS nodes

NAME                                 STATUS   ROLES   AGE     VERSION
aks-spotnp1-vmss000006      Ready    agent   5h10m   v1.26.6
aks-spotnp1-vmss000007      Ready    agent   5h9m    v1.26.6
aks-spotnp1-vmss000008      Ready    agent   5h7m    v1.26.6
edtroleis commented 7 months ago

I noticed that my deployment's pods have volumes (local storage) associated with them, so I changed the configuration as below:

apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
nodeSelector: "workload=apps"
profiles:
  - name: ProfileName
    pluginConfig:
    - name: "DefaultEvictor"
      args:
        evictLocalStoragePods: true
        nodeFit: true
    - name: "RemoveDuplicates"
    plugins:
      balance:
        enabled:
          - "RemoveDuplicates"

If there is a better way to configure this, please share it with me.
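
For completeness, RemoveDuplicates itself also accepts args in v1alpha2; `excludeOwnerKinds` is the documented parameter name, but the value below is only an illustrative example, not a recommendation for this cluster:

```yaml
    - name: "RemoveDuplicates"
      args:
        excludeOwnerKinds:
          - "ReplicaSet"   # example: skip duplicates whose owner is a ReplicaSet
```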