kubernetes-sigs / descheduler

Descheduler for Kubernetes
https://sigs.k8s.io/descheduler
Apache License 2.0

nodeaffinity and topologySpreadConstraints together result in balanced pods being evicted #1280

Closed: 5nafu closed this issue 6 months ago

5nafu commented 8 months ago

What version of descheduler are you using?

descheduler version: v0.28.0, running as a Deployment

Does this issue reproduce with the latest release?

Yes

Which descheduler CLI options are you using?

- --policy-config-file=/policy-dir/policy.yaml
- --descheduling-interval=2m
- --v=4

Please provide a copy of your descheduler policy config file

apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
- name: default
  pluginConfig:
  - args:
      evictFailedBarePods: false
      evictLocalStoragePods: true
      evictSystemCriticalPods: false
      ignorePvcPods: false
      nodeFit: false
    name: DefaultEvictor
  - name: RemovePodsViolatingInterPodAntiAffinity
  - name: RemovePodsViolatingNodeTaints
  - name: RemovePodsViolatingTopologySpreadConstraint
  - args:
      podRestartThreshold: 50
    name: RemovePodsHavingTooManyRestarts
  plugins:
    balance:
      enabled:
      - RemovePodsViolatingTopologySpreadConstraint
    deschedule:
      enabled:
      - RemovePodsViolatingInterPodAntiAffinity
      - RemovePodsViolatingNodeTaints
      - RemovePodsHavingTooManyRestarts

What k8s version are you using (kubectl version)?

$ kubectl version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.25.14-eks-f8587cb

What did you do?

We are running an AWS EKS cluster with Karpenter for dynamic node provisioning. When we create a deployment that restricts the allowed availability zones via a required nodeAffinity and also sets topologySpreadConstraints, the deployment comes up balanced.

Deployment example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: echoserver-https
spec:
  replicas: 9
  selector:
    matchLabels:
      app: echoserver-https
  template:
    metadata:
      labels:
        app: echoserver-https
    spec:
      # ...
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - eu-central-1a
                - eu-central-1b
      topologySpreadConstraints:
      - labelSelector:
          matchLabels:
            app: echoserver-https
        maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule

As we rely on dynamic node creation via Karpenter, we decided not to enable nodeFit: there will (and should) never be spare capacity, so with nodeFit enabled the descheduler would never evict any pods.
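To illustrate the tradeoff: nodeFit only permits an eviction when the pod could land on some other existing node. The Go sketch below is a simplified model of that idea, not the descheduler's actual implementation (the type and function names and the single-resource check are ours):

package main

import "fmt"

// Simplified stand-ins for what a nodeFit-style check looks at:
// spare capacity plus whether the pod's node-selection rules match.
type node struct {
	name    string
	freeCPU int64 // spare millicores, reduced to one dimension
	labels  map[string]string
}

type pod struct {
	nodeName     string
	cpuRequest   int64
	nodeSelector map[string]string // stands in for the full affinity rules
}

// fitsElsewhere reports whether any node other than the pod's current one
// matches the pod's selector and has room for its request. With
// nodeFit: true the descheduler only evicts when a check of this kind
// passes; with nodeFit: false it evicts regardless.
func fitsElsewhere(p pod, nodes []node) bool {
	for _, n := range nodes {
		if n.name == p.nodeName || n.freeCPU < p.cpuRequest {
			continue
		}
		match := true
		for k, v := range p.nodeSelector {
			if n.labels[k] != v {
				match = false
				break
			}
		}
		if match {
			return true
		}
	}
	return false
}

func main() {
	// With Karpenter, nodes are provisioned on demand, so at eviction time
	// there is typically no pre-existing spare node for the pod to fit on:
	nodes := []node{{name: "a", freeCPU: 0, labels: map[string]string{}}}
	p := pod{nodeName: "a", cpuRequest: 100}
	fmt.Println(fitsElsewhere(p, nodes)) // false: nodeFit would block the eviction
}

Hence nodeFit: false in the policy above: evictions must be allowed to proceed so that Karpenter can provision replacement capacity afterwards.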

What did you expect to see?

The workload deployment staying stable and balanced.

What did you see instead?

The descheduler evicts pods on every run even though the workload deployment is balanced. Raising the log level does not reveal any further insight:

I1101 13:49:51.629315       1 evictions.go:171] "Evicted pod" pod="echoserver/echoserver-https-6dc4d884ff-kxk6d" reason="" strategy="RemovePodsViolatingTopologySpreadConstraint" node="ip-172-16-79-239.eu-central-1.compute.internal"
I1101 13:49:51.683278       1 evictions.go:171] "Evicted pod" pod="echoserver/echoserver-https-6dc4d884ff-hg6rg" reason="" strategy="RemovePodsViolatingTopologySpreadConstraint" node="ip-172-16-79-239.eu-central-1.compute.internal"
I1101 13:49:51.708028       1 evictions.go:171] "Evicted pod" pod="echoserver/echoserver-https-6dc4d884ff-hrzsw" reason="" strategy="RemovePodsViolatingTopologySpreadConstraint" node="ip-172-16-58-219.eu-central-1.compute.internal"
I1101 13:49:51.708080       1 profile.go:356] "Total number of pods evicted" extension point="Balance" evictedPods=3
I1101 13:49:51.708098       1 descheduler.go:170] "Number of evicted pods" totalEvicted=3
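The arithmetic behind the reported violation is small enough to sketch. The Go snippet below is a simplified model of the skew calculation, not the descheduler's actual code: if every zone in the cluster is treated as a topology domain, then eu-central-1c, which the required nodeAffinity excludes, counts as a domain with zero pods, and a balanced 5/4 spread of the 9 replicas across eu-central-1a/1b suddenly looks like a skew of 5:

package main

import "fmt"

// skew returns max minus min pod count across the given topology domains;
// domains with no matching pods count as 0, mirroring the maxSkew math.
func skew(podsPerZone map[string]int, zones []string) int {
	minN, maxN := int(^uint(0)>>1), 0
	for _, z := range zones {
		n := podsPerZone[z] // a missing zone reads as 0
		if n < minN {
			minN = n
		}
		if n > maxN {
			maxN = n
		}
	}
	return maxN - minN
}

func main() {
	// 9 replicas spread 5/4 over the two allowed zones: balanced.
	pods := map[string]int{"eu-central-1a": 5, "eu-central-1b": 4}

	// Counting every zone in the cluster, including the excluded one:
	all := []string{"eu-central-1a", "eu-central-1b", "eu-central-1c"}
	fmt.Println(skew(pods, all)) // 5: violates maxSkew: 1, so pods get evicted

	// Counting only the zones the nodeAffinity allows:
	eligible := []string{"eu-central-1a", "eu-central-1b"}
	fmt.Println(skew(pods, eligible)) // 1: no violation
}

This would match the observed behavior: the constraint is satisfied on the eligible zones, yet pods are evicted on every run because the ineligible zone drags the domain minimum down to zero.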

a7i commented 8 months ago

This is the PR that added support for this in the descheduler: https://github.com/kubernetes-sigs/descheduler/pull/1218. We are in the process of cutting a new release (v0.28.1), and that PR may be included in it.

a7i commented 6 months ago

Resolved in v0.28.1.

Please let us know if you face any issues.