kubernetes-sigs / descheduler

Descheduler for Kubernetes
https://sigs.k8s.io/descheduler
Apache License 2.0

Pod strategies don't work #1341

Open y0zg opened 5 months ago

y0zg commented 5 months ago

descheduler version: v0.27.1, helm chart version: 0.27.1

Does this issue reproduce with the latest release? Haven't tested

Which descheduler CLI options are you using?

    Command:
      /bin/descheduler
    Args:
      --policy-config-file
      /policy-dir/policy.yaml
      --v
      3

Please provide a copy of your descheduler policy config file

Helm values file:

cronJobApiVersion: "batch/v1"  # Use "batch/v1beta1" for k8s version < 1.21.0. TODO(@7i) remove with 1.23 release
schedule: "0 9 * * 1-4"
suspend: false

deschedulerPolicy:
  profiles:
  - name: ProfileName
    pluginConfig:
    - name: "DefaultEvictor"
      args:
        evictFailedBarePods: true
        nodeFit: true
    - name: "PodLifeTime"
      args:
        maxPodLifeTimeSeconds: 3600
        states:
        - "ContainerStatusUnknown"
        - "Completed"
        - "Error"
        - "Evicted"
    - name: "RemoveFailedPods"
      args:
        # evictableNamespaces:
        #   exclude:
        #   - "kube-system"
        #   - "jenkins"
        reasons:
        # - "OutOfcpu"
        - "CreateContainerConfigError"
        - "Error"
        - "Completed"
        - "Evicted"
        - "ContainerStatusUnknown"
        # excludeOwnerKinds:
        # - "Job"
        minPodLifetimeSeconds: 3600
    plugins:
      deschedule:
        enabled:
          - "RemoveFailedPods"
          - "PodLifeTime"
  # nodeSelector: "key1=value1,key2=value2"
  # maxNoOfPodsToEvictPerNode: 10
  # maxNoOfPodsToEvictPerNamespace: 10
  ignorePvcPods: true
  # evictLocalStoragePods: true
  strategies:
    RemoveFailedPods:
      enabled: true
      params:
        failedPods:
          reasons:
          - "Failed"
          # - "Succeeded"
    LowNodeUtilization:
      enabled: false
    HighNodeUtilization:
      enabled: true
      params:
        nodeResourceUtilizationThresholds:
          thresholds:
            cpu: 20
            memory: 20
        namespaces:
          exclude:
          - "kube-system"
          - "jenkins"
          - "logs"

What k8s version are you using (kubectl version)?

k8s version: v1.27.8-eks-8cb36c9

I don't see the descheduler removing pods with status Error, ContainerStatusUnknown, or Completed.

damemi commented 5 months ago

Hi @y0zg, are you seeing the descheduler remove pods with other status codes besides Error, ContainerStatusUnknown, and Completed? And are the pods with these statuses at least 3600 seconds old, as required by your minPodLifetimeSeconds?

@knelasevero @a7i @ingvagabund I don't remember exactly how we check pod eviction reasons/statuses. Did we have any updated docs on what exactly is covered?

y0zg commented 5 months ago

The problem is that I don't see the descheduler removing pods at all. I tried setting the lifetime to 60 seconds with a frequent CronJob schedule.

One more interesting fact: in the log I see this warning while running version 0.27 on Kubernetes 1.27.

W0118 16:51:01.043860       1 descheduler.go:127] Warning: Descheduler minor version 27 is not supported on your version of Kubernetes 1.27+. See compatibility docs for more info: https://github.com/kubernetes-sigs/descheduler#compatibility-matrix

I tried descheduler version 0.28, and with it the compatibility warning above is not present.

Below is the Helm template output for the ConfigMap:

---
# Source: descheduler/templates/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: release-name-descheduler
  namespace: default
  labels:
    app.kubernetes.io/name: descheduler
    helm.sh/chart: descheduler-0.27.1
    app.kubernetes.io/instance: release-name
    app.kubernetes.io/version: "0.27.1"
    app.kubernetes.io/managed-by: Helm
data:
  policy.yaml: |
    apiVersion: "descheduler/v1alpha1"
    kind: "DeschedulerPolicy"
    ignorePvcPods: true
    profiles:
    - name: ProfileName
      pluginConfig:
      - args:
          maxPodLifeTimeSeconds: 60
          states:
          - ContainerStatusUnknown
          - Completed
          - Error
          - Evicted
        name: PodLifeTime
      - args:
          minPodLifetimeSeconds: 60
          reasons:
          - CreateContainerConfigError
          - Error
          - Completed
          - Evicted
          - ContainerStatusUnknown
        name: RemoveFailedPods
      plugins:
        deschedule:
          enabled:
          - RemoveFailedPods
    strategies:
      HighNodeUtilization:
        enabled: true
        params:
          namespaces:
            exclude:
            - kube-system
            - jenkins
            - logs
          nodeResourceUtilizationThresholds:
            thresholds:
              cpu: 20
              memory: 20
      LowNodeUtilization:
        enabled: false
        params:
          nodeResourceUtilizationThresholds:
            targetThresholds:
              cpu: 50
              memory: 50
              pods: 50
            thresholds:
              cpu: 20
              memory: 20
              pods: 20
      RemoveDuplicates:
        enabled: true
      RemoveFailedPods:
        enabled: true
        params:
          failedPods:
            reasons:
            - Failed
      RemovePodsHavingTooManyRestarts:
        enabled: true
        params:
          podsHavingTooManyRestarts:
            includingInitContainers: true
            podRestartThreshold: 100
      RemovePodsViolatingInterPodAntiAffinity:
        enabled: true
      RemovePodsViolatingNodeAffinity:
        enabled: true
        params:
          nodeAffinityType:
          - requiredDuringSchedulingIgnoredDuringExecution
      RemovePodsViolatingNodeTaints:
        enabled: true
      RemovePodsViolatingTopologySpreadConstraint:
        enabled: true
        params:
          includeSoftConstraints: false

a7i commented 5 months ago

The warning message about compatibility was addressed here; the fix landed in v0.28.1 and v0.29.0, but not in v0.27.x.

As for your second question: RemoveFailedPods only looks at the pod phase reason, not at container statuses. For example, CreateContainerConfigError is not a pod phase reason, but rather a container status reason.

Additionally, the reason Completed will not work with RemoveFailedPods, because this strategy only looks at pods in the Failed phase; the pod phase for Completed pods is always Succeeded.
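For illustration, this is roughly what the status of a pod that kubectl shows as Completed looks like (abridged, with hypothetical names and values): the pod-level phase is Succeeded, and Completed only appears as the container's termination reason, which is why RemoveFailedPods never considers it.

# Abridged, hypothetical status of a pod that kubectl lists as "Completed"
status:
  phase: Succeeded          # pod phase is Succeeded, never Failed, so RemoveFailedPods skips it
  containerStatuses:
  - name: main
    state:
      terminated:
        exitCode: 0
        reason: Completed   # "Completed" exists only here, as a container status reason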

All of that is to say that you may want to use PodLifeTime instead, since it checks both pod status reasons and container status reasons (ref).
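If you do stay with RemoveFailedPods, note that pods kubectl shows as Error often carry no pod-level status reason at all, so another option is to drop the reasons filter; my understanding is that an empty or omitted reasons list means any pod in the Failed phase is considered, but please verify that against the RemoveFailedPods docs for your release. A rough sketch, not a verified configuration:

# Rough sketch: evict any pod that has been in the Failed phase for at least an hour.
# Assumes that omitting "reasons" disables reason filtering; verify against the
# RemoveFailedPods documentation for the release you run.
- name: "RemoveFailedPods"
  args:
    minPodLifetimeSeconds: 3600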

Question for maintainers - we should either merge the two strategies and formally retire RemoveFailedPods or be consistent in the reasons that we check (check pod status and container status in both strategies).

ingvagabund commented 5 months ago

Question for maintainers - we should either merge the two strategies and formally retire RemoveFailedPods

A similar request for merging strategies is in https://github.com/kubernetes-sigs/descheduler/issues/1169. It is worth extending that issue with RemoveFailedPods for further analysis and discussion.

y0zg commented 5 months ago

Since I shared the config (which may well contain incorrect settings, as I was testing various options), can you check what should be corrected in order to delete "failed" pods, i.e. pods with status Error?

Much appreciated!

HaveFun83 commented 4 months ago

I stumbled upon this because I want to clean up some "Completed" pods which are not owned by Jobs. A first try with "RemoveFailedPods" ignored the "Completed" pods, as you mentioned; maybe a hint in the plugin documentation would make this clear. A second try using "PodLifeTime" shows me the following errors:

          - name: "PodLifeTime"
            args:
              maxPodLifeTimeSeconds: 604800 # 7 days
              states:
              - "Completed"     
E0209 11:23:10.180425       1 server.go:96] "descheduler server" err="in profile ProfileName: states must be one of [Running Pending PodInitializing ContainerCreating ImagePullBackOff]"
E0209 11:23:10.180523       1 run.go:74] "command failed" err="in profile ProfileName: states must be one of [Running Pending PodInitializing ContainerCreating ImagePullBackOff]"
          - name: "PodLifeTime"
            args:
              maxPodLifeTimeSeconds: 604800 # 7 days
              podStatusPhases:
              - "Completed"
E0209 11:26:43.971091       1 server.go:96] "descheduler server" err="failed decoding descheduler's policy config \"/policy-dir/policy.yaml\": strict decoding error: unknown field \"podStatusPhases\""
E0209 11:26:43.971142       1 run.go:74] "command failed" err="failed decoding descheduler's policy config \"/policy-dir/policy.yaml\": strict decoding error: unknown field \"podStatusPhases\""

descheduler: v0.29.0, k8s: v1.28.6
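Based on those two errors, the args field the strict decoder recognizes is states (podStatusPhases looks like the older strategy-style parameter name), and on this version only the listed values seem to be accepted. A minimal sketch that should at least pass that validation, inferred purely from the error message above; it still does not cover Completed pods:

# Sketch inferred from the validation error above: on descheduler v0.29.0 the
# PodLifeTime states appear limited to the listed set, so "Completed" cannot be
# expressed here.
- name: "PodLifeTime"
  args:
    maxPodLifeTimeSeconds: 604800 # 7 days
    states:
    - "Pending"            # pod phase
    - "ImagePullBackOff"   # container waiting reason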

Anything else I can try?

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 3 weeks ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten