kubernetes-sigs / descheduler

Descheduler for Kubernetes
https://sigs.k8s.io/descheduler
Apache License 2.0

Helm chart does not work with v1alpha2 policy #1268

Closed 5nafu closed 7 months ago

5nafu commented 8 months ago

What version of descheduler are you using?

descheduler version: v0.28.0 Helm chart: descheduler-0.28.0

Does this issue reproduce with the latest release?

Yes

Which descheduler CLI options are you using?

--policy-config-file=/policy-dir/policy.yaml
--descheduling-interval=2m
--v=4

Please provide a copy of your descheduler policy config file

What k8s version are you using (kubectl version)?

kubectl version Output
$ kubectl version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.25.14-eks-f8587cb
WARNING: version difference between client (1.28) and server (1.25) exceeds the supported minor version skew of +/-1

What did you do?

When deploying descheduler via the Helm chart with policy version v1alpha2 (see the values.yaml below), the chart's default strategies configuration is still added to the rendered configmap (also below).

values.yaml
cmdOptions:
  v: 4
deschedulerPolicy:
  profiles:
  - name: mEKS-default
    pluginConfig:
    - args:
        evictFailedBarePods: false
        evictLocalStoragePods: true
        evictSystemCriticalPods: false
        ignorePvcPods: false
        nodeFit: false
      name: DefaultEvictor
    - name: RemovePodsViolatingInterPodAntiAffinity
    - name: RemovePodsViolatingNodeTaints
    - args:
        podRestartThreshold: 50
      name: RemovePodsHavingTooManyRestarts
    - args:
        topologyBalanceNodeFit: false
      name: RemovePodsViolatingTopologySpreadConstraint
    plugins:
      balance:
      - RemovePodsViolatingTopologySpreadConstraint
      deschedule:
      - RemovePodsViolatingInterPodAntiAffinity
      - RemovePodsViolatingNodeTaints
      - RemovePodsHavingTooManyRestarts
deschedulerPolicyAPIVersion: descheduler/v1alpha2
deschedulingInterval: 2m
kind: Deployment
resources:
  limits:
    cpu: 1024m
    memory: 512Mi
  requests:
    cpu: 500m
    memory: 256Mi
securityContext:
  seccompProfile:
    type: RuntimeDefault
service:
  enabled: true
serviceMonitor:
  enabled: true

configmap data/policy.yaml
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
- name: mEKS-default
  pluginConfig:
  - args:
      evictFailedBarePods: false
      evictLocalStoragePods: true
      evictSystemCriticalPods: false
      ignorePvcPods: false
      nodeFit: false
    name: DefaultEvictor
  - name: RemovePodsViolatingInterPodAntiAffinity
  - name: RemovePodsViolatingNodeTaints
  - args:
      podRestartThreshold: 50
    name: RemovePodsHavingTooManyRestarts
  - args:
      topologyBalanceNodeFit: false
    name: RemovePodsViolatingTopologySpreadConstraint
  plugins:
    balance:
    - RemovePodsViolatingTopologySpreadConstraint
    deschedule:
    - RemovePodsViolatingInterPodAntiAffinity
    - RemovePodsViolatingNodeTaints
    - RemovePodsHavingTooManyRestarts
strategies:
  LowNodeUtilization:
    enabled: true
    params:
      nodeResourceUtilizationThresholds:
        targetThresholds:
          cpu: 50
          memory: 50
          pods: 50
        thresholds:
          cpu: 20
          memory: 20
          pods: 20
  RemoveDuplicates:
    enabled: true
  RemovePodsHavingTooManyRestarts:
    enabled: true
    params:
      podsHavingTooManyRestarts:
        includingInitContainers: true
        podRestartThreshold: 100
  RemovePodsViolatingInterPodAntiAffinity:
    enabled: true
  RemovePodsViolatingNodeAffinity:
    enabled: true
    params:
      nodeAffinityType:
      - requiredDuringSchedulingIgnoredDuringExecution
  RemovePodsViolatingNodeTaints:
    enabled: true
  RemovePodsViolatingTopologySpreadConstraint:
    enabled: true
    params:
      includeSoftConstraints: false

What did you expect to see?

A valid configmap for the descheduler.

What did you see instead?

The descheduler fails to start with the error message:

E1018 17:09:41.237490       1 server.go:101] "descheduler server" err="failed decoding descheduler's policy config \"/policy-dir/policy.yaml\": json: cannot unmarshal array into Go struct field Plugins.profiles.plugins.balance of type v1alpha2.PluginSet"

5nafu commented 8 months ago

Update: PEBKAC. I was missing the enabled level in the plugin configuration. The pod now starts, and the superfluous strategies in the configmap are accepted and ignored.
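
For reference, a minimal sketch of the plugins section from the values.yaml above with the missing enabled level added. The nesting follows the error message, which says each extension point expects a PluginSet object rather than a bare list. The fragment is illustrative and has not been run against the chart.

```yaml
# Fragment of values.yaml (sits under deschedulerPolicy.profiles[0]).
# Each extension point takes an object with an "enabled" list,
# not a bare list of plugin names.
plugins:
  balance:
    enabled:
    - RemovePodsViolatingTopologySpreadConstraint
  deschedule:
    enabled:
    - RemovePodsViolatingInterPodAntiAffinity
    - RemovePodsViolatingNodeTaints
    - RemovePodsHavingTooManyRestarts
```

The pluginConfig entries from the original values.yaml should not need to change; only the plugins block gains the extra level.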

a7i commented 8 months ago

While we wait on https://github.com/kubernetes-sigs/descheduler/pull/1139, you could add this to your values.yaml to null out the chart's default strategies:

deschedulerPolicy:
  strategies:

a7i commented 8 months ago

/kind support

ishworgurung commented 8 months ago

FWIW, this works presently:

# https://github.com/kubernetes-sigs/descheduler/blob/master/charts/descheduler/values.yaml
---
kind: CronJob

cmdOptions:
  v: 4

deschedulerPolicyAPIVersion: "descheduler/v1alpha2"

deschedulingInterval: 2m

deschedulerPolicy:
  profiles:
    - name: default
      pluginConfig:
        - name: RemoveDuplicates
      plugins:
        balance:
          enabled:
            - RemoveDuplicates
  strategies:

thanks @a7i

a7i commented 7 months ago

Awesome. Please feel free to reopen if you run into any issues /close

k8s-ci-robot commented 7 months ago

@a7i: Closing this issue.
