kubernetes-sigs / descheduler

Descheduler for Kubernetes
https://sigs.k8s.io/descheduler
Apache License 2.0

RemoveDuplicates doesn't work as expected #241

Closed — jia2 closed this issue 4 years ago

jia2 commented 4 years ago

Here is my Kubernetes version:

kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:40:16Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.9-eks-c0eccc", GitCommit:"c0eccca51d7500bb03b2f163dd8d534ffeb2f7a2", GitTreeState:"clean", BuildDate:"2019-12-22T23:14:11Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

And the version of the descheduler:

descheduler version
Descheduler version {Major:0 Minor:10+ GitCommit:eff8185d7ca801033ee593486fc221c19fd41b42 GitVersion:v0.10.0-2-geff8185d7 BuildDate:2020-02-17T13:17:21+0000 GoVersion:go1.13.4 Compiler:gc Platform:linux/amd64}

I have duplicate pods on one worker node and ran the descheduler v0.10.0 as a Job, but it didn't work as expected. I set the log level to 9, checked the logs, and found that the response to the pod-list request doesn't contain the complete list of pods on the queried node. For example, for the request https://172.20.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dip-10-107-193-192.eu-central-1.compute.internal%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded there are only two pods in the response, although 39 pods are actually running on this node.

But in the logs I can see the request completed with an HTTP 200 status code:

I0218 14:55:44.684433 1 duplicates.go:50] Processing node: "ip-10-107-193-192.eu-central-1.compute.internal"
I0218 14:55:44.707379 1 round_trippers.go:443] GET https://172.20.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dip-10-107-193-192.eu-central-1.compute.internal%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded 200 OK in 22 milliseconds

I'm not sure whether the pod list is written to the log completely or not.

When I run kubectl proxy on my local machine and open the URL http://localhost:8001/api/v1/pods?fieldSelector=spec.nodeName=ip-10-107-193-192.eu-central-1.compute.internal,status.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded in a browser, it takes much longer than 22 milliseconds, but I get the complete list with 39 pods. Can I increase the timeout value for this request so it waits a little longer?
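
For comparison, the same field selector can also be queried directly with kubectl (a rough equivalent of the request above, just to check what the API server returns; not the descheduler's own code path):

# Same node name and phase filters as in the descheduler request above.
kubectl get pods --all-namespaces \
  --field-selector 'spec.nodeName=ip-10-107-193-192.eu-central-1.compute.internal,status.phase!=Failed,status.phase!=Succeeded'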

Thanks

seanmalloy commented 4 years ago

/kind bug

fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle stale

seanmalloy commented 4 years ago

@jia2 can you try the latest release, v0.18.0, and see if you see the same behavior?

jia2 commented 4 years ago

@seanmalloy I tried with v0.18.0 and it still doesn't work. Here is the load of the worker nodes in my k8s cluster:

[jia@10-105-21-115 (⎈ |abn:kube-system)] ~ $ k top node
NAME                                              CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
ip-10-107-208-158.eu-central-1.compute.internal   1392m        35%    12238Mi         43%
ip-10-107-208-54.eu-central-1.compute.internal    1138m        29%    18999Mi         67%
ip-10-107-208-9.eu-central-1.compute.internal     1809m        46%    17296Mi         61%
ip-10-107-209-10.eu-central-1.compute.internal    1275m        32%    19591Mi         69%
ip-10-107-209-14.eu-central-1.compute.internal    1159m        29%    20616Mi         73%
ip-10-107-209-65.eu-central-1.compute.internal    1815m        46%    18039Mi         64%
ip-10-107-210-10.eu-central-1.compute.internal    1425m        36%    20906Mi         74%
ip-10-107-210-110.eu-central-1.compute.internal   1565m        39%    17132Mi         60%

For example, on worker node ip-10-107-208-54.eu-central-1.compute.internal I can see 3 deployments whose duplicate pods are running (screenshot omitted). After running the descheduler job, those pods are still running on the same worker node.

I checked the log of the job; the worker node was processed by the descheduler:

11:51:57.466555 1 duplicates.go:49] Processing node: "ip-10-107-208-54.eu-central-1.compute.internal"

The request below also got a PodList response, although it is truncated in the log because the text is too long:

11:51:57.498850 1 round_trippers.go:443] GET https://172.20.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dip-10-107-208-54.eu-central-1.compute.internal%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded 200 OK in 32 milliseconds

I'm not sure what I did wrong that makes the "RemoveDuplicates" strategy not work for me.
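
For reference, a minimal RemoveDuplicates policy of the kind the job mounts looks roughly like this (a sketch; the exact file in my cluster is not pasted here and comes from a ConfigMap):

# Minimal DeschedulerPolicy with only RemoveDuplicates enabled
# (written to a local file here purely for illustration).
cat > policy.yaml <<'EOF'
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "RemoveDuplicates":
    enabled: true
EOF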

seanmalloy commented 4 years ago

/remove-lifecycle stale

@jia2 please provide the information below so that we can continue to assist with troubleshooting.

jia2 commented 4 years ago

k describe pod ppr-timetableout-camadapter-v01-deploy-7589b4858c-fkgmm

Name:                 ppr-timetableout-camadapter-v01-deploy-7589b4858c-fkgmm
Namespace:            cloudia-abn
Priority:             1000600
Priority Class Name:  ppr-priorityclass
Node:                 ip-10-107-209-28.eu-central-1.compute.internal/10.107.209.28
Start Time:           Thu, 11 Jun 2020 01:20:18 +0200
Labels:               app=cloudia
                      creation_date=08.06.20_1535
                      docker_image_tag=OW_AMQ_WMQ_LOOP-v0.1.1624-SADe_06_2020
                      environment=cloudia-abn
                      name=ppr-timetableout-camadapter-v01-pod
                      pod-template-hash=7589b4858c
                      serviceversion=ppr-timetableout-camadapter-v01_1.3.3_3573
                      svc=ppr-timetableout-camadapter-v01
Annotations:          kubernetes.io/psp: eks.privileged
Status:               Running
IP:                   10.107.209.15
Controlled By:        ReplicaSet/ppr-timetableout-camadapter-v01-deploy-7589b4858c
Init Containers:
  ppr-timetableout-camadapter-v01-init:
    Container ID:  docker://cc56c328a4b637cdbfc9fc3e1eeaa395bff63d6a8377b980d6092fef9c403bff
    Image:         368971480733.dkr.ecr.eu-central-1.amazonaws.com/cloudia/aws-cli:1.18
    Image ID:      docker-pullable://368971480733.dkr.ecr.eu-central-1.amazonaws.com/cloudia/aws-cli@sha256:37ff2cda184f684c87732cd4d6fc5dd263bac99a149a00e789f7244f99c09b81
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -c
      aws s3 cp --quiet s3://$(S3_BUCKET)/$(S3_SERVICE_FOLDER) . --recursive;
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 11 Jun 2020 01:20:38 +0200
      Finished:     Thu, 11 Jun 2020 01:20:45 +0200
    Ready:          True
    Restart Count:  0
    Environment:
      S3_SERVICE_FOLDER:  cloudia-abn/services/ppr-timetableout-camadapter-v01
      S3_BUCKET:          368971480733-cloudia-deploy
    Mounts:
      /project from configdir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-8rts8 (ro)
Containers:
  ppr-timetableout-camadapter-v01:
    Container ID:  docker://517b0e94943d12e575369e88e7a3f6f74c0808d5882b33c87bf5ec0a01fc4b5c
    Image:         368971480733.dkr.ecr.eu-central-1.amazonaws.com/cloudia/ow_amq_wmq_loop:OW_AMQ_WMQ_LOOP-v0.1.1624-SADe_06_2020
    Image ID:      docker-pullable://368971480733.dkr.ecr.eu-central-1.amazonaws.com/cloudia/ow_amq_wmq_loop@sha256:a27e221ef8d6a4840a060eec390727f88e0927e33ebbe03766e7ff45eab607fc
    Ports:         8443/TCP, 8084/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /bin/sh
      -c
      cd /opt/cloudia/ppr/ppr-timetableout-camadapter-v01; java -DSERVICE_CONTAINER=ppr-timetableout-camadapter-v01 -Dspring.profiles.active=cloudia-abn -Dcom.ibm.mq.cfg.useIBMCipherMappings=false -Dspring.config.location=file:/opt/config/app-config.yaml,file:/opt/config/env.yaml,file:/opt/cloudia-config-secret/cloudia-config-secret.yaml,file:/opt/mq-config-secret/mq-connection.yaml,classpath:config/application-generic-aws.yml,classpath:config/application-utils-aws.yml -Xms1000m -Xmx1000m -jar /opt/ow-amq-wmq-loop.jar
    State:          Running
      Started:      Thu, 11 Jun 2020 01:21:32 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     400m
      memory:  1000Mi
    Requests:
      cpu:      100m
      memory:   500Mi
    Liveness:   http-get http://:8084/PPR/Timetableout/CAMAdapter/V01/health delay=200s timeout=10s period=80s #success=1 #failure=5
    Readiness:  http-get http://:8084/PPR/Timetableout/CAMAdapter/V01/health delay=220s timeout=5s period=50s #success=1 #failure=3
    Environment:
      STAKATER_PPR_TIMETABLEOUT_CAMADAPTER_V01_CM_CONFIGMAP:  0f49f7a571f36e29f7482eff42e568097fd377e5
      STAKATER_MQ_CONFIG_SECRET_SECRET:                       a715da5e437d474638e45bafc8ba7d755da423a5
      STAKATER_CLOUDIA_TRUSTSTORE_CM_CONFIGMAP:               da39a3ee5e6b4b0d3255bfef95601890afd80709
      STAKATER_CLOUDIA_CONFIG_SECRET_SECRET:                  52370aed941e0d62c296351cfe5d258b8128949f
    Mounts:
      /etc/certs from certs-volume (rw)
      /opt/cloudia-config-secret from cloudia-config-volume (ro)
      /opt/cloudia/ppr/ppr-timetableout-camadapter-v01 from configdir (rw)
      /opt/config from appconfig-volume (rw)
      /opt/mq-config-secret from mq-config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-8rts8 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  configdir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  appconfig-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ppr-timetableout-camadapter-v01-cm
    Optional:  false
  certs-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      cloudia-truststore-cm
    Optional:  false
  cloudia-config-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cloudia-config-secret
    Optional:    false
  mq-config-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  mq-config-secret
    Optional:    false
  default-token-8rts8:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-8rts8
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

k describe pod ppr-timetableout-camadapter-v01-deploy-7589b4858c-f4gmg

Name:                 ppr-timetableout-camadapter-v01-deploy-7589b4858c-f4gmg
Namespace:            cloudia-abn
Priority:             1000600
Priority Class Name:  ppr-priorityclass
Node:                 ip-10-107-209-28.eu-central-1.compute.internal/10.107.209.28
Start Time:           Thu, 11 Jun 2020 01:20:21 +0200
Labels:               app=cloudia
                      creation_date=08.06.20_1535
                      docker_image_tag=OW_AMQ_WMQ_LOOP-v0.1.1624-SADe_06_2020
                      environment=cloudia-abn
                      name=ppr-timetableout-camadapter-v01-pod
                      pod-template-hash=7589b4858c
                      serviceversion=ppr-timetableout-camadapter-v01_1.3.3_3573
                      svc=ppr-timetableout-camadapter-v01
Annotations:          kubernetes.io/psp: eks.privileged
Status:               Running
IP:                   10.107.209.247
Controlled By:        ReplicaSet/ppr-timetableout-camadapter-v01-deploy-7589b4858c
Init Containers:
  ppr-timetableout-camadapter-v01-init:
    Container ID:  docker://085cc93a79301ef1a8cca273db5d177802a3856c2b645497dece008c78d16c40
    Image:         368971480733.dkr.ecr.eu-central-1.amazonaws.com/cloudia/aws-cli:1.18
    Image ID:      docker-pullable://368971480733.dkr.ecr.eu-central-1.amazonaws.com/cloudia/aws-cli@sha256:37ff2cda184f684c87732cd4d6fc5dd263bac99a149a00e789f7244f99c09b81
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -c
      aws s3 cp --quiet s3://$(S3_BUCKET)/$(S3_SERVICE_FOLDER) . --recursive;
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 11 Jun 2020 01:20:33 +0200
      Finished:     Thu, 11 Jun 2020 01:20:43 +0200
    Ready:          True
    Restart Count:  0
    Environment:
      S3_SERVICE_FOLDER:  cloudia-abn/services/ppr-timetableout-camadapter-v01
      S3_BUCKET:          368971480733-cloudia-deploy
    Mounts:
      /project from configdir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-8rts8 (ro)
Containers:
  ppr-timetableout-camadapter-v01:
    Container ID:  docker://d5164225f71a6b50c4380556979bc7c00fe38613ad16ecd1faeace017bfd1bf9
    Image:         368971480733.dkr.ecr.eu-central-1.amazonaws.com/cloudia/ow_amq_wmq_loop:OW_AMQ_WMQ_LOOP-v0.1.1624-SADe_06_2020
    Image ID:      docker-pullable://368971480733.dkr.ecr.eu-central-1.amazonaws.com/cloudia/ow_amq_wmq_loop@sha256:a27e221ef8d6a4840a060eec390727f88e0927e33ebbe03766e7ff45eab607fc
    Ports:         8443/TCP, 8084/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /bin/sh
      -c
      cd /opt/cloudia/ppr/ppr-timetableout-camadapter-v01; java -DSERVICE_CONTAINER=ppr-timetableout-camadapter-v01 -Dspring.profiles.active=cloudia-abn -Dcom.ibm.mq.cfg.useIBMCipherMappings=false -Dspring.config.location=file:/opt/config/app-config.yaml,file:/opt/config/env.yaml,file:/opt/cloudia-config-secret/cloudia-config-secret.yaml,file:/opt/mq-config-secret/mq-connection.yaml,classpath:config/application-generic-aws.yml,classpath:config/application-utils-aws.yml -Xms1000m -Xmx1000m -jar /opt/ow-amq-wmq-loop.jar
    State:          Running
      Started:      Thu, 11 Jun 2020 01:21:42 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     400m
      memory:  1000Mi
    Requests:
      cpu:      100m
      memory:   500Mi
    Liveness:   http-get http://:8084/PPR/Timetableout/CAMAdapter/V01/health delay=200s timeout=10s period=80s #success=1 #failure=5
    Readiness:  http-get http://:8084/PPR/Timetableout/CAMAdapter/V01/health delay=220s timeout=5s period=50s #success=1 #failure=3
    Environment:
      STAKATER_PPR_TIMETABLEOUT_CAMADAPTER_V01_CM_CONFIGMAP:  0f49f7a571f36e29f7482eff42e568097fd377e5
      STAKATER_MQ_CONFIG_SECRET_SECRET:                       a715da5e437d474638e45bafc8ba7d755da423a5
      STAKATER_CLOUDIA_TRUSTSTORE_CM_CONFIGMAP:               da39a3ee5e6b4b0d3255bfef95601890afd80709
      STAKATER_CLOUDIA_CONFIG_SECRET_SECRET:                  52370aed941e0d62c296351cfe5d258b8128949f
    Mounts:
      /etc/certs from certs-volume (rw)
      /opt/cloudia-config-secret from cloudia-config-volume (ro)
      /opt/cloudia/ppr/ppr-timetableout-camadapter-v01 from configdir (rw)
      /opt/config from appconfig-volume (rw)
      /opt/mq-config-secret from mq-config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-8rts8 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  configdir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  appconfig-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ppr-timetableout-camadapter-v01-cm
    Optional:  false
  certs-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      cloudia-truststore-cm
    Optional:  false
  cloudia-config-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cloudia-config-secret
    Optional:    false
  mq-config-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  mq-config-secret
    Optional:    false
  default-token-8rts8:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-8rts8
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

lixiang233 commented 4 years ago

@jia2 I noticed that your pods have EmptyDir volumes:

Volumes:
  configdir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>

The descheduler does some checks before evicting pods, and pods with an EmptyDir volume (local storage) won't pass these checks, so they won't be evicted. If you want the descheduler to evict them anyway, you can add the annotation descheduler.alpha.kubernetes.io/evict to the pods, as in the sketch below.
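
For example, one way to add the annotation to the pod template of an affected Deployment (the Deployment name below is inferred from the ReplicaSet in your describe output, and the annotation value is arbitrary since, as far as I can tell, only the key is checked; please adjust to your manifests):

# Add the evict annotation to the Deployment's pod template so newly created pods carry it.
# Annotating the running pods directly also works, but is lost when they are recreated.
kubectl -n cloudia-abn patch deployment ppr-timetableout-camadapter-v01-deploy \
  --type merge \
  -p '{"spec":{"template":{"metadata":{"annotations":{"descheduler.alpha.kubernetes.io/evict":"true"}}}}}'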

You can refer to this section for more information.

seanmalloy commented 4 years ago

The other option is to run the descheduler with the --evict-local-storage-pods CLI option. This will enable evicting pods that have local storage.
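
For example, the container command in the descheduler Job spec would look roughly like this (the policy file path and verbosity are placeholders; adjust to your setup):

# Sketch of the descheduler invocation inside the Job container,
# with local-storage eviction enabled via the CLI flag.
/bin/descheduler --policy-config-file /policy-dir/policy.yaml \
  --evict-local-storage-pods \
  --v 3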

jia2 commented 4 years ago

Thank @lixiang233 and @seanmalloy for your hints.

seanmalloy commented 4 years ago

/kind documentation
/remove-kind bug
/close

@jia2 I'm closing this issue because this is expected behavior. By default pods with local storage will not be evicted by the descheduler. Feel free to reopen this issue or post in sig-scheduling on k8s Slack if you need further assistance. Thanks!

k8s-ci-robot commented 4 years ago

@seanmalloy: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/descheduler/issues/241#issuecomment-655093757):

> /kind documentation
> /remove-kind bug
> /close
>
> @jia2 I'm closing this issue because this is expected behavior. By default pods with local storage will not be evicted by the descheduler. Feel free to reopen this issue or post in `sig-scheduling` on k8s Slack if you need further assistance. Thanks!

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.