kubernetes / autoscaler

Autoscaling components for Kubernetes

Cluster Autoscaler scaling down the on-demand node group's node without any reason #7451

Open AmanPathak-DevOps opened 3 weeks ago

AmanPathak-DevOps commented 3 weeks ago

Which component are you using?:

cluster-autoscaler

What version of the component are you using?:

Component version: v1.31.0 (the registry.k8s.io/autoscaling/cluster-autoscaler:v1.31.0 image in the Deployment below)

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Client Version: v1.30.1
Kustomize Version: v5.0.4-0.*********
Server Version: v1.30.4-eks-a737599

What environment is this in?:

It is a dev environment on the AWS cloud (EKS).

What did you expect to happen?:

Basically, I am using Cluster Autoscaler to autoscale nodes across two node groups (an on-demand node group and a Spot node group). I have implemented the Node Termination Handler (NTH) and the priority expander to give preference to Spot instances; since a Spot instance can be reclaimed due to the bidding system, NTH handles those interruptions. However, I cannot figure out why the on-demand node goes down and sits in Unknown status for more than 6-8 hours. Also, when an on-demand node goes down, CA should create a new one, but it is not doing so and takes more than 5-6 hours.
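
Since --expander=priority is set in the Deployment below, CA reads node group priorities from the cluster-autoscaler-priority-expander ConfigMap in kube-system (the Role below also grants access to it). A minimal sketch of that ConfigMap; the .*spot.* and .*on-demand.* regexes are illustrative placeholders for our actual ASG name patterns:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  # Higher number = higher priority; the regexes below are placeholders
  # that should match the real ASG names of each node group.
  priorities: |-
    50:
      - .*spot.*
    10:
      - .*on-demand.*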

What happened instead?:

I expect CA to scale nodes up and down within a reasonable time, e.g. 5-10 minutes, but the on-demand instances are going down without any reason, and CA takes more than 5-6 hours to create replacement on-demand nodes. Because the node is in Unknown status, all the pods that were running on it are stuck in Terminating for more than 4-5 hours, which is very frustrating, as I am facing downtime due to the RollingUpdate strategy.

How to reproduce it (as minimally and precisely as possible):

Here are the manifests for the deployment and the Kubernetes components that I am using for the CA:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<Role-ARN>
  name: cluster-autoscaler
  namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources:
      - "namespaces"
      - "pods"
      - "services"
      - "replicationcontrollers"
      - "persistentvolumeclaims"
      - "persistentvolumes"
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create"]
  - apiGroups: ["coordination.k8s.io"]
    resourceNames: ["cluster-autoscaler"]
    resources: ["leases"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      priorityClassName: system-cluster-critical
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
        fsGroup: 65534
        seccompProfile:
          type: RuntimeDefault
      serviceAccountName: cluster-autoscaler
      containers:
        - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.31.0
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 600Mi
            requests:
              cpu: 100m
              memory: 600Mi
          command: 
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=priority
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<cluster-name>
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            readOnlyRootFilesystem: true
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-bundle.crt"

Anything else we need to know?:

I1101 04:02:57.320439 1 aws_manager.go:188] Found multiple availability zones for ASG "-e2c94f70-6b1c-a9af-47eb-fea9a5915955"; using ap-south-1c for failure-domain.beta.kubernetes.io/zone label
I1101 04:02:57.320612 1 filter_out_schedulable.go:66] Filtering out schedulables
I1101 04:02:57.320714 1 klogx.go:87] failed to find place for logging/fluentd-jswwh: cannot put pod fluentd-jswwh on any node
I1101 04:02:57.320729 1 filter_out_schedulable.go:123] 0 pods marked as unschedulable can be scheduled.
I1101 04:02:57.320738 1 filter_out_schedulable.go:86] No schedulable pods
I1101 04:02:57.320743 1 filter_out_daemon_sets.go:40] Filtering out daemon set pods
I1101 04:02:57.320748 1 filter_out_daemon_sets.go:49] Filtered out 1 daemon set pods, 0 unschedulable pods left
I1101 04:02:57.320766 1 static_autoscaler.go:557] No unschedulable pods
I1101 04:02:57.320797 1 static_autoscaler.go:580] Calculating unneeded nodes
I1101 04:02:57.320812 1 pre_filtering_processor.go:67] Skipping ip-10-1-137-190.ap-south-1.compute.internal - node group min size reached (current: 1, min: 1)
I1101 04:02:57.320898 1 eligibility.go:104] Scale-down calculation: ignoring 5 nodes unremovable in the last 5m0s
I1101 04:02:57.320940 1 static_autoscaler.go:623] Scale down status: lastScaleUpTime=2024-11-01 03:42:50.431875787 +0000 UTC m=+127994.930106250 lastScaleDownDeleteTime=2024-10-31 06:18:29.370821589 +0000 UTC m=+50933.869052042 lastScaleDownFailTime=2024-10-30 15:09:57.022381669 +0000 UTC m=-3578.479387878 scaleDownForbidden=false scaleDownInCooldown=false
I1101 04:02:57.320969 1 static_autoscaler.go:644] Starting scale down

Node Status

ip-10-1-137-190.ap-south-1.compute.internal NotReady 23h v1.30.4-eks-a737599
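
For reference, these are the commands I run to inspect the node and CA state (cluster-autoscaler-status is CA's default status ConfigMap name):

# Node conditions and recent events for the NotReady node
kubectl describe node ip-10-1-137-190.ap-south-1.compute.internal

# CA's own view of node group health and scale-down candidates
kubectl -n kube-system get configmap cluster-autoscaler-status -o yaml

# Recent scale-down decisions in the CA logs
kubectl -n kube-system logs deployment/cluster-autoscaler | grep -i "scale down" | tail -n 50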

adrianmoisey commented 3 weeks ago

/area cluster-autoscaler