/sig scheduling /sig autoscaling
/remove-sig scheduling /remove-sig autoscaling /sig node /sig networking
I think the fix here should be to make kube-proxy part of the Node readiness checks.
@alculquicondor: The label(s) sig/networking cannot be applied, because the repository doesn't have them.
> I think the fix here should be to make kube-proxy part of the Node readiness checks.
But is this cluster using kube-proxy as a static pod? The issue says this is a vSphere cluster with a VMware OS; I thought those were using Antrea ...
What happens if there are more static pods? You would have to add all the static pods as part of the node readiness check. That would solve the scheduling problem, but it would impact node startup readiness.
I see, thanks for the clarification.
But more generally, any static pod would cause problems for scheduling.
> You would have to add all the static pods as part of the node readiness check. That would solve the scheduling problem, but it would impact node startup readiness.
But can a node really be considered ready if the static pods are not ready?
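For anyone digging into this, here is a rough way to see which static Pods a node is carrying, and therefore how many of its pod slots are consumed outside the scheduler's view. This is only a sketch: `<node-name>` is a placeholder, and it relies on the `kubernetes.io/config.mirror` annotation that the kubelet puts on mirror Pods for static Pods.

```sh
# Sketch: list mirror Pods (the API-side representation of static Pods) on a node.
# Replace <node-name> with the node you are inspecting.
kubectl get pods -A --field-selector spec.nodeName=<node-name> -o json \
  | jq -r '.items[]
           | select(.metadata.annotations["kubernetes.io/config.mirror"] != null)
           | "\(.metadata.namespace)/\(.metadata.name)"'
```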
/triage accepted
/cc
This issue has not been updated in over 1 year, and should be re-triaged.
You can:
- confirm that this issue is still relevant with /triage accepted (org members only)
- close this issue with /close
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
/close as duplicate of #115325
@alculquicondor: Closing this issue.
What happened?
On a k8s 1.26 cluster with Cluster Autoscaler enabled (min size 1, max size 5), an application deployment was scaled out to 350 replicas. The cluster initially had 1 worker node; after the deployment was scaled out, Pods went to Pending state and triggered the Cluster Autoscaler to scale out to 4 worker nodes.
After the new nodes became Ready, Pods were scheduled onto them, but one Pod went into the "OutOfPods" state.
```
QoS Class:                   BestEffort
Node-Selectors:
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Topology Spread Constraints: kubernetes.io/hostname:ScheduleAnyway when max skew 1 is exceeded for selector app.kubernetes.io/name=argocd-dex-server
Events:
  Type     Reason            Age                From                Message
  ----     ------            ----               ----                -------
  Normal   TriggeredScaleUp  25m                cluster-autoscaler  pod triggered scale-up: [{MachineDeployment/autoscaler-cc265-11ns1/autoscaler-cc265-11ns1-c1-np-1-worker-7wmpd 1->4 (max: 5)}]
  Warning  FailedScheduling  24m (x2 over 25m)  default-scheduler   0/2 nodes are available: 1 Too many pods, 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 Preemption is not helpful for scheduling..
  Warning  FailedScheduling  22m                default-scheduler   0/5 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 Too many pods, 2 node(s) had untolerated taint {node.kubernetes.io/not-ready: }. preemption: 0/5 nodes are available: 2 No preemption victims found for incoming pod, 3 Preemption is not helpful for scheduling..
  Normal   Scheduled         22m                default-scheduler   Successfully assigned default/argocd-dex-server-58cb8749b4-zmzbj to autoscaler-cc265-11ns1-c1-np-1-worker-7wmpd-696954db68m8
  Warning  OutOfpods         22m                kubelet             Node didn't have enough resource: pods, requested: 1, used: 110, capacity: 110
```
What did you expect to happen?
The Pod that cannot be accommodated on a worker node should not have been scheduled onto it and should have stayed in the "Pending" state. This would have triggered a Cluster Autoscaler scale-out, and the Pod would then have been scheduled onto the newly added node.
How can we reproduce it (as minimally and precisely as possible)?
Create a cluster with Cluster Autoscaler enabled, with min = 1 and max = 5. Scale out an application deployment to more than 300 replicas.
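A minimal reproduction sketch, assuming a cluster that already runs Cluster Autoscaler with those bounds; the deployment name and image below are illustrative, not the ones from the original report:

```sh
# Assumes Cluster Autoscaler is already configured with min=1 / max=5 worker nodes.
# Deployment name and image are placeholders.
kubectl create deployment scale-test --image=registry.k8s.io/pause:3.9
kubectl scale deployment scale-test --replicas=350

# Watch for Pods that get bound to a freshly added node and then fail with OutOfpods.
kubectl get pods -o wide --field-selector=status.phase=Failed -w
```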
Anything else we need to know?
Slack discussion: https://kubernetes.slack.com/archives/C09TP78DV/p1691159010212509
The scheduler does not take into account static Pods that are still coming up on a newly scaled-out node before it schedules the Pending Pods onto that node. Hence the scheduled Pod gets pushed to the "OutOfPods" state after being bound to the new node.
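One rough way to observe this mismatch (sketch only; the node name is taken from the events above) is to compare the node's pod capacity with the Pods the kubelet has actually admitted, including the static/mirror Pods the scheduler never placed:

```sh
NODE=autoscaler-cc265-11ns1-c1-np-1-worker-7wmpd-696954db68m8  # node from the events above

# Pod capacity the scheduler budgets against (110 in this case):
kubectl get node "$NODE" -o jsonpath='{.status.capacity.pods}{"\n"}'

# Pods the kubelet has admitted on that node, including static/mirror Pods that
# were not accounted for when the pending Pod was bound:
kubectl get pods -A --field-selector spec.nodeName="$NODE" --no-headers | wc -l
```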
Kubernetes version
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)