kubernetes-retired / kube-batch

A batch scheduler for Kubernetes for high-performance workloads, e.g. AI/ML, BigData, HPC
Apache License 2.0

Question: Gang scheduling not working when CPU usage is close to the node's allocatable limit and pods are still scheduled #938

Closed · qw2208 closed this 4 years ago

qw2208 commented 4 years ago

Hi,

/kind bug

I've run into a problem with gang scheduling in kube-batch. When the CPU requests are close to the node's allocatable limit, gang scheduling seems to fail: one of the pods does not start and is rejected with an error by the kubelet, while the other pods keep running. The pods already running on the saturated node were scheduled hours earlier.

See the following experiment as an example. I've created a PodGroup and a batch of pods with the following configs:

[image: PodGroup and pod specs]

What I expect is that all the pods stay Pending and fail to be scheduled. However, all the pods got scheduled, but one of them reported an OutOfcpu error:

[image: pod list showing the OutOfcpu pod]

The node status is as follows:

[image: node capacity and allocated resources]
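For readers without the screenshots, here is a minimal sketch of the kind of manifests involved, assuming the v1alpha1 PodGroup API and the group-name annotation from the kube-batch tutorial; the names, gang size, and resource values below are hypothetical, since the actual configs were only attached as images:

```yaml
# Hypothetical reproduction: a PodGroup that requires all members to be
# schedulable together, plus one of its member pods.
apiVersion: scheduling.incubator.k8s.io/v1alpha1
kind: PodGroup
metadata:
  name: demo-gang
spec:
  minMember: 4          # gang size; hypothetical value
---
apiVersion: v1
kind: Pod
metadata:
  name: demo-gang-0
  annotations:
    scheduling.k8s.io/group-name: demo-gang   # ties the pod to the PodGroup
spec:
  schedulerName: kube-batch                   # let kube-batch place this pod
  containers:
  - name: worker
    image: busybox
    command: ["sleep", "3600"]
    resources:
      requests:
        cpu: "1"        # per-pod request; the sum is close to node allocatable
        memory: 1Gi
```

The expectation with such a spec is all-or-nothing: if the gang's total request does not fit, every member should stay Pending rather than some members being bound and one later failing at the kubelet.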

Another observation: if the gap between the total request and the allocatable resources is large, kube-batch keeps all the pods pending, which matches the expectation.

qw2208 commented 4 years ago

/cc @k82cn Just a friendly ping :)

Cherishty commented 4 years ago

Hi Team,

I'm hitting the same issue as well, and my scenario is:

I have a pod group that contains 4 pods, each requesting 1 CPU and 1 GB memory. Although my cluster only has 3.6 allocatable CPU, these pods are all scheduled onto nodes; then the kubelet on one node prints an error message like the one above, and one pod fails to be created.

However, if I request 4 × 1.5 CPU / 1 GB, gang scheduling works and all pods are rejected from scheduling (they stay pending).

Note: in my scenario I reserve 0.5 CPU per node for system use when initializing the k8s cluster, which means my VM has 8 CPUs but only 7.5 CPUs are allocatable (it seems @qw2208 also has this configuration). Is this the root cause?
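For illustration, a minimal sketch of how such a 0.5 CPU reservation might be expressed, assuming the standard KubeletConfiguration file (the exact flags or config used in this cluster were not posted; values are illustrative):

```yaml
# Reserve 0.5 CPU for system daemons, which lowers the node's allocatable
# CPU from 8 to 7.5 while capacity stays at 8.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
systemReserved:
  cpu: 500m
```

The scheduler is expected to fit pods against the 7.5-CPU allocatable figure rather than the 8-CPU capacity, which would explain the kubelet rejecting the extra pod with OutOfcpu even though kube-batch had admitted the whole gang.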

fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot commented 4 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

fejta-bot commented 4 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

k8s-ci-robot commented 4 years ago

@fejta-bot: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/kube-batch/issues/938#issuecomment-702809485):

> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen`.
> Mark the issue as fresh with `/remove-lifecycle rotten`.
>
> Send feedback to sig-testing, kubernetes/test-infra and/or [fejta](https://github.com/fejta).
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.