Closed mf-giwoong-lee closed 1 year ago
I have a GPU machine with 4 GPUs.
All GPUs have the same memory capacity (23 GiB).
I run 5 GPU pods, each of which requests 1 GPU and 10 GiB of GPU memory.
However, Kubernetes launches only 4 of the pods, all on GPU 0 and GPU 1, and the remaining pod stays in the Pending state.
I expected gpushare-scheduler to place all 5 pods on this machine, since GPU 2 and GPU 3 are still unused, but only 4 pods are launched.
Why does this happen?
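
To make the expectation concrete, here is a rough capacity check. It is only a sketch of the arithmetic, assuming pods cannot span GPUs and that placement is a simple first-fit; the function names `node_capacity` and `first_fit` are hypothetical and this is not gpushare-scheduler's actual code.

```python
def node_capacity(gpu_mem_gib, pod_mem_gib):
    """Number of pods that fit if each pod must live entirely on one GPU."""
    return sum(mem // pod_mem_gib for mem in gpu_mem_gib)


def first_fit(gpu_mem_gib, pod_mem_gib, num_pods):
    """Place each pod on the first GPU with enough free memory.

    Returns one GPU index per pod, or None for a pod that does not fit.
    This is an assumed placement policy, used only for illustration.
    """
    free = list(gpu_mem_gib)
    placement = []
    for _ in range(num_pods):
        for idx, mem in enumerate(free):
            if mem >= pod_mem_gib:
                free[idx] -= pod_mem_gib
                placement.append(idx)
                break
        else:
            placement.append(None)  # no GPU has room for this pod
    return placement


gpus = [23, 23, 23, 23]   # 4 GPUs, 23 GiB each
pod_mem = 10              # each pod requests 10 GiB

capacity = node_capacity(gpus, pod_mem)
placement = first_fit(gpus, pod_mem, 5)
print(capacity)   # 8
print(placement)  # [0, 0, 1, 1, 2]
```

By this arithmetic the node should hold up to 8 such pods, and the fifth pod should land on GPU 2 instead of staying Pending.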