Open kunal642 opened 6 months ago
@archlitchi Is the plugin version 1.9.0 compatible with volcano 1.8.2?
I recommend using 1.9.0.
hey @archlitchi,
We got hard isolation working by explicitly mounting "/tmp/gpu" and "/tmp/gpulock" into the container.
Can you explain why we are not able to assign more than 4 vGPUs to a single pod (we have 4 GPU cards on a single node)?
Yes, there are only 4 devices in the /dev folder, so you can use at most 4 GPUs. We can't mount a nonexistent GPU device into a container and have it recognized by the NVIDIA driver.
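To make the limit above concrete, here is a minimal sketch that counts the GPU device nodes on a node and checks whether a requested vGPU count can fit in one pod. It assumes the NVIDIA driver exposes one node per physical GPU named /dev/nvidia0, /dev/nvidia1, and so on. The function names are hypothetical and are not part of the device plugin.

```python
import os
import re

# Assumption: physical GPUs appear as /dev/nvidia<N>; control nodes such as
# nvidiactl or nvidia-uvm do not match this pattern and are not counted.
_GPU_NODE = re.compile(r"^nvidia\d+$")


def count_gpu_device_nodes(dev_dir="/dev"):
    """Count entries in dev_dir that look like per-GPU device nodes."""
    return sum(1 for entry in os.listdir(dev_dir) if _GPU_NODE.match(entry))


def fits_in_one_pod(requested_vgpus, dev_dir="/dev"):
    """A pod cannot be given more vGPUs than there are device nodes to mount."""
    return requested_vgpus <= count_gpu_device_nodes(dev_dir)


if __name__ == "__main__":
    print("GPU device nodes:", count_gpu_device_nodes())
```

On a node with 4 GPU cards this would report 4 device nodes, so a request for 5 vGPUs in a single pod cannot be satisfied.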
Does this mean that the device plugin only restricts memory and not compute resources?
If not, how can a pod use the full GPU with the vgpu config?
It can restrict compute resources if you specify volcano.sh/vgpu-cores. If you want to use the full GPU, specify only volcano.sh/vgpu-number inside the task.
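As an illustration of the two cases described above, here is a rough sketch that builds a pod manifest as a Python dict and prints it as JSON, which kubectl accepts. The resource names volcano.sh/vgpu-number and volcano.sh/vgpu-cores and the paths /tmp/gpu and /tmp/gpulock come from this thread. Everything else is an assumption for illustration: the function name, pod name, image, volume names, treating the vgpu-cores value as a percentage, and mounting the two host directories at the same paths inside the container.

```python
import json


def build_vgpu_pod(name, image, vgpu_number, vgpu_cores=None):
    """Return a pod manifest (dict) that requests vGPUs via Volcano."""
    limits = {"volcano.sh/vgpu-number": vgpu_number}
    if vgpu_cores is not None:
        # Assumption: the value is a percentage of one GPU's compute.
        limits["volcano.sh/vgpu-cores"] = vgpu_cores
    # Assumption: the hard-isolation directories are hostPath volumes
    # mounted at identical paths inside the container.
    shared_dirs = {"vgpu-dir": "/tmp/gpu", "vgpu-lock": "/tmp/gpulock"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "schedulerName": "volcano",
            "containers": [{
                "name": "main",
                "image": image,
                "resources": {"limits": limits},
                "volumeMounts": [
                    {"name": n, "mountPath": p} for n, p in shared_dirs.items()
                ],
            }],
            "volumes": [
                {"name": n, "hostPath": {"path": p}}
                for n, p in shared_dirs.items()
            ],
        },
    }


if __name__ == "__main__":
    # Half of one GPU's compute; omit vgpu_cores to request the full GPU.
    pod = build_vgpu_pod("vgpu-demo", "example.com/cuda-app:latest", 1, 50)
    print(json.dumps(pod, indent=2))
```

In a Volcano Job, the same container spec would sit inside a task's pod template instead of a standalone pod.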
Got it. Is there a way to check how many cores are allocated in the container? If we configure 50% cores, we want to make sure that only 50% is allocated.
Hi @archlitchi, Creating this issue as a continuation of the conversation we were having on the volcano issue #3384