NVIDIA / DCGM

NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs
Apache License 2.0

Is there a way to disallow sharing of MIG devices? #82

Open starry91 opened 1 year ago

starry91 commented 1 year ago

I see that we can set the GPU compute mode to exclusive process on T4 cards to prevent multiple processes from sharing a GPU. Is there a way to do the same for MIG devices? The exclusive process compute mode doesn't seem to be supported on them:

# sudo nvidia-smi -c 3
Setting compute mode to EXCLUSIVE_PROCESS is not supported on a MIG-enabled device.
Unable to set the compute mode for GPU 00000000:31:00.0: Not Supported
Treating as warning and moving on.
All done.

nikkon-dev commented 1 year ago

@starry91,

I'm sorry, but this may not be the best place to ask about MIG or driver configuration. DCGM has no control over those settings.