NVIDIA / enroot

A simple yet powerful tool to turn traditional container/OS images into unprivileged sandboxes.
Apache License 2.0

Enroot with NVIDIA MPS? #206

Closed by scottcs2 2 months ago

scottcs2 commented 3 months ago

Hi,

I am trying to get Enroot to work with NVIDIA MPS, but I hit the following problem when running NVIDIA NeMo containers with enroot while the MPS control daemon is running.

Steps:

nvidia-cuda-mps-control -d
export NVIDIA_DRIVER_CAPABILITIES=all
enroot start --root -w --env NVIDIA_DRIVER_CAPABILITIES nemo2

This causes the following error to be printed:

ERROR: The NVIDIA Driver is present, but CUDA failed to initialize.  GPU functionality will not be available.
   [[ Mapping of buffer object failed (error 205) ]]

If I kill the MPS daemon, then the container works. How do I use MPS and an enroot container simultaneously?

Thanks, Scott

3XX0 commented 3 months ago

Did you try with a simple CUDA container? What driver version and which GPUs do you see inside the container?

You might want to move this issue to https://github.com/NVIDIA/libnvidia-container since it's specific to GPU support.

scottcs2 commented 2 months ago

The issue is that passing --root to enroot breaks GPU access when NVIDIA MPS is running. Without --root, the container can access the GPUs even with MPS enabled. I will just make sure my users do not specify --root when they launch enroot containers.
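For anyone who wants to enforce this rather than rely on convention, here is a minimal sketch of a wrapper that refuses to launch when --root (or its short form -r) is passed. The function names has_root_flag and enroot_start_no_root are hypothetical and not part of enroot. The check is deliberately conservative: it looks at every argument, so it would also reject a container command that happens to contain a literal -r, and it assumes the flag is always given as a standalone argument.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of enroot): succeeds if any argument
# is exactly --root or -r.
has_root_flag() {
    local arg
    for arg in "$@"; do
        case "$arg" in
            --root|-r) return 0 ;;
        esac
    done
    return 1
}

# Hypothetical wrapper around `enroot start`. It refuses to run when
# --root/-r is present, since that combination failed with MPS in this
# report; otherwise it forwards all arguments unchanged.
enroot_start_no_root() {
    if has_root_flag "$@"; then
        echo "refusing to start: --root was reported to break GPU access while MPS is running" >&2
        return 1
    fi
    enroot start "$@"
}
```

With these functions sourced, the failing command from the report would become `enroot_start_no_root -w --env NVIDIA_DRIVER_CAPABILITIES nemo2`, which is the original invocation minus --root.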