romintomasetti opened this issue 2 years ago
Had a similar issue when I was building a Docker image with ROCm support. A non-root user can't access the GPU resources and has to run commands as `sudo` for GPU access. A user inside the Docker container has to be a member of the `video` and `render` groups to access the GPU without `sudo`.

The `video` group exists by default on Debian systems and has the fixed ID of 44, so there's no need to do anything as long as the group on the host system and inside the container have the same name and ID. The `render` group, on the other hand, is created by the `amdgpu-install` script on the host system and its ID gets randomly assigned; for example, it can be one of the following: 104, 109 or 110.
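A quick way to check which GIDs are in play on a given host (the exact render GID and device names vary per machine):

```bash
# Show the video/render group IDs on the host and the group ownership
# of the GPU device nodes that get passed into the container.
getent group video render
ls -l /dev/kfd /dev/dri/renderD*
```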
Using a Docker `ENTRYPOINT` to dynamically create and assign the `render` group with the host system's `render` group ID:

Create an `entrypoint.sh` script and add it to the image during the build. The script will create the `render` group with the host's group ID and assign the user to the `video` and `render` groups.
```bash
#!/bin/bash
# Create the render group with the GID passed in from the host, add the
# user to the render and video groups, then hand off to the container's command.
sudo groupadd --gid $RENDER_GID render
sudo usermod -aG render $USERNAME
sudo usermod -aG video $USERNAME
exec "$@"
```
Inside the Dockerfile we create a new user and copy the `entrypoint.sh` script to the image. A basic example:

```dockerfile
FROM ubuntu

ENV USERNAME=rocm-user
ARG USER_UID=1000
ARG USER_GID=$USER_UID

# Create the non-root user and allow it passwordless sudo
RUN groupadd --gid $USER_GID $USERNAME \
    && useradd --uid $USER_UID --gid $USER_GID -m $USERNAME \
    && echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME \
    && chmod 0440 /etc/sudoers.d/$USERNAME

COPY entrypoint.sh /tmp
RUN chmod 777 /tmp/entrypoint.sh

USER $USERNAME
ENTRYPOINT ["/tmp/entrypoint.sh"]
CMD ["/bin/bash"]
```
Build the image and, when starting the container, pass the `RENDER_GID` environment variable. Let's assume the Docker image is called `rocm-image`:

```bash
docker build -t rocm-image .

export RENDER_GID=$(getent group render | cut -d: -f3) && docker run -it --device=/dev/kfd --device=/dev/dri -e RENDER_GID --group-add $RENDER_GID rocm-image /bin/bash
```
A VS Code devcontainer with GPU access: just add the following to the `.devcontainer/devcontainer.json` file and you're good to go.

```json
{
    "build": { "dockerfile": "./Dockerfile" },
    "overrideCommand": false,
    "initializeCommand": "echo \"RENDER_GID=$(getent group render | cut -d: -f3)\" > .devcontainer/devcontainer.env",
    "containerEnv": { "HSA_OVERRIDE_GFX_VERSION": "10.3.0" },
    "runArgs": [
        "--env-file=.devcontainer/devcontainer.env",
        "--device=/dev/kfd",
        "--device=/dev/dri"
    ]
}
```
On one of our machines the GID of the `render` group on the host overlapped with the `ssh` group in the image, so the `groupadd` from the init script failed. It's best to use the group ID directly in the following `usermod` to still get an acceptable result in such a scenario.
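A minimal variant of the entrypoint above that follows this suggestion, using the same `RENDER_GID` and `USERNAME` variables (a sketch, not the exact script used here):

```bash
#!/bin/bash
# Try to create the render group; if the GID is already taken (e.g. by
# the image's ssh group), just continue.
sudo groupadd --gid $RENDER_GID render || true
# Add the user by numeric GID rather than by group name, so this works
# even when the GID already belongs to another group.
sudo usermod -aG $RENDER_GID $USERNAME
sudo usermod -aG video $USERNAME
exec "$@"
```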
Hi @romintomasetti @sergejcodes, thank you for both reporting this issue and providing a detailed solution to the problem. This has been addressed in our newer images by defaulting to a root user in order to maintain access to GPU resources. Please let me know if we can close out this issue.
@harkgill-amd in many cases, clusters (Kubernetes) have security policies that prevent containers running as root, this limitation will prevent MANY companies from being able to use AMD GPUs for their AI workloads.
In Kubernetes, this is likely something that your https://github.com/ROCm/k8s-device-plugin can resolve by checking the host's `render` group and adding it to `supplementalGroups` in the Pod's `securityContext`, but it's problematic if the cluster has multiple nodes which don't have the same GID for their `render` group.
However, I feel there must be a cleaner solution, because running Nvidia GPUs has no such problems either on local Docker or with their Kubernetes device plugin. I would check what they are doing, but it might be something like having every device mount owned by a constant GID (e.g. 0, or something the user configures) and ensuring the Docker container runs as a user who has this group.
Here is the related issue on the AMD Device Plugin repo: https://github.com/ROCm/k8s-device-plugin/issues/39
Also, for context, when using Nvidia GPUs you don't mount them with the `--device` parameter, but instead use the `--gpus` parameter, so perhaps this is part of their workaround.

For reference, there is documentation about the `--device` arg of `docker run`; perhaps we need to explicitly allow read/write with the `:rwm` suffix (which is the default), or set something via `--device-cgroup-rule`.
Although, I guess the real question is why AMD ever thought it was a good idea to not have a static GID for the render
group. Perhaps the solution is to deprecate the render
group and always use video
or make a new group.
Hi @harkgill-amd, I don't think this should be closed as the inherent problem with using a non-root user is still prevalent, and there isn't a clean solution for this.
@thesuperzapper and @gigabyte132, thank you for the feedback. We are currently exploring the possibility of using `udev` rules to access GPU resources in place of `render` groups. The steps would be the following:

1. Create `/etc/udev/rules.d/70-amdgpu.rules` with the following content:

   ```
   KERNEL=="kfd", MODE="0666"
   SUBSYSTEM=="drm", KERNEL=="renderD*", MODE="0666"
   ```

2. Reload the `udev` rules with:

   ```bash
   sudo udevadm control --reload-rules && sudo udevadm trigger
   ```
This configuration grants users read and write access to AMD GPU resources. From there, you can pass access to these devices into a container by specifying --device /dev/kfd --device /dev/dri
in your docker run command. To restrict access to a subset of GPUs, please see the following documentation.
I ran this setup with the rocm/rocm-terminal
image and am able to access GPU resources without any render group mapping or root privileges. Could you please give this a try on your end and let me know what you think?
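For anyone wanting to reproduce this, the test run looks roughly like the following (assuming the udev rules above are in place on the host, and that `rocminfo` is available inside `rocm/rocm-terminal`):

```bash
# Pass the GPU device nodes through and check that the image's non-root
# user can see the GPU without any render group mapping.
docker run -it --device /dev/kfd --device /dev/dri rocm/rocm-terminal rocminfo
```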
@harkgill-amd while changing the permissions on the host might work, I will note that this does not seem to be required for Nvidia GPUs.
I imagine that this is because they mount specific device paths; /dev/dri is not the path of the actual device, so Docker's --device mount (which claims to give the container read/write permissions) does not correctly change its permissions.
Because specifying each device is obviously a pain for end users, they added a custom --gpus
feature (also see these docs) which requires users to install the nvidia-container-toolkit.
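For comparison, a typical Nvidia invocation looks roughly like this (the image tag is only illustrative, and the host needs the nvidia-container-toolkit installed):

```bash
# No --device flags: the Nvidia runtime hook injects /dev/nvidia* and the
# driver libraries into the container.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```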
Also, I want to highlight the differences between the Kubernetes device plugins for AMD and Nvidia, as this is where most people are using lots of GPUs, and the permission issues also occur on AMD but not Nvidia.
@harkgill-amd after a lot of testing, it seems like the major container runtimes (including docker and containerd) don't actually change the permissions of devices mounted with --device
like they claim to.
For example, you would expect the following command to mount /dev/dri/card1
with everybody having rw
, but it does not:
```bash
docker run --device /dev/dri/card1 ubuntu ls -la /dev/dri
# OUTPUT:
# total 0
# drwxr-xr-x 2 root root 60 Oct 24 18:52 .
# drwxr-xr-x 6 root root 360 Oct 24 18:52 ..
# crw-rw---- 1 root 110 226, 1 Oct 24 18:52 card1
```
This also seemingly happens on Kubernetes, despite the AMD device plugin requesting that the container be given `rw` on the device.
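One way to confirm this from a running Pod that has been allocated an AMD GPU (the Pod name is a placeholder):

```bash
# Inspect how the GPU device nodes show up inside the Pod: check the
# owning GID and whether "other" actually has rw on them.
kubectl exec -it my-gpu-pod -- ls -la /dev/kfd /dev/dri
```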
@harkgill-amd We need to find a generic solution which allows a non-root container to be run on any server (with a default install of the AMD drivers). This is problematic because there is no standard GID for the `render` group, and the container runtimes don't respect requests to change the permissions of mounted devices.
Note: it seems like Ubuntu has a default udev rule under `/usr/lib/udev/rules.d/50-udev-default.rules` which makes `render` the group owner of `/dev/dri/renderD*` and `video` the group owner of everything else in `/dev/dri/`.
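To see exactly what that rule sets on a particular host (the contents differ between releases), something like:

```bash
# Print the drm/render-related lines of Ubuntu's default udev rules.
grep -nE 'drm|render' /usr/lib/udev/rules.d/50-udev-default.rules
```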
The options I can see are:

1. Give everyone read/write on `/dev/dri/renderD*` on the host (like you proposed above), so that the `/dev/dri/renderD*` devices have `0666` permissions.
2. Create a new standard GID to add as an owner of `/dev/dri/renderD*` (or use `video=44`).
3. Do what Nvidia does: don't mount anything under `/dev/dri/` in the container, and instead mount something like the `/dev/nvidia0` devices, which have `crw-rw-rw-` and seemingly are how CUDA apps interact with the GPUs.
4. Mount the devices as bind volumes rather than as actual devices; note that in Kubernetes a Pod only requests an `amd.com/gpu: 1` limit, not volumes.
5. Automatically add the detected GID of the `render` group to the user as the container starts, because we don't know what the GID is before we start running on a specific server (see the sketch after this list); doing this by editing `/etc/group` would obviously allow root escalation.
6. Figure out why all the container runtimes are not respecting the request to change file permissions on device mounts.
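A rough sketch of option 5, assuming the render node shows up as /dev/dri/renderD128 inside the container (the node name varies) and that the image allows passwordless sudo for the non-root user:

```bash
#!/bin/bash
# Derive the render GID from the device node itself at container start,
# so nothing needs to be known about the host in advance.
RENDER_NODE=/dev/dri/renderD128   # assumption: first render node
RENDER_GID="$(stat -c '%g' "$RENDER_NODE")"
# Requires sudo inside the container, which is exactly the root-escalation
# caveat mentioned in option 5.
getent group "$RENDER_GID" > /dev/null || sudo groupadd --gid "$RENDER_GID" render
RENDER_GROUP="$(getent group "$RENDER_GID" | cut -d: -f1)"
sudo usermod -aG "$RENDER_GROUP" "$(whoami)"
# Re-exec through sudo so the freshly added supplementary group is applied
# to the process we hand off to.
exec sudo -u "$(whoami)" -- "$@"
```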
Initial issue
As stated in https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation_new.html#setting-permissions-for-groups, for Ubuntu 20 and above the user needs to be part of the `render` group. Therefore, we need to create the `render` group in the Docker image. The following would work:
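Something along these lines, as a sketch (GID 109 is just the example value used below and is not guaranteed to match the host; `$USERNAME` stands for the image's non-root user):

```bash
# As a RUN step in the Dockerfile: create a render group with a guessed
# GID and add the non-root user to it and to video.
groupadd --gid 109 render
usermod -aG render,video $USERNAME
```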
We might also want to update the documentation, because the `docker run` command should contain `--group-add render` for Ubuntu 20 and above.

Update - 10th June 2022
I made the following experiments. The user I'm logged in as on the host is part of the `render` group. My user ID is 1002.

- Running as root works, because it runs with user ID 0 on the host.
- Running as my non-root user will not work: `Unable to open /dev/kfd read-write: Permission denied`.
- Adding `--group-add render` will not work, because inside of `rocm/dev-ubuntu-20.04:5.1` there is no render group.
- Adding the group by its numeric ID will work again.
Therefore, I see 2 ways of fixing this:

1. Add a `render` group in the Docker image with ID 109 by default. This would be a "build time" fix and would break as soon as the host render group ID is not 109. The group ID could be passed as a build argument (`ARG`), but the image would not be portable.
2. Pass the host's `render` group ID at run time with `--group-add $(getent group render | cut -d':' -f 3)`.
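For completeness, option 2 looks roughly like this with the image mentioned above (device flags as used elsewhere in this issue):

```bash
docker run -it \
    --device=/dev/kfd --device=/dev/dri \
    --group-add $(getent group render | cut -d':' -f 3) \
    rocm/dev-ubuntu-20.04:5.1
```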