apptainer / singularity

Singularity has been renamed to Apptainer as part of us moving the project to the Linux Foundation. This repo has been persisted as a snapshot right before the changes.
https://github.com/apptainer/apptainer

Unable to use GPUs in a writable container #5903

Closed Amir-Arsalan closed 2 years ago

Amir-Arsalan commented 3 years ago

I recently built a Singularity container for a Python package named mujoco-py with GPU-enabled rendering (see here). The problem is when this package is imported in Python, it needs to create some new files in its corresponding egg directory /usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/. Unfortunately this is not possible as the file system becomes read-only once the image is built. I also did python3.7 -c 'import mujoco_py' when the container is being built but importing it again requires building the same files, and I got the same errors.

I thought this time I can build a writable container and the problem would be resolved. I built my container with Singularity 3.7.1 but when I did singularity shell --nv -w img.sif I got the following warnings and doing nvidia-smi throws an error. Because of this, doing import mujoco_py in python does not use GPU for rendering because the package does not recognize a GPU in the container:

WARNING: nv files may not be bound with --writable
WARNING: Skipping mount /etc/localtime [binds]: /etc/localtime doesn't exist in container
WARNING: Skipping mount /usr/bin/nvidia-smi [files]: /usr/bin/nvidia-smi doesn't exist in container
WARNING: Skipping mount /usr/bin/nvidia-debugdump [files]: /usr/bin/nvidia-debugdump doesn't exist in container
WARNING: Skipping mount /usr/bin/nvidia-persistenced [files]: /usr/bin/nvidia-persistenced doesn't exist in container
WARNING: Skipping mount /usr/bin/nvidia-cuda-mps-control [files]: /usr/bin/nvidia-cuda-mps-control doesn't exist in container
WARNING: Skipping mount /usr/bin/nvidia-cuda-mps-server [files]: /usr/bin/nvidia-cuda-mps-server doesn't exist in container
WARNING: Skipping mount /var/run/nvidia-persistenced/socket [files]: /var/run/nvidia-persistenced/socket doesn't exist in container
Singularity> nvidia-smi
bash: nvidia-smi: command not found

Note that to build these containers I pulled the NVIDIA Ubuntu 18.04 Docker image with CUDA 10.2:

Bootstrap: docker
From: nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04

My first question is: would it be possible for the Singularity developers to let the admin make a directory (e.g. /usr) writable, so that software that has to create new files at run time can be used in a container? Second, is there a way to fix these warnings for the NVIDIA files in a writable container?

serheang commented 3 years ago

Hi, you might want to try a persistent overlay: https://sylabs.io/guides/3.7/user-guide/persistent_overlays.html

Thank you. Warmest regards, Ser Heang TAN

Amir-Arsalan commented 3 years ago

@serheang Thank you. I think --overlay is potentially what I am looking for. However, I am a bit confused how I can use --overlay to write into /usr without using sudo. Note that the created files by the package do not need to be permanently stored. The documentation seems to suggest it is only possible to write to a directory in / if one uses sudo but please correct me if I'm wrong.

serheang commented 3 years ago

Hi, if you look further down, you will find this: "To manage permissions in the overlay, so the container is writable by unprivileged users you can create a directory structure on your host, set permissions on it as needed, and include it in the overlay with the -d option to mkfs.ext3:" This way, you can use it as a non-privileged user.

Thank you. Warmest regards, Ser Heang TAN

Amir-Arsalan commented 3 years ago

@serheang Sure, I saw this, but it doesn't seem to apply to a directory that Unix has by default, such as /usr. Am I wrong about this?

serheang commented 3 years ago

Hi Amir, I am not sure I understand you correctly. If you want to write to "/usr" in the container, you will need to be running the container as a privileged user. So you are right that these directories still belong to root:

drwxr-xr-x  6 root  root  4096 Mar 26 03:45 var
drwxr-xr-x  5 root  root  4096 Apr  2 22:34 ..
drwxr-xr-x  8 root  root  4096 Apr  2 22:56 usr
drwxr-xr-x  2 root  root  4096 Apr  2 23:39 bin
drwxr-xr-x  2 root  root  4096 Apr  2 23:43 lib64
drwxr-xr-x  5 root  root  4096 Apr  2 23:43 lib
drwxr-xr-x 40 root  root  4096 Apr  3 21:35 etc

However, if you try the overlay, you will notice that you can write to "/" in the container because that belongs to the user who created the overlay, as shown here (belongs to me):

drwxr-xr-x  8 serheang serheang 4096 Apr  3 21:49 .

By the way, if you mount the overlay image somewhere as root, you will be able to change the ownership of those root-only directories to your user. For example, I changed "/usr" and "/usr/bin" to be owned by me:

drwxr-xr-x 1 serheang root  4096 Apr  3 21:51 usr
drwxr-xr-x 1 serheang root 12288 Apr  3 21:55 usr/bin/

And then, I am able to create files and directories within those directories when I am running in the container as myself (without sudo).
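
A minimal sketch of the ownership change described above. It assumes the overlay image has an upper/ directory at its root (the layout the overlay documentation uses) and that root has loop-mounted the image at a hypothetical mount point such as /mnt/overlay. The function name own_overlay_dirs, the image name overlay.img and the mount point are illustrative and not part of Singularity. Directories created under upper/ should be merged over the same paths from the SIF, so their ownership is what you would expect to see inside the container.

```shell
#!/bin/sh
# own_overlay_dirs MOUNTPOINT OWNER DIR...
# Creates each DIR under MOUNTPOINT/upper and hands it to OWNER.
# MOUNTPOINT is wherever the overlay image is mounted (hypothetical path).
own_overlay_dirs() {
    mnt=$1
    owner=$2
    shift 2
    for dir in "$@"; do
        # Strip the leading slash so /usr/bin becomes upper/usr/bin.
        target="$mnt/upper/${dir#/}"
        mkdir -p "$target" || return 1
        chown "$owner" "$target" || return 1
    done
}

# Intended use, as root on a machine where you have sudo (assumed paths):
#   mount -o loop overlay.img /mnt/overlay
#   own_overlay_dirs /mnt/overlay serheang /usr /usr/bin
#   umount /mnt/overlay
```

Changing the owner only helps that one user. If several cluster users share the image, group ownership or mode bits would have to be adjusted instead, which this sketch does not attempt.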

Do try it and see whether this meets your requirements.

Thank you. Warmest regards, Ser Heang TAN


Amir-Arsalan commented 3 years ago

@serheang The problem is I need to use this container on a computing cluster where I won't be able to use sudo anymore. Would I still be able to use overlay to write into /usr without using sudo?

serheang commented 3 years ago

Yes, you should be able to, provided you create the overlay somewhere you have root permission and change the permissions on those directories accordingly. Try it and see how it goes.

Thank you. Warmest regards, Ser Heang TAN

Amir-Arsalan commented 3 years ago

@serheang Yes, I will create the container somewhere with root permission. However, the container will be used by other users with other usernames on the cluster. Do you mean I can use chmod to change the permissions of /usr, or do you mean I need to change the ownership? I'm afraid changing the ownership will still not allow me to use the container on our cluster. Also, I'm looking at the overlay documentation and things seem a bit complicated to me. Would you be able to give me an example of how I should create a container with an overlay, change the permissions of /usr, and write something in /usr without sudo?

Amir-Arsalan commented 3 years ago

Also, assuming that overlay can make it possible for multiple users to write into /usr, I'm thinking overlay gives the illusion of making changes into /usr and the changes made by each user will not be saved into the container permanently. Is that correct?

Update: based on the description given in the overlay documentation, it seems that each user's changes will be visible only to that user and will not be saved in the container.

Amir-Arsalan commented 3 years ago

@serheang I've been playing around with overlay for the past hour but I'm still unable to wrap my head around it. Below you can see what I did, which takes me much closer to what I wanted but does not seem to resolve my issue properly:

If I don't use sudo I get an error saying FATAL: container creation failed: while setting overlay session layout: only root user can use sandbox as overlay. So this solution will not work on the compute cluster because I won't have sudo access there. Plus, some other users will be using this container in parallel. I know you said it is possible to use overlay without sudo, but I'm not sure how I can do it at the moment. The documentation is not very clear to me on this point. I'd appreciate it if you could help me figure out how to resolve this issue.

serheang commented 3 years ago

Yes, this is correct. Without the overlay, / will go back to whatever is in the SIF.

As for your other question, I will try it out and let you know.

Thank you. Warmest regards, Ser Heang TAN

serheang commented 3 years ago

Hi, did you find the answer to your question? Here are the steps I used to create the overlay:

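cd /tmp
# upper/ and work/ are the directories the overlay expects inside the image
mkdir -p overlay/upper overlay/work
# allocate a 512 MiB image file filled with zeros
dd if=/dev/zero of=testoverlay.img bs=1M count=512
# format it as ext3, copying the overlay/ directory tree into the new filesystem
mkfs.ext3 -d overlay testoverlay.img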

Now, I am going to create a CentOS 7 SIF, naming the output file explicitly so that it matches the commands below:

singularity pull centos-7.8.2003.sif docker://centos:7.8.2003

Then I will mount and update the stuff I want onto testoverlay.img:

sudo singularity run -o testoverlay.img centos-7.8.2003.sif

Since vim is not installed in the original docker://centos:7.8.2003, I will install it in the overlay:

yum install -y vim

Now exit from the singularity prompt, and run again without sudo:

singularity run -o testoverlay.img centos-7.8.2003.sif

You can then check that two vim packages are now installed: vim-minimal and vim-enhanced.

Now try running the command without -o testoverlay.img and you will see that you only have vim-minimal.

In summary: the overlay needs to be created as an ext3 filesystem image to make it portable, and you will need to make all the changes to the overlay image on a system where you have sudo or root access before moving it to the HPC system where you intend to run it.

Hope this helps you somehow.

Thank you. Warmest regards, Ser Heang TAN


Amir-Arsalan commented 3 years ago

@serheang Thank you so much for taking the time to write this. This is very helpful. However, I think ideally people would like to do everything on the HPC side, and not to have to keep moving the container (or the overlay image) back and forth between their machines and the cluster. I think it'd be really awesome if the Singularity developers can come up with a solution for this @dtrudg @cclerget

serheang commented 3 years ago

Hi, sorry that the overlay is not the solution you need. I hope others can suggest another solution or idea for your issue.

Thank you. Warmest regards, Ser Heang TAN

Amir-Arsalan commented 3 years ago

@serheang The current overlay machinery gives a lot of flexibility to users actually. I am just thinking it would be awesome if someone can create a container with sudo and then transfer it to an HPC and allow users to customize it using their own overlay image. This would be an awesome feature.

carterpeel commented 3 years ago

Hello,

This is a templated response that is being sent out to all open issues. We are working hard on 'rebuilding' the Singularity community, and a major task on the agenda is finding out what issues are still outstanding.

Please consider the following:

  1. Is this issue a duplicate, or has it been fixed/implemented since being added?
  2. Is the issue still relevant to the current state of Singularity's functionality?
  3. Would you like to continue discussing this issue or feature request?

Thanks, Carter

Amir-Arsalan commented 3 years ago

@carterpeel

  1. To my knowledge this is not a duplicate issue
  2. Yes
  3. I would say this is more like a feature request

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had activity in over 60 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

Amir-Arsalan commented 3 years ago

@carterpeel Just wanted to send this message as a reminder for this issue since I haven't got a response from you after your last post.

avolkov1 commented 3 years ago

This would be a nice feature. I would like to be able to run like this:

singularity shell --nv --fakeroot --writable <some_gpu_container.sif>

Running the above, I can install into /usr/, but I cannot use GPUs because the NVIDIA files are not mapped in:

nv files may not be bound with --writable

For development, currently I resort to building (with --fakeroot) new containers by bootstrapping from local images and installing what I need. Then I delete these development containers when I'm done.
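
The development workflow described above might look roughly like the following sketch. Bootstrap: localimage and singularity build --fakeroot are documented Singularity features. The file names dev.def and dev.sif, and the package installed in %post, are placeholders chosen for illustration, and apt-get assumes a Debian or Ubuntu based image.

```shell
#!/bin/sh
# Write a definition file that starts from an existing local image.
# some_gpu_container.sif stands for the image you already have.
cat > dev.def <<'EOF'
Bootstrap: localimage
From: some_gpu_container.sif

%post
    # Placeholder: install whatever the development session needs.
    # apt-get is an assumption about the base image's distribution.
    apt-get update && apt-get install -y vim
EOF

# Build without sudo (requires fakeroot to be enabled for your user),
# then check that the GPU is visible in the read-only result.
if command -v singularity >/dev/null 2>&1 && [ -f some_gpu_container.sif ]; then
    singularity build --fakeroot dev.sif dev.def
    singularity exec --nv dev.sif nvidia-smi
else
    echo "singularity or the base image is not available; wrote dev.def only"
fi
```

Because the resulting dev.sif is run without --writable, the --nv binds are not skipped. The cost is a rebuild each time the installed software changes.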

DrDaveD commented 2 years ago

According to the documentation you should be able to work around this problem by creating bind mount points inside the container for all the bind mounts created by the --nv option.
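
A rough sketch of that workaround, assuming the only missing mount points are the NVIDIA ones named in the warnings from the original post. On another host the list may differ, since it depends on the installed driver and on Singularity's nvliblist.conf, so treat these paths as an example. The function name make_nv_mountpoints is made up for this illustration.

```shell
#!/bin/sh
# make_nv_mountpoints ROOT
# Creates empty placeholder files under ROOT so that the bind mounts of
# the NVIDIA binaries and socket have a target inside the container.
# Pass "" as ROOT when running inside a writable container session.
make_nv_mountpoints() {
    root=$1
    for path in \
        /usr/bin/nvidia-smi \
        /usr/bin/nvidia-debugdump \
        /usr/bin/nvidia-persistenced \
        /usr/bin/nvidia-cuda-mps-control \
        /usr/bin/nvidia-cuda-mps-server \
        /var/run/nvidia-persistenced/socket
    do
        mkdir -p "$root$(dirname "$path")" || return 1
        # A file bind mount needs an existing file as its target.
        [ -e "$root$path" ] || : > "$root$path" || return 1
    done
}

# Example (assumed): inside a writable session where /usr/bin is writable,
#   make_nv_mountpoints ""
```

Running the function inside the container, rather than against a sandbox directory from the host, avoids following absolute symlinks such as /var/run pointing to /run on the host side.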

kmuriki commented 2 years ago

The Singularity repo is now retired, as the code base has moved to Apptainer. We are closing all the old issues under the old Singularity repo. For further assistance please open a new issue under the new Apptainer repo. Thanks for your support.