containers / bubblewrap

Low-level unprivileged sandboxing tool used by Flatpak and similar projects

"pivot_root: Invalid argument" when running on a SLURM cluster node from NFS #594

Open dmikushin opened 9 months ago

dmikushin commented 9 months ago

When running bwrap within a job of a SLURM cluster node, I get the following error:

$ srun user  
bwrap: pivot_root: Invalid argument

It's highly desirable to let bwrap pass through srun, because that way bwrap would also work for cluster jobs!

smcv commented 9 months ago

a SLURM cluster node

Sorry, I have only the most vague possible idea of what this is. If the only thing the kernel is willing to tell us is "Invalid argument" then it is unlikely to be solvable without someone with root on a suitable system digging into what the exact situation is.

bubblewrap's use of pivot_root is known not to be possible when inside a chroot environment (#135). If SLURM cluster nodes involve running user-supplied code in a chroot or container, there is probably nothing that bubblewrap can do to solve this: if the cluster node does not allow bubblewrap to make the syscalls that it requires to do its job, then we can't do impossible things.

It's highly desirable

I'm sure it is, but that desire doesn't make it possible for bubblewrap to do things that the kernel won't allow.

smcv commented 9 months ago

As a general thing, if there is key information about your system that is unusual and likely to be part of the root cause for an issue, please make sure to mention it in the issue title. Unfortunately it's quite common for the only information available to be pivot_root: Invalid argument, and we don't want people who get that error message for a completely different reason to be jumping onto this issue, because mixing up multiple root causes on one issue report makes it confusing and time-consuming to disentangle.

dmikushin commented 9 months ago

The failing if (pivot_root (base_path, "oldroot")) with base_path="/tmp" gives me a feeling that this setup is optional. So is there a possibility to enable/disable it, along with the related surrounding code?

dmikushin commented 9 months ago

The failure happens even with the minimum possible setup: (gdb) r --bind $HOME/chroot / bash

smcv commented 9 months ago

gives me a feeling that this setup is optional

No, the use of pivot_root is a necessary part of how bubblewrap does what it does: you can't change the root directory without changing the root directory. /tmp was just a convenient path that (we assume) is guaranteed to exist, because we need somewhere to put a temporary mount point during a transitional state while we reorganise the mount point hierarchy.

Old versions of bubblewrap used chroot(2), but that prevented recursive use of bubblewrap (bubblewrap inside bubblewrap), and also led to tools outside the container seeing misleading paths starting with /newroot when inspecting processes inside the container.

dmikushin commented 9 months ago

Old versions of bubblewrap used chroot(2)

Oh, good to know. That matches my other result with a fakeroot/fakechroot pair: those don't need pivot_root, and they worked well in the same SLURM environment, but they are not as nice and modern as bubblewrap.

dmikushin commented 9 months ago

I've figured out that bubblewrap works when my root folder is on the local disk. But when I go to a compute node of the cluster, that same disk is mounted via NFS. This seems to be the actual reason why bubblewrap does not work: it does not like NFS. Is pivot_root() unavailable for NFS, or for network file systems in general?
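To confirm which filesystem a given path actually lives on (for example, whether the directory passed to --bind is on NFS on the compute node), one option is to look it up in /proc/self/mountinfo. The following is only a minimal sketch in Python, not anything bubblewrap itself does. The function names `parse_mountinfo` and `fs_type_for` are made up for illustration, and the sketch assumes absolute, already-resolved paths and mount points without escaped characters such as `\040`.

```python
def parse_mountinfo(text):
    """Return (mount_point, fs_type, source) tuples from mountinfo-formatted text."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if "-" not in fields:
            continue  # not a well-formed mountinfo line
        # Fields before the "-" separator: mount ID, parent ID, major:minor,
        # root, mount point, mount options, then zero or more optional fields.
        sep = fields.index("-")
        if sep < 6 or len(fields) < sep + 3:
            continue
        mounts.append((fields[4], fields[sep + 1], fields[sep + 2]))
    return mounts


def fs_type_for(path, mountinfo_text):
    """Return the (mount_point, fs_type, source) entry that covers `path`.

    Picks the longest mount point that is a prefix of `path`. When several
    entries share a mount point, the last one listed wins (assumption: later
    entries are stacked on top of earlier ones).
    """
    best = None
    for mount_point, fs_type, source in parse_mountinfo(mountinfo_text):
        covers = (
            mount_point == "/"
            or path == mount_point
            or path.startswith(mount_point.rstrip("/") + "/")
        )
        if covers and (best is None or len(mount_point) >= len(best[0])):
            best = (mount_point, fs_type, source)
    return best
```

On a real system, the text would come from reading /proc/self/mountinfo inside the SLURM job, and the path from something like os.path.realpath on the chroot directory. An `nfs` or `nfs4` filesystem type in the result would support the NFS theory, but it would not by itself explain the EINVAL.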

smcv commented 9 months ago

pivot_root has some poorly-documented restrictions, and NFS is quite an unusual filesystem, so it probably falls foul of one of those restrictions?

dmikushin commented 9 months ago

Yes, but Linux can even boot with an NFS root filesystem; this has been known to work for a very long time. This problem is not well known: I found only one related issue.

rusty-snake commented 9 months ago

FWIW, the documented cases of EINVAL

  • EINVAL new_root is not a mount point.
  • EINVAL put_old is not at or underneath new_root.
  • EINVAL The current root directory is not a mount point (because of an earlier chroot(2)).
  • EINVAL The current root is on the rootfs (initial ramfs) mount; see NOTES.
  • EINVAL Either the mount point at new_root, or the parent mount of that mount point, has propagation type MS_SHARED.
  • EINVAL put_old is a mount point and has the propagation type MS_SHARED.
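Several of the documented conditions above can be checked from userspace by reading /proc/self/mountinfo before calling pivot_root. The following is a rough Python sketch under stated assumptions, not part of bubblewrap: the names `parse_mounts` and `pivot_root_einval_candidates` are hypothetical, paths must be absolute, and the "root is not a mount point" check is only a heuristic (it assumes that inside a plain chroot no entry with mount point `/` is listed). It does not check whether put_old is itself a shared mount point.

```python
import os.path


def parse_mounts(text):
    """Map mount ID -> basic facts parsed from mountinfo-formatted text."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if "-" not in fields:
            continue
        sep = fields.index("-")
        if sep < 6 or len(fields) < sep + 2:
            continue
        mounts[int(fields[0])] = {
            "parent": int(fields[1]),
            "mount_point": fields[4],
            # Optional fields sit between the mount options and the "-".
            "shared": any(f.startswith("shared:") for f in fields[6:sep]),
            "fs_type": fields[sep + 1],
        }
    return mounts


def pivot_root_einval_candidates(new_root, put_old, mountinfo_text):
    """List documented EINVAL conditions that appear to apply (best effort)."""
    mounts = parse_mounts(mountinfo_text)
    new_root = os.path.normpath(new_root)
    put_old = os.path.normpath(put_old)
    reasons = []

    at_new_root = [m for m in mounts.values() if m["mount_point"] == new_root]
    if not at_new_root:
        reasons.append("new_root is not a mount point")
    else:
        top = at_new_root[-1]  # assumption: last listed entry is the visible mount
        if top["shared"]:
            reasons.append("the mount at new_root is MS_SHARED")
        parent = mounts.get(top["parent"])
        if parent is not None and parent["shared"]:
            reasons.append("the parent mount of new_root is MS_SHARED")

    if os.path.commonpath([new_root, put_old]) != new_root:
        reasons.append("put_old is not at or underneath new_root")

    roots = [m for m in mounts.values() if m["mount_point"] == "/"]
    if not roots:
        reasons.append("the current root directory is not a mount point")
    elif roots[-1]["fs_type"] == "rootfs":
        reasons.append("the current root is on rootfs")
    return reasons
```

An empty result only means that none of these documented cases was detected from mountinfo. It says nothing about undocumented restrictions, which may well be what is happening in the NFS case.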

dmikushin commented 9 months ago

Anyway, I'm going to offer #595 as a workaround to allow smoother behavior on systems with failing pivot_root(). I think bubblewrap is too good to miss it completely due to this issue :)

dmikushin commented 9 months ago

FWIW, the documented cases of EINVAL

Yes, I've checked: none of these cases seems to apply to my setup. To learn more, I would need to do kernel debugging, which I can't do easily, because I would have to replicate the entire system locally.