containers / bubblewrap

Low-level unprivileged sandboxing tool used by Flatpak and similar projects
Other
3.97k stars 237 forks source link

Bind mounting over symlinks attempts to create their targets and bind mount over them #390

Open robryk opened 4 years ago

robryk commented 4 years ago

What I observe:

$ touch a_file

$ ln -s a_file rel_symlink
$ bwrap --bind / / --bind a_file $PWD/rel_symlink /usr/bin/env ls -l $PWD/rel_symlink
lrwxrwxrwx 1 robryk users 6 Sep 14 23:55 /home/robryk/zero-k/rel_symlink -> a_file

$ ln -s $PWD/a_file abs_symlink
$ bwrap --bind / / --bind a_file $PWD/abs_symlink /usr/bin/env ls -l $PWD/abs_symlink
bwrap: Can't create file at /home/robryk/zero-k/abs_symlink: No such file or directory

$ ls -l nonexistent
ls: cannot access 'nonexistent': No such file or directory
$ ln -s nonexistent broken_symlink
$ bwrap --bind / / --bind a_file $PWD/broken_symlink /usr/bin/env ls -l $PWD/broken_symlink
lrwxrwxrwx 1 robryk users 11 Sep 14 23:55 /home/robryk/zero-k/broken_symlink -> nonexistent
$ ls -l nonexistent 
-rw-rw-rw- 1 robryk users 0 Sep 15 00:04 nonexistent

$ ln -s ../../../nonexistent another_broken_symlink
$ bwrap --bind / / --bind a_file $PWD/another_broken_symlink /usr/bin/env ls -l $PWD/another_broken_symlink
bwrap: Can't create file at /home/robryk/zero-k/another_broken_symlink: Permission denied

It seems that:

  1. Mounting over an unbroken symlink caused bwrap to mount over the symlink's target.
  2. Mounting over an unbroken, absolute symlink caused bwrap to fail.
  3. Mounting over a broken symlink caused bwrap to create the symlink target and mount over it.
  4. Mounting over a broken symlink that points into a directory that is not writeable caused bwrap to fail.

What I expect:

Mounting over any kind of symlink to cause the resulting mount namespace to have the symlink replaced with the source of the mount. This did not happen in any of the four cases.

If doing that is impossible (quick skim of man mount(2) didn't yield an equivalent of O_NOFOLLOW, even though umount2 does have the UMOUNT_NOFOLLOW flag), I expect bwrap to exit with an error. I also expect it not to mount anything over the symlink's target. This failed to happen in cases 1 and 3.

It would be nice if the "File not found" error was more descriptive. It's not obvious which file is "missing" (my first intuition was that the problem was with one of the directories in the target path).

smcv commented 3 years ago

Bind mounting over symlinks attempts to create their targets and bind mount over them

Sorry, this is how mount(2) works. We cannot mount anything on a symlink: mount(2) follows symlinks, whether we want it to or not.

robryk commented 3 years ago

It's still confusing to actually do that then, instead of failing. I would wager that this is ~never the desired behaviour. I would prefer if (see the last paragraphs of the original report):

Apteryks commented 2 years ago

Bind mounting over symlinks attempts to create their targets and bind mount over them

Sorry, this is how mount(2) works. We cannot mount anything on a symlink: mount(2) follows symlinks, whether we want it to or not.

My mount (2) manpage says:

MS_NOSYMFOLLOW (since Linux 5.10)

Do not follow symbolic links when resolving paths. Symbolic links can still be created, and readlink(1), readlink(2), realpath(1), and realpath(3) all still work properly.

Is this the feature that could make it work?

smcv commented 2 years ago

Is this the feature that could make it work?

It depends exactly how it works. Perhaps you could try it and see what happens?

Apteryks commented 2 years ago

Hm. I can't see how the behavior changes, adding this option or leaving it out. I guess it won't help here.

robryk commented 1 year ago

Note that MS_NOSYMFOLLOW seems to be an option that affects the created mount (i.e. symlinks that are in that filesystem won't be followed) and not how mount() resolves its target path.

igo95862 commented 11 months ago

Can the new mounting API be leveraged for this?

https://brauner.io/2023/02/28/mounting-into-mount-namespaces.html

It looks like the new syscall open_tree accepts the AT_SYMLINK_NOFOLLOW:

https://github.com/torvalds/linux/blob/968f35f4ab1c0966ceb39af3c89f2e24afedf878/tools/testing/selftests/mount_setattr/mount_setattr_test.c#L1138

The move_mount also has the symlink following flag:

https://github.com/torvalds/linux/blob/968f35f4ab1c0966ceb39af3c89f2e24afedf878/include/uapi/linux/mount.h#L73

smcv commented 11 months ago

Can the new mounting API be leveraged for this?

I don't know, can it? Or would the syscall just fail with ELOOP if the target is a symlink? (Those are the two reasonable interpretations of AT_SYMLINK_NOFOLLOW.)

If the answer is that a new mounting API can mount over the top of an existing symlink without dereferencing the symlink, I'd be happy to review a pull request. It would still need to be able to fall back to the old mounting API to support older kernels.

igo95862 commented 11 months ago

I'd be happy to review a pull request. It would still need to be able to fall back to the old mounting API to support older kernels.

I will try to experiment with the new API. One issue is that it is barely documented. I guess I will read to the kernel source code to figure it out.

igo95862 commented 11 months ago

Actually the mount from util-linux can already use new mount API and has an option to switch between them.

https://github.com/util-linux/util-linux/commit/fd6b4d94ff013dc7ed680d4e864610da5b9751f1

So lets put it to test:

1). Create a non-existent symlink:

>>> ln -s /nonexistent ./test

2). Unshare the mount namespace:

>>> unshare --user --map-root-user --mount

3). Check that it is still symlink.

>>> /usr/bin/ls -l ./
total 0
lrwxrwxrwx 1 igo95862 igo95862 12 Dec  5 23:10 test -> /nonexistent

4). Do a mount over a symlink forcing new api. Mount some existing file.

env LIBMOUNT_FORCE_MOUNT2=never mount --bind /etc/resolv.conf ./test

5). Check what happens.

>>> /usr/bin/ls -l ./
total 4
-rw-r--r-- 1 nobody nobody 73 Dec  3 17:31 test
>>> cat ./test
nameserver 127.0.0.1
nameserver ::1

So it does become a regular file that we mounted on top.

Lets try to use the old mout API:

>>> env LIBMOUNT_FORCE_MOUNT2=always mount --bind /etc/resolv.conf ./test 
mount: ./test: mount point is a symbolic link to nowhere.
       dmesg(1) may have more information after failed mount system call.

So the new API definitely fixes this issue.

Some strace logs.

Using new API:

open_tree(AT_FDCWD, "/etc/resolv.conf", OPEN_TREE_CLONE|OPEN_TREE_CLOEXEC) = 3
move_mount(3, "", AT_FDCWD, "./test", MOVE_MOUNT_F_EMPTY_PATH) = 0

Using old API:

mount("/etc/resolv.conf", "./test", 0x55e0e3605670, MS_BIND, NULL) = -1 ENOENT (No such file or directory)
rusty-snake commented 11 months ago

One issue is that it is barely documented.

You can grab the manpage drafts from the kernel mailing list to get a better starting point.

igo95862 commented 11 months ago

You can grab the manpage drafts from the kernel mailing list to get a better starting point.

Could you please give a link. I am not good with mailing lists.

rusty-snake commented 11 months ago

https://lore.kernel.org/all/15449.1531263162@warthog.procyon.org.uk/

https://duckduckgo.com/?q=site%3Akernel.org+%22Add+manpage+for%22+move_mount&ia=web

igo95862 commented 11 months ago

https://lore.kernel.org/all/15449.1531263162@warthog.procyon.org.uk/

https://duckduckgo.com/?q=site%3Akernel.org+%22Add+manpage+for%22+move_mount&ia=web

Thank you.

Weird that those man pages patches were around since 2018 but never merged.

igo95862 commented 11 months ago

I've rendered the man pages to human readable format:

open_tree

move_mount

Looking at man pages it seems the new API has a few more advantages. For example, it has option of not triggering auto-mount.

If you look at this part of man page looks like new mount API can mount over mount point that does not exist.

rusty-snake commented 10 months ago

FWIW: https://github.com/sunfishcode/linux-mount-api-documentation (@igo95862)


Actually you can do this with the old mount api as well using /proc/self/fd.

diff --git a/bind-mount.c b/bind-mount.c
index 57b4236..d034c51 100644
--- a/bind-mount.c
+++ b/bind-mount.c
@@ -395,7 +395,9 @@ bind_mount (int           proc_fd,

   if (src)
     {
-      if (mount (src, dest, NULL, MS_SILENT | MS_BIND | (recursive ? MS_REC : 0), NULL) != 0)
+      dest_fd = openat(AT_FDCWD, dest, O_CLOEXEC | O_PATH | O_NOFOLLOW);
+      dest_proc = get_oldroot_path (xasprintf ("/proc/self/fd/%d", dest_fd));
+      if (mount (src, dest_proc, NULL, MS_SILENT | MS_BIND | (recursive ? MS_REC : 0), NULL) != 0)
         return BIND_MOUNT_ERROR_MOUNT;
     }
$ ln -s ENOENT hello1.txt
$ echo hello > hello2.txt
$ stat -c '%F' hello1.txt
symbolic link
$ cat hello1.txt
cat: hello1.txt: No such file or directory
$ ./_builddir/bwrap --dev-bind / / --bind $PWD/hello2.txt $PWD/hello1.txt stat -c '%F' hello1.txt
regular file
$ ./_builddir/bwrap --dev-bind / / --bind $PWD/hello2.txt $PWD/hello1.txt cat hello1.txt
hello

[!CAUTION] The PoC diff from above does not

  • handle errors (return code of openat, ...).
  • cares about cleanup (free, close).
  • cares about preserving other flags done below via mountinfo.
  • stop bwrap from creating the target file.