Open Michcioperz opened 5 years ago
pivot_root(2):
[...] the implementation of pivot_root() may change in the future. At the time of writing, pivot_root() changes root and current working directory of each process or thread to new_root if they point to the old root directory. This is necessary in order to prevent kernel threads from keeping the old root directory busy with their root and current working directory, even if they never access the filesystem in any way. In the future, there may be a mechanism for kernel threads to explicitly relinquish any access to the filesystem, such that this fairly intrusive mechanism can be removed from pivot_root().
This means that using pivot_root
alone is not enough, as it may break in the future.
chroot
alone is not enough, because it leaves the old root mounted in our mount namespace, which makes it possible to escape the chroot eg. with another call to chroot
or pivot_root
. This requires CAP_SYS_ADMIN
, which the sandboxed does not have, but for the sake of defense-in-depth, we want to stay secure even if the sandboxed process somehow acquires CAP_SYS_ADMIN
.
TODO: verify the above claim experimentally
The manpage for pivot_root(2)
says that we should chroot
before pivot_root
instead of after. We should verify if this works as good or better than what we do right now. We should also check how other tools use pivot_root
, considering that future changes to the kernel will probably take into account what popular existing tools do.
Apologies for my earlier laconic explanation.
What I meant was that pivot_root(2) man page recommends chdir("/")
and not chroot.
By the way, pivot_root(2):
The following restrictions apply to new_root and put_old:
- They must be directories.
- new_root and put_old must not be on the same filesystem as the current
root.
- put_old must be underneath new_root, that is, adding a nonzero number of
/.. to the string pointed to by put_old must yield the same directory as
new_root.
Just found out lxc does the pivot_root thing:
https://github.com/lxc/lxc/blob/4faaed332b99e60c0e1eed7dcc697f4fbf4f9441/src/lxc/conf.c#L1496
I still don't approve of this personally.
What I meant was that pivot_root(2) man page recommends chdir("/") and not chroot.
Yeah, but pivot_root(8) does. I'm not sure why there is a difference, will need to look into that.
Just found out lxc does the pivot_root thing: [...] I still don't approve of this personally.
I am aware that the use of .
as old_root is not compliant with the manpage, that it is tricky to understand, and that it is ugly. However, this way we don't need an empty directory in the new root.
Because we allow arbitrary filesystem to be bind-mounted on /
, and because it can be mounted read-only, and mounted concurrently by multiple instances of sio2jail, we'd need to make sure every such filesystem contains said empty directory from the very beginning, and fail if it doesn't.
This would break some things, require changes elsewhere in the system, and AFAIK would not make any observable behaviour better. Unless I'm missing something.
I guess a little uncompliance is fine given how often we update the kernel on prod
Yeah, but pivot_root(8) does.
I can't help but notice we're not using pivot_root(8) here.
I'm not sure why there is a difference, will need to look into that.
One reason I can think of is that pivot_root(8) is a binary and its execution of chdir("/")
is futile since it doesn't exec
at the end. But assuming this first-result-on-ddg repo is a straight mirror of upstream:
this seems to be for the reason mentioned and out of convenience, to break up whatever Links to the Past are left.
https://github.com/sio2project/sio2jail/blob/184eda34f43e2a82acc3647e14515130185d040e/src/ns/MountNamespaceListener.cc#L66-L67
no.