Open robsmith11 opened 7 years ago
setuid will be an issue, but this particular error is because /dev/fuse is missing in the /dev in the sandbox.
Ah, you're right. Adding /dev/fuse allows encfs to run as a root user and as non-root I get:
fusermount: mount failed: Operation not permitted
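For anyone reproducing this, here is a minimal sketch of how the device can be exposed. It only assembles a bwrap command line; the helper name `bwrap_argv_with_fuse` and the encfs paths are made up for illustration, and the only bwrap options used are `--ro-bind`, `--dev`, `--dev-bind`, `--proc` and `--bind`. As noted above, this gets past the "device not found" error but does not make the setuid fusermount work for a non-root user.

```python
import shlex


def bwrap_argv_with_fuse(command, extra_binds=()):
    """Build a bwrap argv that exposes the host's /dev/fuse in the sandbox.

    `command` is the argv to run inside the sandbox; `extra_binds` is an
    iterable of (source, destination) pairs added as read-write binds.
    """
    argv = [
        "bwrap",
        "--ro-bind", "/", "/",   # read-only view of the host root
        "--dev", "/dev",         # fresh minimal /dev, which has no /dev/fuse
        "--dev-bind", "/dev/fuse", "/dev/fuse",  # add the FUSE device on top
        "--proc", "/proc",
    ]
    for source, destination in extra_binds:
        argv += ["--bind", source, destination]
    argv += ["--", *command]
    return argv


if __name__ == "__main__":
    # Hypothetical encfs invocation; the paths are placeholders.
    example = bwrap_argv_with_fuse(
        ["encfs", "/home/user/.crypt", "/home/user/plain"],
        extra_binds=[("/home/user", "/home/user")],
    )
    print(shlex.join(example))
```

The `--dev-bind` has to come after `--dev /dev`, because bwrap applies its mount options in order and the new `/dev` would otherwise hide the device.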
Is it possible to enable setuid support? Or would that go against the design goals of bwrap?
No, setuid is not possible. We use the PR_SET_NO_NEW_PRIVS prctl() to make filesystem namespaces safe. Without it you could use bwrap to fool a setuid-on-host binary to load or modify the wrong file.
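To make the prctl() concrete, here is a small sketch (not bubblewrap's actual code, which is C) that sets the flag in a forked child via ctypes and reads it back. The function name `nnp_in_child` is invented for this example; the constants 38 and 39 are the values of PR_SET_NO_NEW_PRIVS and PR_GET_NO_NEW_PRIVS from `<linux/prctl.h>`. It assumes a Linux system.

```python
import ctypes
import os

PR_SET_NO_NEW_PRIVS = 38  # values from <linux/prctl.h>
PR_GET_NO_NEW_PRIVS = 39


def nnp_in_child():
    """Fork, set no_new_privs in the child, return (before, rc, after).

    The flag is set in a child so the calling process is left untouched:
    once set, no_new_privs cannot be cleared and is inherited by children.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: query the flag, set it, query it again, report to the parent.
        os.close(read_fd)
        before = libc.prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0)
        rc = libc.prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)
        after = libc.prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0)
        os.write(write_fd, f"{before} {rc} {after}".encode())
        os._exit(0)
    # Parent: collect the child's report.
    os.close(write_fd)
    data = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    before, rc, after = (int(field) for field in data.split())
    return before, rc, after


if __name__ == "__main__":
    print(nnp_in_child())
```

After the flag is set, an execve() of a setuid binary such as fusermount still runs it, but without the extra privileges, which matches the "Operation not permitted" seen above.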
Ok, thanks. I'll have a go at removing it in local branch then.
I think we could design a mechanism to run binaries (including setuid) after the mount namespace is set up. Something like bwrap --exec sshfs user@hostname:/path /newroot/mnt \;, where we run the binaries from the host (as the user), but have /newroot mounted in a new mntns.
Offhand...it seems like it'd be safe to move the NNP invocation until the final execve().
Offhand...it seems like it'd be safe to move the NNP invocation until the final execve().
SECCOMP_SET_MODE_FILTER requires CAP_SYS_ADMIN or NNP to be set, so it's not quite as simple.
Is there any news in this regard? I am having the same trouble with fusermount. Would it be possible, for example, to have an exception for this case that runs in a "less-secure" mode or something like that?
I also would like to know if there is any progress with this.
Since Electron 6 there is a sandbox mode enabled by default which requires the ability to run suid files. This will therefore affect a number of applications. Are there any plans for an option to allow this? Using bubblewrap in combination with Electron apps allows for stricter sandboxing.
The way the Linux kernel supports unprivileged user namespaces requires the use of PR_SET_NO_NEW_PRIVS, which makes setuid not work. This is not something we can change from userspace, nor would it be a secure sandbox if we could.
For flatpak we're doing work to support the chrome sandbox by talking to a service on the host that then spawns a new bubblewrap instance with different settings. I think that is the only way something like this can be done with the current kernel APIs.
After playing a little bit with the kernel.unprivileged_userns_clone flag I was wondering if this could be useful.
When running a kernel with unprivileged user namespaces disabled, a suid bwrap cannot spawn other sandboxes which use user namespaces. But a suid bwrap could run sysctl kernel.unprivileged_userns_clone=1, drop privs and start the second sandbox/user namespace. This is an ugly solution of course, but it won't create more of an attack surface than any kernel with unprivileged user namespaces enabled.
The ideal solution I could think of would be a capability to allow the use of user namespaces for the enabled binaries only. This would allow sandbox applications like bubblewrap to run without being suid and without unprivileged user namespaces being exposed to all userspace applications, as most distributions do today. Has this already been discussed somewhere? Is there an obvious reason why this has not been implemented? Does it make sense to propose this on the kernel mailing list?
But a suid bwrap could run sysctl kernel.unprivileged_userns_clone=1, drop privs and start the second sandbox/user namespace
That would allow any process, including potential attackers, to create new user namespaces. If the distribution's kernel maintainer considers unprivileged user namespaces to be an unacceptable security risk, then they would consider the bwrap that did this to be introducing a security flaw into the system.
We want distributions to install bwrap - setuid, if necessary - so that programs and libraries can rely on being able to use it. Installing a setuid executable is an act of trust. If bwrap behaves in an untrustworthy way, then security-conscious distributions will stop installing it setuid (or at all), and everything that relies on it will stop working.
The ideal solution I could think of would be a capability to allow the use of user namespaces for the enabled binaries only.
Part of what is necessary to make an executable setuid (or setcap) is trusting that it will not let the user who ran it abuse its elevated privileges. bubblewrap achieves this by using the NO_NEW_PRIVS prctl to give up its ability to gain privileges in the future - after that prctl has been used, executing setuid binaries does not give you the privileges you would usually get.
This is a required part of making a setuid bwrap not be a security flaw. If it didn't give up privileges, then you could use it to subvert any program that trusts particular filesystem paths to have non-malicious contents - in particular, something like bwrap --dev-bind / / --bind ~/sudoers /etc/sudoers -- sudo -s would be root privilege escalation.
The ideal solution would be for distribution maintainers to be confident that allowing unprivileged creation of new user namespaces is not going to introduce a security vulnerability. Unfortunately, according to https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=898446, as of May 2018 disabling unprivileged creation of user namespaces was neutralising a root privilege escalation "every month or so".
I don't know whether this is still the case. If it isn't, then perhaps distributions like Debian would be willing to consider changing the default (while still allowing the feature to be disabled for "hardened" systems that do not need to use Flatpak or unprivileged LXC), like Ubuntu did.
Thanks for the comment. Regarding the first idea with setting the ns flag, I agree.
Regarding the capability option, I am not sure whether I was misunderstood.
There are two common options and one proposed idea:
1. Kernel permits the use of unprivileged user namespaces, allowing bubblewrap to work as an unprivileged sandbox with the option to spawn new sandboxes inside an existing one using user namespaces. Everything works, but the unprivileged user namespace hell is open for all processes.
2. Kernel does not permit unprivileged user namespaces. Bubblewrap is suid, which lets it run user namespaces and then drop privs. Cannot run other sandboxes inside. Limited functionality, and suid risk is involved in every application that needs user namespaces (like chrome-sandbox).
3. (Idea) Kernel does not generally permit unprivileged user namespaces. A user namespace capability is used on all sandbox tools (bwrap, chrome-sandbox etc.), allowing bwrap to run as an unprivileged* process with the option to not drop this capability, so that further sandboxes inside can use unprivileged user ns. Everything works, but neither suid nor unprivileged user namespace hell for all processes is needed.
From my perspective, option 3 has the smallest attack surface. Even if bwrap always dropped its user ns capability, preventing new/stacked sandboxes, it would reduce the suid risk (for all sandboxing applications) to the attack surface of running user ns, which is smaller than suid in any case. Even better, it would allow all distributions to drop support for unprivileged user namespaces, reducing their attack surface a lot.
user namespace capability is used on all sandbox tools (bwrap, chrome-sandbox etc.) allowing bwrap to run as unprivileged* process with the option to not drop this capability to allow further sandboxes inside to use unprivileged user ns.
Let's suppose I'm an attacker, and I want to exploit a kernel bug that is exposed by the ability to create a new userns, and use it to get root privilege escalation. In your proposed system, I'm going to choose to run a bwrap sandbox. Inside that sandbox, I'll make use of the proposed capability that I've been given, which lets me create a new userns and carry on with my attack.
The defender has to win every time, the attacker only has to win once. If an attacker can get a desired capability, then that's essentially equivalent to the attacker always having that capability, because they will choose to do whatever is necessary to get it.
Thinking about this, option 3 is only secure if the caps are dropped by default, which is probably why this is what bwrap does ... This still reduces the risk the most, even with limited features.
the option 3 is only secure if the caps are dropped by default
No, it's only secure if they are always dropped. If an attacker can choose whether or not to drop capabilities, then obviously they'll choose not to.
(The attacker we're thinking about in this threat model is an ordinary unprivileged user, with arbitrary code execution as their own unprivileged uid, who wants to escalate from their own uid to root; so they can freely choose what arguments they pass to bwrap.)
user namespace capability is used on all sandbox tools (bwrap, chrome-sandbox etc.) allowing bwrap to run as unprivileged* process with the option to not drop this capability to allow further sandboxes inside to use unprivileged user ns.
Let's suppose I'm an attacker, and I want to exploit a kernel bug that is exposed by the ability to create a new userns, and use it to get root privilege escalation. In your proposed system, I'm going to choose to run a bwrap sandbox. Inside that sandbox, I'll make use of the proposed capability that I've been given, which lets me create a new userns and carry on with my attack.
The defender has to win every time, the attacker only has to win once. If an attacker can get a desired capability, then that's essentially equivalent to the attacker always having that capability, because they will choose to do whatever is necessary to get it.
Yes, I noticed 5 seconds later :)
In any case, with all caps dropped there is still the same functionality as with the suid option, but with less risk.
Part of the problem here is that some of the upstream Linux kernel developers consider user namespaces to be safe (except for the attacks they enable, which are considered to be kernel bugs to be fixed), and think they should just be enabled all the time. Other upstream Linux kernel developers think user namespaces are fundamentally unsafe (but don't want to stand in the way of the feature existing, and being used on systems with no untrusted users), and so think security-conscious distros/sysadmins/users should disable it in their kernel configuration. This is fine if everyone compiles their own kernel configured to reflect their own security tradeoffs, but that isn't how binary distributions work, and distributions have to somehow ship one kernel that is suitable for both points of view.
The whole kernel.unprivileged_userns_clone mechanism is a downstream patch (initially from Ubuntu, now used by Debian and a non-default Arch Linux kernel) to try to make the decision a bit less permanent, by at least doing it at runtime instead of at compile-time. Without that patch, distros like Debian would have to disable the feature completely, with no way to enable it at runtime. However, as far as I'm aware, there has been no progress in getting it upstream, because the faction of kernel developers who think user namespaces are safe reject it.
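As an illustration of the runtime switch, here is a rough sketch that reads the sysctl through /proc. The function name `unprivileged_userns_setting` is made up for this example, and treating a missing file as "this kernel does not carry the downstream patch" is an assumption: it says nothing about whether user namespaces are otherwise available.

```python
from pathlib import Path

# /proc path corresponding to the kernel.unprivileged_userns_clone sysctl.
USERNS_CLONE = Path("/proc/sys/kernel/unprivileged_userns_clone")


def unprivileged_userns_setting(path=USERNS_CLONE):
    """Return True if the sysctl reads 1, False for any other value, and
    None if the file is absent (assumed: kernel without the patch)."""
    try:
        text = Path(path).read_text().strip()
    except FileNotFoundError:
        return None
    return text == "1"


if __name__ == "__main__":
    print(unprivileged_userns_setting())
```

The `path` parameter exists only so the function can be pointed at a test file; on a real system the default is the one that matters.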
Adding more capabilities seems likely to suffer the same fate. Also, binaries with elevated capabilities are like a "mini-setuid" - they have to be just as careful about what they're willing to do based on user input as setuid binaries do, because if they aren't, the result would be equivalent to giving the capability to everyone, which would defeat the point of it being a capability.
Thanks for explaining.
If I try to mount an encfs folder inside bwrap, I get the error:
fuse: device not found, try 'modprobe fuse' first
The fuse module is loaded, and encfs works fine when used outside of bwrap.
EDIT: I'm guessing this has something to do with encfs using fusermount, which has the setuid bit.