Open sfc-gh-cxie opened 5 months ago
https://github.com/google/gvisor/issues/311 answers your second question.
The --rootless option skips a few security layers that can be set up by root only.
Thanks Jing. For some context: I want to run arbitrary code inside gVisor. If we run it in root mode and the arbitrary code breaks out of gVisor, it will have root access, which is why we want to run it in rootless mode.
I wonder what the suggested approach is here: should I use host network mode (which is less secure from a network perspective), or should I try to work on supporting sandbox network mode with rootless?
If we run it in root mode and the arbitrary code breaks out of gVisor, it will have root access
gVisor re-executes itself as an unprivileged user before starting the container. Therefore, if the code manages to break gVisor, it will not have root access.
When running with sudo, gVisor only temporarily uses root privileges in order to create the namespaces and other layers of security (seccomp rules, pivot_root, etc.) prior to entering the sandboxed mode (dropping all its privileges and changing its own user to unprivileged at that point) and starting the container code. I suggest reading the architecture docs and security model page for more info.
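To make the "re-execute as an unprivileged user" step concrete, here is a minimal Go sketch of how a process can prepare a child that runs under a different UID/GID. This is not gVisor's actual code: the helper name unprivilegedCommand and the value 65534 (the conventional "nobody" ID on many Linux distributions) are assumptions for illustration, and the sketch only builds the command without starting it, since actually switching users requires the privileges discussed above.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// nobodyID is the conventional UID/GID of "nobody" on many Linux
// distributions. It is an assumption chosen for illustration.
const nobodyID = 65534

// unprivilegedCommand prepares (but does not start) a child process whose
// credentials would be switched to an unprivileged UID/GID at exec time.
// Hypothetical helper, not part of gVisor.
func unprivilegedCommand(path string, args ...string) *exec.Cmd {
	cmd := exec.Command(path, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Credential: &syscall.Credential{Uid: nobodyID, Gid: nobodyID},
	}
	return cmd
}

func main() {
	// /proc/self/exe refers to the currently running binary on Linux,
	// which is how a program can re-execute itself.
	cmd := unprivilegedCommand("/proc/self/exe", "child")
	cred := cmd.SysProcAttr.Credential
	fmt.Printf("would re-exec %v as uid=%d gid=%d\n", cmd.Args, cred.Uid, cred.Gid)
}
```

Calling cmd.Start() on such a command only succeeds if the parent is allowed to change its UID, which is why the privileged setup has to happen before the drop and not after it.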
Thanks Etienne. I'm reading the code, but it seems gVisor only re-executes itself in rootless mode? (https://github.com/google/gvisor/blob/c8da73daaf635e7ad372ed6b49ce68e8ab4010b8/runsc/specutils/namespace.go#L251) But per what you've said, it seems we should always run gVisor in root mode, with more features and extra security checks?
~Hm, this link goes to a private repository?~ Edit: link fixed
The re-execution I am referring to is here, which sets the UID here (and drops all capabilities on the next line).
it seems we should always run gVisor in root mode, with more features and extra security checks?
Yes. Generally speaking, gVisor is intended to take care of doing whatever is the most secure thing to do given whatever privileges it has.
Thanks for updating the link! Yes, the code you've linked to re-runs itself as "root" within a new namespace; this is different from "root" in the initial user namespace, and does not grant the process any additional privilege within the initial user namespace (which is where "root" access would be sensitive).
Moreover, this code executes as part of the runsc run command, which does not run any container code. The process that runs the actual untrusted container is a subprocess of that: it's the runsc boot command, which is the one that goes through the codepath with callSelfAsNobody.
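The point about "root" within a new user namespace can be illustrated with the ID mapping that Linux keeps per user namespace (exposed as /proc/<pid>/uid_map). The Go sketch below is a simplified model of that translation, not gVisor code; the idMap type, the toOutside function, and the example UID 1000 are all hypothetical names and values chosen for illustration.

```go
package main

import "fmt"

// idMap models one line of a uid_map file: a range of IDs inside the user
// namespace mapped onto a range of IDs in the parent namespace.
type idMap struct {
	InsideID  int // first ID as seen inside the namespace
	OutsideID int // first ID as seen from the parent namespace
	Size      int // number of consecutive IDs covered
}

// toOutside translates an ID seen inside the namespace into the ID the
// parent namespace sees. ok is false when the ID has no mapping.
func toOutside(maps []idMap, inside int) (outside int, ok bool) {
	for _, m := range maps {
		if inside >= m.InsideID && inside < m.InsideID+m.Size {
			return m.OutsideID + (inside - m.InsideID), true
		}
	}
	return 0, false
}

func main() {
	// Hypothetical mapping: UID 0 inside the new namespace is the
	// unprivileged UID 1000 in the initial user namespace.
	maps := []idMap{{InsideID: 0, OutsideID: 1000, Size: 1}}

	outside, ok := toOutside(maps, 0)
	fmt.Println(outside, ok) // prints "1000 true"
}
```

Under such a mapping, a process that is UID 0 inside the namespace is still treated as UID 1000 by the initial user namespace, which is why being "root" there grants no additional privilege on the host.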
That makes sense, thanks. One more clarification on
gVisor re-executes itself as an unprivileged user before starting the container. Therefore, if the code manages to break gVisor, it will not have root access.
I'm thinking about the case where an attacker escapes the container and gets the original root privileges. Or are we saying that since we've dropped all capabilities, even if an attacker manages to escape from the container, they are unable to get root privileges either? But should we still be concerned about file access?
Or are we saying that since we've dropped all capabilities, even if an attacker manages to escape from the container, they are unable to get root privileges either?
Correct. If the code inside the container manages to get the privileges of the runsc boot process it runs in, all it can do is whatever the runsc boot process is permitted to do. But that process has already dropped all of its privileges before running anything inside the container.
But should we still be concerned about file access?
Before running any container code, runsc boot also runs in a separate mount namespace, and calls pivot_root on itself to remove its own access to any host file that isn't a container root filesystem file or mounted volume.
Additionally, if you use --directfs=false, the runsc boot process's seccomp rules will forbid use of the open and openat syscalls, i.e. it will not be able to open any host file at all.
Description
I'm not sure if this is working as intended (WAI): is it possible to do runsc run in rootless mode with sandbox network mode?
Based on my understanding, sandbox network mode is more secure but root mode is less secure. Is that contradictory? Thanks
Is this feature related to a specific bug?
No response
Do you have a specific solution in mind?
No response