Open ryantimwilson opened 1 day ago
I really dislike static assignments, it just creates headaches. We should try to get rid of that concept, and not add it to new places.
if you want a full uid mapping then that might be ok, but i'd call it "full". i.e. PrivateUsers=full.
@poettering ack no static assignments definitely would be nicer.
I see 2 implementations:
1. PrivateUsersRange=0-4294967295, which only takes effect if PrivateUsers=range
2. PrivateUsers=range:0-4294967295
I prefer a separate field (1) as it is more clear IMHO and easier to support multiple ranges if we need that in the future. Naming could probably be better though...I don't love PrivateUsersRange
What do you think?
Nah, I don't want static numeric assignments, hence I am voting for a more high-level PrivateUsers=full, I must say.
Oh sorry I misread your comment. PrivateUsers=full it is!
Component
No response
Is your feature request related to a problem? Please describe
Meta is migrating to using transient systemd units to start containers. Recently, PrivateUsers=identity was added to support 1:1 mapping of UID/GID in the root namespace: https://github.com/systemd/systemd/pull/34321.
The behavior was to map only the first 65536 UIDs/GIDs; anything above that range is mapped to nobody. This makes the behavior identical to nspawn.
However, this does not work for Meta because we have lots of UID/GIDs and need to map all UIDs 1:1. So it would be useful to have a uid_map like
0 0 4294967295
But the kernel in the init namespace uses a default uid_map of
0 0 4294967295
(see https://man7.org/linux/man-pages/man7/user_namespaces.7.html). And systemd detects whether it's in a non-init user namespace by checking whether uid_map differs from 0 0 4294967295: https://github.com/systemd/systemd/blob/893aa45886ef84b1827445dc438e410ad89fbbbf/src/basic/virt.c#L851
Thus, Meta actually uses a UID file like:
This ends up mapping all UIDs 1:1 up to 2^32 - 1 but also ensures systemd's running_in_userns() returns true.
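For illustration, here is a minimal Python sketch of the uid_map logic described above. It is a simplified model, not systemd's actual code: the helper names (parse_id_map, map_to_parent, looks_like_init_userns) are made up for this example, and the two-line split map in the demo is just one hypothetical way to cover the whole range, not necessarily the file we use.

```python
# Simplified model of uid_map handling; all names here are hypothetical.

# The single entry the kernel reports for the init user namespace.
INIT_MAP = (0, 0, 4294967295)


def parse_id_map(text):
    """Parse uid_map/gid_map text into (inside_start, outside_start, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 3:
            raise ValueError(f"malformed map line: {line!r}")
        entries.append(tuple(int(field) for field in fields))
    return entries


def map_to_parent(entries, inside_id):
    """Translate an ID inside the namespace to the parent namespace.

    Returns None when the ID is unmapped (such IDs show up as the overflow
    user, i.e. "nobody").
    """
    for inside_start, outside_start, length in entries:
        if inside_start <= inside_id < inside_start + length:
            return outside_start + (inside_id - inside_start)
    return None


def looks_like_init_userns(entries):
    """Approximate the check from the issue: a map equal to the init default
    is treated as "not in a user namespace"."""
    return entries == [INIT_MAP]


if __name__ == "__main__":
    maps = {
        "identity, first 65536 only": "0 0 65536\n",
        "init default": "0 0 4294967295\n",
        # Assumption: one possible split that still covers every ID 1:1.
        "hypothetical split": "0 0 1\n1 1 4294967294\n",
    }
    for name, text in maps.items():
        entries = parse_id_map(text)
        print(name, map_to_parent(entries, 70000), looks_like_init_userns(entries))
```

The point of the sketch is that a single full-range entry is indistinguishable from the init namespace under this kind of check, while a map split into more than one entry can cover the same IDs and still be detected as a non-init namespace.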
Describe the solution you'd like
I see a few possible approaches:
1. PrivateUsers=0,1:4294967295. Unlike nspawn, we would have to understand multiple ranges or systemd needs another way to detect we're in a non-init user namespace.
2. PrivateUsers=identity-all to map all UID/GIDs
3. PrivateUsers=identity to map all UIDs/GIDs
I mildly prefer option 2 with option 1 as a close second.
1 will certainly take the most implementation work re: parsing but is more extensible and consistent with nspawn.
2 is simpler to implement but is nice because it hides the nasty uid_map workaround.
3 is inconsistent with nspawn, don't like it.
Describe alternatives you've considered
For testing the new container runtime, one of our developers worked around this by hot patching systemd to do option 3 above. But we'd prefer not doing this.
The systemd version you checked that didn't have the feature you are asking for
257