Open rata opened 2 years ago
We'd love to support this. It is, obviously, dependent on support in moby/docker itself, but I'll see if we can help them out at all.
As an update: Kubernetes v1.25 has support for userns with stateless pods and we are aiming to support stateful pods in the coming Kubernetes versions.
@evol262 Any updates from the docker/moby side?
The docker/moby release process got a little hung up around 22.06, which they're sorting through. @neersighted or @corhere, we briefly discussed this what feels like a long time ago now, but how plausible would either of you guess an effort to get a somewhat more dynamic --userns would be?
It's been on the roadmap forever, and with kernels supporting ID-mapped mounts becoming available on LTS distros it is finally becoming practical to implement dynamic user namespaces in Moby. I'd say making it happen is very plausible @evol262.
I was wondering more about how the user interface around it might be structured, since that could take a while to sort out, but that detail is less important ;) @rata, @corhere is a maintainer, so that's a solid vote.
Hi, we have reworked the k8s implementation to always require idmap mounts.
Since k8s 1.27, the kubelet asks the runtime to use ID mappings for the mounts (they are part of the mount gRPC message). The container runtime should pass these mappings on to the OCI runtime, and that is basically all that is needed to support this.
containerd, runc, CRI-O, crun, and others are all making the changes. It would be great to see this in docker too :)
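To illustrate the hand-off described above, here is a minimal Go sketch of a runtime copying the ID mappings from a kubelet mount request into the mount it gives the OCI runtime. The types are simplified local stand-ins, not the real CRI or runtime-spec packages. Their field names and JSON tags are assumptions modeled on those APIs, and the "bind"/"rbind" choice is illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CRIIDMapping is a hypothetical stand-in for the mapping the kubelet sends.
type CRIIDMapping struct {
	HostID      uint32
	ContainerID uint32
	Length      uint32
}

// CRIMount is a hypothetical, trimmed-down version of the mount message.
type CRIMount struct {
	ContainerPath string
	HostPath      string
	UIDMappings   []CRIIDMapping
	GIDMappings   []CRIIDMapping
}

// OCIIDMapping and OCIMount are stand-ins for what the OCI runtime consumes.
type OCIIDMapping struct {
	ContainerID uint32 `json:"containerID"`
	HostID      uint32 `json:"hostID"`
	Size        uint32 `json:"size"`
}

type OCIMount struct {
	Destination string         `json:"destination"`
	Type        string         `json:"type"`
	Source      string         `json:"source"`
	Options     []string       `json:"options,omitempty"`
	UIDMappings []OCIIDMapping `json:"uidMappings,omitempty"`
	GIDMappings []OCIIDMapping `json:"gidMappings,omitempty"`
}

// convert copies the mappings one-to-one; no ID arithmetic is needed here.
func convert(in []CRIIDMapping) []OCIIDMapping {
	out := make([]OCIIDMapping, 0, len(in))
	for _, m := range in {
		out = append(out, OCIIDMapping{ContainerID: m.ContainerID, HostID: m.HostID, Size: m.Length})
	}
	return out
}

// ToOCIMount forwards the kubelet-provided mappings unchanged to the OCI mount.
func ToOCIMount(m CRIMount) OCIMount {
	return OCIMount{
		Destination: m.ContainerPath,
		Type:        "bind",            // assumption for this sketch
		Source:      m.HostPath,
		Options:     []string{"rbind"}, // assumption for this sketch
		UIDMappings: convert(m.UIDMappings),
		GIDMappings: convert(m.GIDMappings),
	}
}

func main() {
	req := CRIMount{
		ContainerPath: "/data",
		HostPath:      "/var/lib/example/volume", // hypothetical path
		UIDMappings:   []CRIIDMapping{{HostID: 65536, ContainerID: 0, Length: 65536}},
		GIDMappings:   []CRIIDMapping{{HostID: 65536, ContainerID: 0, Length: 65536}},
	}
	b, _ := json.MarshalIndent(ToOCIMount(req), "", "  ")
	fmt.Println(string(b))
}
```

The point of the sketch is that the runtime does not compute anything: it only relays the mappings it was given.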
@neersighted @corhere ^^
Technically simple! But userns remapping in Moby is still pretty limited for the time being. How difficult would this be to expand?
FYI: We are adding support for stateful pods in k8s 1.28. The runtime part is still very simple, as it just relies on idmap mounts for the ID handling.
I'm here in case anything is not clear with the KEP or the implementation :)
Hi!
I'm working on the KEP that will be implemented in 1.25 (the next k8s release) to support user namespaces. We are creating an implementation for containerd and CRI-O, but it would be nice if dockershim implemented that too.
I think there are some limitations docker needs to fix as a prerequisite for the implementation. IIUC docker only supports a single ID mapping shared by all containers running on the host. There is no support for multiple ID mappings yet. However, for isolation reasons, we are using a different ID mapping for each pod in Kubernetes, one that doesn't overlap with the mappings of other pods either. So we will need to use multiple ID mappings for containers, not just a single mapping shared by all containers as docker currently supports.
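As a rough illustration of what "a different, non-overlapping ID mapping per pod" means, here is a minimal Go sketch. The Allocator type, its methods, and the 65536-IDs-per-pod block size are all assumptions made for the example. It is not the kubelet's or docker's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// IDMapping mirrors the three columns of a Linux uid_map/gid_map entry.
type IDMapping struct {
	ContainerID uint32
	HostID      uint32
	Size        uint32
}

// idsPerPod is an assumed block size for this sketch.
const idsPerPod = 65536

// Allocator (hypothetical) hands out disjoint host ID blocks, one per pod.
type Allocator struct {
	firstHostID uint32
	maxPods     uint32
	used        map[uint32]string // slot index -> pod name
}

func NewAllocator(firstHostID, maxPods uint32) *Allocator {
	return &Allocator{firstHostID: firstHostID, maxPods: maxPods, used: map[uint32]string{}}
}

// Allocate reserves the lowest free slot and returns its mapping.
func (a *Allocator) Allocate(pod string) (IDMapping, error) {
	for slot := uint32(0); slot < a.maxPods; slot++ {
		if _, taken := a.used[slot]; !taken {
			a.used[slot] = pod
			return IDMapping{
				ContainerID: 0, // IDs inside the pod start at root
				HostID:      a.firstHostID + slot*idsPerPod,
				Size:        idsPerPod,
			}, nil
		}
	}
	return IDMapping{}, errors.New("no free ID range left")
}

// Release frees every slot held by the given pod.
func (a *Allocator) Release(pod string) {
	for slot, name := range a.used {
		if name == pod {
			delete(a.used, slot)
		}
	}
}

// Overlaps reports whether two mappings share any host ID.
func Overlaps(x, y IDMapping) bool {
	return x.HostID < y.HostID+y.Size && y.HostID < x.HostID+x.Size
}

func main() {
	a := NewAllocator(65536, 4)
	p1, _ := a.Allocate("pod-a")
	p2, _ := a.Allocate("pod-b")
	fmt.Printf("pod-a: %+v\npod-b: %+v\noverlap: %v\n", p1, p2, Overlaps(p1, p2))
}
```

With a single shared mapping, every container would get the same host range. Here, root in pod-a and root in pod-b map to different host IDs.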
Some very old comments on the linked moby issue mention that this limitation might be simpler to solve once containerd 1.0 is used, which is already the case. Do you know if this limitation is indeed "easy" to fix now?
It would be great if you could implement userns support for Kubernetes pods in dockershim :)