containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Rootless Install Guide #3195

Closed Snapstromegon closed 4 years ago

Snapstromegon commented 5 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind feature

Description

I don't know if this is even possible, but assume I'm on a host that doesn't allow me to run anything as root:

Is it possible to set up Podman rootless (this is allowed to include compiling from scratch), and could the setup guide include some help for this?

Even if it isn't possible to set up Podman completely without root, it would be great to have something like a checklist of what root has to provide for the setup to work.

My use case is that Podman's strength is removing the root daemon (compared to the evil d-word), so it's a good fit for e.g. shared hosting, where the user has no root but wants to run containers. My hoster might do some minimal software tweaking, but wouldn't install anything system-wide.

Maybe there is also a much easier way to solve my problem of installing Podman rootless, or even a guide to it, and I just wasn't able to find it myself...

Additional environment details (AWS, VirtualBox, physical, etc.):

Shared hosting running CentOS 7

rhatdan commented 5 years ago

Right now it requires shadow-utils, which provides newuidmap and newgidmap. I think the rest of the packages could be hacked together to run, and this would be a good exercise. You would probably need fuse-overlayfs and slirp4netns either installed on the system or built locally, and then you would need to customize the storage configuration in ~/.config/containers/storage.conf.
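As a rough starting point for the checklist asked for above, the helper binaries named in this comment can be looked up on the search path. This is a minimal sketch, not part of Podman: the tool list is only an assumption taken from this comment, and `find_tools` and `missing_tools` are hypothetical helper names.

```python
import shutil

# Helper binaries named in the comment above. Whether a given Podman
# build needs exactly these is an assumption of this sketch.
REQUIRED_TOOLS = ["newuidmap", "newgidmap", "fuse-overlayfs", "slirp4netns"]


def find_tools(names, path=None):
    """Map each tool name to its resolved location, or None if not found.

    `path` is an optional search path string (like $PATH); when it is None,
    shutil.which falls back to the PATH of the current environment.
    """
    return {name: shutil.which(name, path=path) for name in names}


def missing_tools(names, path=None):
    """Return the tool names that could not be found, in input order."""
    found = find_tools(names, path=path)
    return [name for name in names if found[name] is None]
```

Calling `missing_tools(REQUIRED_TOOLS)` on the target host gives the list of binaries that would still have to be provided by the hoster or built into the home directory. Passing something like `path="/home/me/bin"` (a made-up location) checks a private install directory instead.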

mheon commented 5 years ago

If you have access to a pre-built Podman binary (and Conmon and slirp4netns), and install them into your home directory, that should work fine (as long as newuidmap and newgidmap are available). If you want to build from source, that's more complicated, because we have a good number of build dependencies that you'd have to produce from scratch.

rhatdan commented 5 years ago

I believe that we can make the code work with a single UID without newuidmap and newgidmap.
You can execute unshare -r and enter a User Namespace with a single UID. You would only be able to run/build containers with a single UID, but this might be a useful use case. I have a lot of people from the HPC groups looking for this kind of solution.

@giuseppe ^^
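To see what "a single UID" means in practice, the kernel exposes the current user namespace mapping in /proc/self/uid_map, one line per range with three numbers: first ID inside the namespace, first ID outside, and range length. The sketch below only parses that text. `parse_id_map` and `mapped_id_count` are hypothetical helper names, and the sample values in the usage note are illustrative.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map text into (inside, outside, count) tuples.

    Lines that do not consist of exactly three integer fields are skipped.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        try:
            entries.append(tuple(int(field) for field in fields))
        except ValueError:
            continue
    return entries


def mapped_id_count(entries):
    """Total number of IDs available inside the namespace."""
    return sum(count for _inside, _outside, count in entries)
```

Inside a namespace created with `unshare -r`, the file would typically hold one line such as `0 1000 1` (root inside mapped to the calling user outside), so `mapped_id_count` gives 1. With newuidmap and a subordinate range configured, a second line adds the extra IDs.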

Snapstromegon commented 5 years ago

okay, I'll see if something like this works for my case - but nevertheless it might be good to mention something like @rhatdan's comment in the install guide.

TomSweeneyRedHat commented 5 years ago

+1000 I literally was looking through our docs today and didn't find much, if anything. I was thinking about putting together a tutorial for this, but thought we'd have one somewhere. We should have at least a tutorial in GitHub that goes through the setup process for shadow-utils, /etc/subuid, /etc/subgid, and a bit more. I'll see if I can hack together at least a rough tutorial that we can expand on.
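For the /etc/subuid and /etc/subgid part, each line has the form `name:start:count`, granting that user a range of subordinate IDs. A minimal sketch for checking whether a user has any range at all could look like this. `parse_subid` is a hypothetical helper, and it assumes entries are keyed by login name rather than by numeric UID.

```python
def parse_subid(text, user):
    """Return the (start, count) ranges for `user` from subuid-style text.

    Lines that are not of the form name:start:count are skipped.
    """
    ranges = []
    for line in text.splitlines():
        parts = line.strip().split(":")
        if len(parts) != 3 or parts[0] != user:
            continue
        try:
            ranges.append((int(parts[1]), int(parts[2])))
        except ValueError:
            continue
    return ranges
```

Reading /etc/subuid and calling `parse_subid(contents, "alice")` (with `alice` as a placeholder login) returns an empty list when root has not provisioned a range, which is the case where only the single-UID mode discussed above would be available.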

Snapstromegon commented 5 years ago

Just so I understand correctly: according to @rhatdan it should be possible to have a mode that doesn't need shadow-utils, at the cost of having only one UID, and according to @mheon it would then be possible to just grab the binaries from e.g. a local system and use them completely rootless?

This would just be a dream for many shared hosting clients!

mheon commented 5 years ago

I will caution that a lot of standard images require more than 1 UID to work, and thus will need a working shadow-utils. You won't be able to use the Fedora, BusyBox, or Ubuntu images, for starters.

giuseppe commented 5 years ago

That is already possible. We only raise a warning if Podman cannot set up multiple IDs. You should be fine to run it but, as @mheon pointed out, many images will fail.

cyphar commented 5 years ago

One way to solve the single-UID problem is to look into using the xattr trick that umoci uses to store the real uid/gid. You then fake the syscall responses to match. Unfortunately right now this requires PRoot (which uses ptrace) but post-5.0 kernels have new SECCOMP_RET_USER_NOTIF support which means we can fake it using seccomp and thus without the downsides of ptrace.

I have looked at rewriting the important bits of PRoot to use SECCOMP_RET_USER_NOTIF but have been busy with other yak shaving over the past few months.

(As an aside, one of the reasons I've become a bit unhappy -- though still am impressed with stuff like slirp4netns -- with the direction of rootless containers is that usecases like the one in this thread are no longer as straightforward as I envisioned -- when the whole point was to be able to avoid setuid entirely. There's a reason runc only falls back to the setuid binaries.)
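The xattr trick mentioned at the start of this comment can be illustrated roughly as follows: instead of calling chown, the intended owner is recorded in an extended attribute, and later reported in place of the real one. This is only a sketch of the idea. The attribute name `user.example.owner` and the `uid:gid` text encoding are made up for illustration and are not the format umoci uses, and the syscall-faking side (PRoot or seccomp) is not shown. `os.setxattr` and `os.getxattr` are Linux-only.

```python
import os

# Hypothetical attribute name for this sketch; not the one umoci uses.
OWNER_XATTR = "user.example.owner"


def encode_owner(uid, gid):
    """Serialize the intended owner as b'uid:gid' (illustrative format)."""
    return f"{uid}:{gid}".encode("ascii")


def decode_owner(raw):
    """Inverse of encode_owner: b'uid:gid' -> (uid, gid)."""
    uid, gid = raw.decode("ascii").split(":")
    return int(uid), int(gid)


def record_owner(path, uid, gid):
    """Remember the intended owner without calling chown (Linux only)."""
    os.setxattr(path, OWNER_XATTR, encode_owner(uid, gid))


def reported_owner(path):
    """Owner to report: the recorded one if present, else the real one."""
    try:
        return decode_owner(os.getxattr(path, OWNER_XATTR))
    except OSError:
        # No attribute recorded (or xattrs unsupported): use the real owner.
        st = os.stat(path)
        return st.st_uid, st.st_gid
```

A layer extractor running with a single UID would call `record_owner` where it would otherwise chown, and whatever intercepts stat-like calls would answer with `reported_owner`.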

giuseppe commented 5 years ago

> One way to solve the single-UID problem is to look into using the xattr trick that umoci uses to store the real uid/gid. You then fake the syscall responses to match. Unfortunately right now this requires PRoot (which uses ptrace) but post-5.0 kernels have new SECCOMP_RET_USER_NOTIF support which means we can fake it using seccomp and thus without the downsides of ptrace.

This is something I also wanted to play with in crun (https://github.com/giuseppe/crun/issues/22) but have had no time to look at yet. Are you planning to emulate only file system permissions, or also to mask syscalls like seteuid(2)? For file system permissions, another alternative, at least for rootless Podman, which is already using it, is to teach fuse-overlayfs to read and use these xattrs.

cyphar commented 5 years ago

PRoot does mask syscalls like seteuid(2) already (as does my hacky project remainroot), so we might as well. The really big issue is setgroups and getgroups: because unmapped UIDs show up and you cannot drop them, apt likes to throw a hissy fit, as do a lot of other tools.

rhatdan commented 5 years ago

@cyphar @giuseppe If we could get this supported in Podman, that would be a nice step forward. Would love to see some PRs moving this way.

giuseppe commented 5 years ago

> @cyphar @giuseppe If we could get this supported in Podman, that would be a nice step forward. Would love to see some PRs moving this way.

We need a monitor for doing the seccomp stuff. So we should either add these capabilities to conmon, which I think is not really the correct place (unless using libcrun directly?), or to the OCI runtime. In the latter case we need to change conmon to wait for the OCI runtime rather than the container process.

cyphar commented 5 years ago

And the problem with doing it in runc is that we'd now break the "no daemons for our containers" model...

giuseppe commented 5 years ago

> And the problem with doing it in runc is that we'd now break the "no daemons for our containers" model...

It will be opt-in, or just work in the runc run mode.

cyphar commented 5 years ago

We can definitely do it, I was just thinking out loud. I'd prefer it to be more pluggable where we provide a way for SECCOMP_RET_USER_NOTIF to be combined with the seccomp configuration.

baude commented 5 years ago

updates fellas?

TomSweeneyRedHat commented 5 years ago

I'm not sure if it covers the request completely, but we put this up about a month ago: https://github.com/containers/libpod/blob/master/docs/tutorials/rootless_tutorial.md It's most likely a start to a user's guide, and some of the doc folks had talked about using it as a seed for a more formal user's guide.

cyphar commented 5 years ago

That is an improvement, though I would like to sit down and see what breaks if I try to run an actually rootless environment à la the original case study for rootless containers (no setuid helpers or configuration of root-only files at all). Hopefully there wouldn't be too many things that break, but given that this is a use case that modern "rootless" containers don't care too much about, I wouldn't be shocked if there are a few things that don't work correctly.

rhatdan commented 5 years ago

I just merged into containers/storage a fix that allows ignoring chown failures. Setting this should allow a user namespace with a single UID to download and work with images. Pushing them somewhere might be an issue.

Work was asked for by some HPC people.
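The general idea of tolerating chown failures can be sketched as below. This is not the containers/storage code (which is written in Go); `apply_ownership` and its `ignore_chown_errors` parameter are hypothetical names used for illustration, and the `chown` argument exists only so the behaviour can be exercised without real privileges.

```python
import os


def apply_ownership(path, uid, gid, ignore_chown_errors=False, chown=os.chown):
    """Try to set the owner of `path`; optionally tolerate failure.

    Returns True if the chown call succeeded, False if it failed and the
    failure was ignored. With ignore_chown_errors=False the error is raised.
    """
    try:
        chown(path, uid, gid)
        return True
    except OSError:
        # In a single-UID user namespace, chown to an unmapped ID fails.
        if ignore_chown_errors:
            return False
        raise
```

With the flag set, files simply stay owned by the one available UID, which is why pulling and running images can work while ownership information is lost.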

vrothberg commented 4 years ago

Doing some cleanups in the issues; updates for this one?

TomSweeneyRedHat commented 4 years ago

I think we've just been nibbling at the tutorial as time has passed. Currently, @rhatdan, @giuseppe, @baude and others are working through F31/crun issues, and we'll probably need an update for this afterward.

rhatdan commented 4 years ago

Yes, we need to do some editing on this for cgroups v2.

github-actions[bot] commented 4 years ago

This issue had no activity for 30 days. In the absence of activity or the "do-not-close" label, the issue will be automatically closed within 7 days.

mheon commented 4 years ago

I think we have another issue already open for this?

mheon commented 4 years ago

#3932