containers / podman

error creating libpod runtime: there might not be enough IDs available in the namespace #3421

Closed · juansuerogit closed this 5 years ago

juansuerogit commented 5 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

I have RHEL servers in the 7.x range (I think they are 7.4 or 7.5) that we currently run containers on with docker-compose. I went to a Red Hat conference and learned about Podman, so I want to use Podman in production to help us get away from the big fat daemons and to stop running containers as root.

To that end I have created a CentOS 7.5 VM on my laptop and installed Podman, but I cannot seem to get the uidmap functionality to work.

I'm hoping that once we solve this uidmap bug I'm encountering, we can take this and run it on the RHEL 7.4 servers.

On RHEL 7.4 we can only operate as a regular user, so we need to figure out rootless Podman.

I understand that some changes to the OS are needed and that we need administrative control to make them, like configuring subuid and subgid and the kernel params to enable user namespaces. We can do that. But on a day-to-day basis, including running the production containers, we have to be able to run rootless Podman and back up and recover the files as the same regular user (not root).

In addition, I'm not sure how to map an existing user in the container image, for example mongod (the mongodb user), to the regular server user. But that's maybe getting ahead of ourselves.

Steps to reproduce the issue:

  1. clean Centos 7.5 VM
  2. logged into a regular user called "meta" (not root)
  3. sudo grubby --args="namespace.unpriv_enable=1 user_namespace.enable=1" --update-kernel="/boot/vmlinuz-3.10.0-957.5.1.el7.x86_64"
  4. sudo yum -y update && sudo yum install -y podman
  5. sudo echo 'user.max_user_namespaces=15076' >> /etc/sysctl.conf
  6. sudo echo 'meta:100000:65536' >> /etc/subuid
  7. sudo echo 'meta:100000:65536' >> /etc/subgid
  8. sudo reboot
  9. podman run -dt --uidmap 0:100000:500 ubuntu sleep 1000
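
(A note on steps 5-7: with sudo echo '...' >> file the redirection is performed by the unprivileged shell, not by sudo, so the append fails with "permission denied" unless you are already in a root shell. A form that works from a regular user is:)

echo 'user.max_user_namespaces=15076' | sudo tee -a /etc/sysctl.conf
echo 'meta:100000:65536' | sudo tee -a /etc/subuid
echo 'meta:100000:65536' | sudo tee -a /etc/subgid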

Describe the results you received:

Error: error creating libpod runtime: there might not be enough IDs available in the namespace (requested 100000:100000 for /home/meta/.local/share/containers/storage/vfs): chown /home/meta/.local/share/containers/storage/vfs: invalid argument

Describe the results you expected:

I expected a pod/container to be running, and that I could exec into it and create files inside the container as user root.

Upon exiting the container, I expect those files to be owned by user "meta".

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Version:            1.3.2
RemoteAPI Version:  1
Go Version:         go1.10.3
OS/Arch:            linux/amd64

Output of podman info --debug:

WARN[0000] using rootless single mapping into the namespace. This might break some images. Check /etc/subuid and /etc/subgid for adding subids 
debug:
  compiler: gc
  git commit: ""
  go version: go1.10.3
  podman version: 1.3.2
host:
  BuildahVersion: 1.8.2
  Conmon:
    package: podman-1.3.2-1.git14fdcd0.el7.centos.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 1.14.0-dev, commit: e0b5a754190a3c24175944ff64fa7add6c8b0431-dirty'
  Distribution:
    distribution: '"centos"'
    version: "7"
  MemFree: 410226688
  MemTotal: 3973316608
  OCIRuntime:
    package: runc-1.0.0-59.dev.git2abd837.el7.centos.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.0'
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 4
  hostname: min0-kube0
  kernel: 3.10.0-957.21.3.el7.x86_64
  os: linux
  rootless: true
  uptime: 2h 25m 41.8s (Approximately 0.08 days)
registries:
  blocked: null
  insecure: null
  search:
  - registry.access.redhat.com
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.centos.org
store:
  ConfigFile: /home/meta/.config/containers/storage.conf
  ContainerStore:
    number: 0
  GraphDriverName: vfs
  GraphOptions: null
  GraphRoot: /home/meta/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 0
  RunRoot: /tmp/1000
  VolumePath: /home/meta/.local/share/containers/storage/volumes

Additional environment details (AWS, VirtualBox, physical, etc.):

CentOS 7.5 VM

mheon commented 5 years ago

--uidmap 0:100000:500 looks like the problem. You're requesting to map to UID 100000 with rootless Podman (I'm presuming that last Podman command in your reproducer is run without sudo).

You don't need to use --uidmap with rootless Podman - we'll automatically select the UID/GID ranges from subuid and subgid. You only need the uidmap flag if you want to change the way users are allocated within the container (for example, by default, the user launching Podman is mapped into the rootless container as UID 0 - you can change that with a few --uidmap args).

Just running Podman as a non-root user, no extra arguments or special flags (but with a configured /etc/subuid and /etc/subgid), is enough to launch your containers inside an unprivileged user namespace.
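
(For example, with /etc/subuid and /etc/subgid configured, a plain rootless run like this should work with no mapping flags at all:)

podman run -dt ubuntu sleep 1000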

mheon commented 5 years ago

And to provide further clarity on why it fails: --uidmap is trying to map to UID 100000, which is not mapped into the rootless user namespace. That namespace only has the 65536 UIDs from the ranges in /etc/subuid and /etc/subgid (plus one more: the UID/GID of the user that launches it). Mapping to UID 100000 and higher won't work, since we don't have any UIDs that high available.

mheon commented 5 years ago

Depends on how you want to use it... There's no requirement that the user running in the container must match the user who ran Podman. However, if you have volumes in the container, and you need to access them from the host, you generally will need to ensure the UIDs match. (Alternatively, you can use podman unshare to get a shell with UID/GID mappings matching the rootless container).
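
(For instance, to fix the ownership of a volume directory from the host side, you can run chown inside the rootless user namespace; the path and IDs below are illustrative:)

podman unshare chown -R 23:23 /home/meta/backup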

Technically, you'll also need 3 UID maps: one for UIDs below 23, one for 23 itself, and one for UIDs above 23. Because of this, we generally recommend just running the service in the container as UID 0 - it's not really root, it's the user that launched the container, so you don't give up anything in terms of security. (A sketch of that three-way split follows.)
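
(A sketch of the three-way split, assuming the service runs as UID 23 in the container and should map to the user who launched Podman. The mappings refer to the intermediate rootless namespace, where the launching user is ID 0 and the subordinate range fills IDs 1-65536:)

podman run -dt \
  --uidmap 0:1:23 \
  --uidmap 23:0:1 \
  --uidmap 24:24:65513 \
  ubuntu sleep 1000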

juansuerogit commented 5 years ago

OK, thanks, that got me past that error, but now I'm running rootless and getting image-related errors.

podman run -v /home/meta/backup:/root/backup -dt docker.io/centos:latest sleep 100

Note: I'm using the fully qualified image name here because without it I get another type of error. Furthermore, I can't seem to pull from my company's registry either, even though I'm logged in to Docker via their tools. But currently I'm stuck at this error:

podman run -v /home/meta/backup:/root/backup -dt docker.io/centos:latest sleep 100

WARN[0000] using rootless single mapping into the namespace. This might break some images. Check /etc/subuid and /etc/subgid for adding subids
Trying to pull docker.io/centos:latest...
Getting image source signatures
Copying blob 8ba884070f61 done
Copying config 9f38484d22 done
Writing manifest to image destination
Storing signatures
ERRO[0026] Error while applying layer: ApplyLayer exit status 1 stdout: stderr: there might not be enough IDs available in the namespace (requested 0:54 for /run/lock/lockdev): lchown /run/lock/lockdev: invalid argument
ERRO[0026] Error pulling image ref //centos:latest: Error committing the finished image: error adding layer with blob "sha256:8ba884070f611d31cb2c42eddb691319dc9facf5e0ec67672fcfa135181ab3df": ApplyLayer exit status 1 stdout: stderr: there might not be enough IDs available in the namespace (requested 0:54 for /run/lock/lockdev): lchown /run/lock/lockdev: invalid argument
Failed
Error: unable to pull docker.io/centos:latest: unable to pull image: Error committing the finished image: error adding layer with blob "sha256:8ba884070f611d31cb2c42eddb691319dc9facf5e0ec67672fcfa135181ab3df": ApplyLayer exit status 1 stdout: stderr: there might not be enough IDs available in the namespace (requested 0:54 for /run/lock/lockdev): lchown /run/lock/lockdev: invalid argument

mheon commented 5 years ago

WARN[0000] using rootless single mapping into the namespace. This might break some images. Check /etc/subuid and /etc/subgid for adding subids

There's your problem.

Do you have newuidmap and newgidmap binaries installed?
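
(A quick way to check; the package name varies by distribution, but on CentOS/RHEL these binaries are usually shipped by shadow-utils:)

command -v newuidmap newgidmap
ls -l /usr/bin/newuidmap /usr/bin/newgidmap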

juansuerogit commented 5 years ago

No, the directions at https://github.com/containers/libpod/blob/master/install.md didn't say to do this.

cat /etc/centos-release
CentOS Linux release 7.6.1810 (Core)

Shall I follow these directions? https://www.scrivano.org/2018/10/12/rootless-podman-from-upstream-on-centos-7/

juansuerogit commented 5 years ago

In addition, when I create the directory manually, I cannot exec into the container...

After running mkdir ./backup and then podman run -v /home/meta/backup:/root/backup -dt docker.io/centos:latest sleep 100

the container can be seen as running:

e1516b7986b9  docker.io/library/centos:latest  sleep 100  3 seconds ago  Up 2 seconds ago  nervous_williamson

But when I try to exec:

podman exec -ti -l bash
exec failed: container_linux.go:345: starting container process caused "process_linux.go:91: executing setns process caused \"exit status 22\""
Error: exit status 1

mheon commented 5 years ago

@giuseppe Any idea about that exit status out of runc? Sounds like something we might have fixed in a more recent version.

mheon commented 5 years ago

RE: the Docker issue - I'll look into this tomorrow. If we're not matching Docker, that's definitely a bug.

juansuerogit commented 5 years ago

Thanks, I'll check back sometime tomorrow. FYI, my requirement is to be able to run rootless. Here is the docker version... not sure if they are clashing. I didn't install runc or anything else.

docker version
Client: Version: 18.09.6

podman version
Version: 1.3.2

rhatdan commented 5 years ago

We explicitly decided not to follow Docker on this one (creating the source directory of a bind-mount volume on the host when it does not exist). I believe that this is a bug in Docker, since it could lead to user typos being silently ignored and unexpected directories/volumes being created.

rhatdan commented 5 years ago

There are other flags in the kernel that need to be set to use User Namespace on RHEL7/Centos 7. @giuseppe PTAL

giuseppe commented 5 years ago

I see different issues here. The blog post I wrote some time ago seems outdated; I'll need to write another one.

So, the first thing: newuidmap/newgidmap seem to be missing. You'll need to install them, or most images won't work (same issue as https://github.com/containers/libpod/issues/3423).

You also need to update runc, since the version you are using has various issues with rootless containers, e.g. it will complain that gid=5 uses an unmapped ID even though that ID is present in the user namespace.

Currently upstream Podman is broken for RHEL 7.5; the issue is being addressed with https://github.com/containers/libpod/pull/3397

llchan commented 5 years ago

I have podman working on my normal host, but today when I went to try it on a different host I saw the "not enough IDs available" error mentioned here. I must be forgetting a step that I ran on the other host, so if we could put together a pre-flight checklist that would be helpful. Off the top of my head here are the things I checked:

What am I forgetting? Is there something I can run to pinpoint the issue?

mheon commented 5 years ago

Is the image requesting an ID over 65k? Some images do include UIDs in the million range - those can break even for a properly configured rootless setup.

llchan commented 5 years ago

I don't think so, it said (requested 0:42 for /etc/shadow) for the alpine:latest I was testing with.

rhatdan commented 5 years ago

Does podman unshare work?

podman unshare cat /proc/self/uid_map

llchan commented 5 years ago

Yes, I think so:

$ podman unshare cat /proc/self/uid_map
         0      12345          1

unshare -U also appears to work.

rhatdan commented 5 years ago

That indicates that the user executing podman unshare only has one UID (12345). I would guess that /etc/subuid does not have an entry for that user.
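
(For comparison, with a working subuid/subgid configuration the map should show a second line covering the subordinate range; the values below assume an /etc/subuid entry of USERNAME:100000:65536:)

$ podman unshare cat /proc/self/uid_map
         0      12345          1
         1     100000      65536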

llchan commented 5 years ago

Did a bit more snooping, looks like the podman log level is not set early enough, so the newuidmap debug output is getting swallowed. I built a binary with that log level bumped up and this is the error that causes the issue:

WARN[0000] error from newuidmap: newuidmap: open of uid_map failed: Permission denied

mheon commented 5 years ago

Permissions issue on the binary?

mheon commented 5 years ago

I'll tag @giuseppe in case it isn't that - he might have some ideas

llchan commented 5 years ago

Binary is readable/executable and runs fine, but it looks like it's owned by a user other than root:root (we deploy packages differently to that host). Is it required for it to be root:root to do its magic?

Also, is there any way to detect that the newuidmap version is too old? I have a colleague who ran into an issue with his PATH, so it was falling back to the system newuidmap, and something more informative than an EPERM would have been nice.

giuseppe commented 5 years ago

Binary is readable/executable and runs fine, but it looks like it's owned by a user other than root:root (we deploy packages differently to that host). Is it required for it to be root:root to do its magic?

Yes, newuidmap/newgidmap must be owned by root, and they must either have fcaps enabled or be installed setuid.
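
(A sketch of setting that up by hand, assuming the binaries live in /usr/bin; the file-capabilities route:)

sudo chown root:root /usr/bin/newuidmap /usr/bin/newgidmap
sudo setcap cap_setuid+ep /usr/bin/newuidmap
sudo setcap cap_setgid+ep /usr/bin/newgidmap

(or, alternatively, the setuid route:)

sudo chmod u+s /usr/bin/newuidmap /usr/bin/newgidmap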

juansuerogit commented 5 years ago

So long story short I need to use RHEL 8?

giuseppe commented 5 years ago

So long story short I need to use RHEL 8?

that will surely help as all the needed pieces are there, including an updated kernel where you can use fuse-overlayfs.

rhatdan commented 5 years ago

getcap /usr/bin/newuidmap
/usr/bin/newuidmap = cap_setuid+ep

If this is not set then this will not work.

juansuerogit commented 5 years ago

Is there a Podman-Compose? And how do I run the same containers/container images iterated on in dev with Podman and Buildah when deploying to Amazon ECS, Azure AKS, or IBM IKS?

giuseppe commented 5 years ago

@juansuerogit you can use podman generate kube and podman play kube
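
(A minimal sketch, assuming an existing container named mycontainer:)

podman generate kube mycontainer > mycontainer.yaml
podman play kube mycontainer.yaml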

juansuerogit commented 5 years ago

So my friends on Macs are dead in the water. I told them to get a Red Hat workstation; they wouldn't listen :( :(

llchan commented 5 years ago

Okay, I've confirmed that the owner of newuidmap must be root for it to work. I'll pass that along to our package deployment people to see if we can find a workaround.

@rhatdan somehow getcap returns nothing even on the host that's working :thinking:

Btw @juansuerogit sorry for sort of hijacking your thread, I thought it was related but maybe not :)

rhatdan commented 5 years ago

@llchan Then it is probably setuid.

@juansuerogit I had a nice demo today of Podman running on a Mac using podman-remote. We should soon have packages available on Brew.

giuseppe commented 5 years ago

@juansuerogit the discussion derailed a bit from the original issue you were having. Have you had any chance to try again? Keep in mind the uidmap for rootless works within the default user namespace we create, so you'll need to pick IDs in the range [0, number of IDs available]. This, for example, should work:

podman run -dt --uidmap 0:100:500 ubuntu sleep 1000

baude commented 5 years ago

@juansuerogit ?

rhatdan commented 5 years ago

@juansuerogit reopen if you do not agree with the fix.

clueo8 commented 5 years ago

I am getting something similar; I have set both /etc/subuid and /etc/subgid:

ERRO[0005] Error while applying layer: ApplyLayer exit status 1 stdout: stderr: there might not be enough IDs available in the namespace (requested 0:42 for /etc/gshadow): lchown /etc/gshadow: invalid argument
ApplyLayer exit status 1 stdout: stderr: there might not be enough IDs available in the namespace (requested 0:42 for /etc/gshadow): lchown /etc/gshadow: invalid argument

$ podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.12.8
  podman version: 1.5.1
host:
  BuildahVersion: 1.10.1
  Conmon:
    package: Unknown
    path: /usr/bin/conmon
    version: 'conmon version 2.0.0, commit: e217fdff82e0b1a6184a28c43043a4065083407f'
  Distribution:
    distribution: arch
    version: unknown
  MemFree: 1130614784
  MemTotal: 16690532352
  OCIRuntime:
    package: Unknown
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc8
      commit: 425e105d5a03fabd737a126ad93d62a9eeede87f
      spec: 1.0.1-dev
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 4
  eventlogger: journald
  hostname: archlinux
  kernel: 4.19.67-1-lts
  os: linux
  rootless: true
  uptime: 99h 5m 43.33s (Approximately 4.12 days)
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  ConfigFile: /home/$user/.config/containers/storage.conf
  ContainerStore:
    number: 0
  GraphDriverName: vfs
  GraphOptions: null
  GraphRoot: /home/$user/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 0
  RunRoot: /run/user/1000
  VolumePath: /home/$user/.local/share/containers/storage/volumes

$ grep $user /etc/sub*
/etc/subgid:$user:1000000:65536
/etc/subuid:$user:1000000:65536

$ cat /etc/sysctl.d/userns.conf
kernel.unprivileged_userns_clone=1

vrothberg commented 5 years ago

@rhatdan @giuseppe can you have a look?

rhatdan commented 5 years ago

What does podman unshare cat /proc/self/uid_map show?

We should add the user namespace information to podman info to grab this information automatically.

clueo8 commented 5 years ago

Opened #3890

rusty-eagle commented 5 years ago

I had a similar problem on openSUSE Tumbleweed. I had run the following commands:

zypper in podman
echo "jon:100000:65536" >> /etc/subuid
echo "jon:100000:65536" >> /etc/subgid

But I was still getting errors like this:

Error: error creating libpod runtime: there might not be enough IDs available in the namespace (requested 100000:100000 for /home/jon/.local/share/containers/storage/overlay/l): chown /home/jon/.local/share/containers/storage/overlay/l: invalid argument

And this:

Error processing tar file(exit status 1): there might not be enough IDs available in the namespace (requested 0:42 for /etc/shadow): lchown /etc/shadow: invalid argument

I could see that newuidmap and newgidmap are installed. The fix for me was this command:

podman system migrate

system info:

Distributor ID: openSUSE
Description:    openSUSE Tumbleweed
Release:        20190824

5.2.9-1-default #1 SMP Fri Aug 16 20:25:11 UTC 2019 (80c0ffe) x86_64 x86_64 x86_64 GNU/Linux

vrothberg commented 5 years ago

Thanks for the info, @rusty-eagle! system migrate is usually needed when upgrading to a new version of Podman. I assume that the package missed adding that to the post-install section (Cc @marcov @flavio @sysrich). @lsm5, @mheon, how's Fedora tackling migration? I couldn't find references in the spec files.

marcov commented 5 years ago

@vrothberg, that's something definitely missing in the "post-install" section. A couple of points:

vrothberg commented 5 years ago

is it OK doing that at every package upgrade?

@mheon and @lsm5 will know.

you may consider adding a hint for the podman system migrate command for libpod runtime failures like those.

We definitely need to improve the error messages, especially for cases where we know how it can be resolved.

marcov commented 5 years ago

@rusty-eagle: I'd like to know what set of commands you ran on Tumbleweed to get:

Error processing tar file(exit status 1): there might not be enough IDs available in the namespace (requested 0:42 for /etc/shadow): lchown /etc/shadow: invalid argument

and @vrothberg: do you know why podman system migrate fixed that?

vrothberg commented 5 years ago

do you know why podman system migrate fixed that?

Not really, no. migrate is currently only focusing on locks, and I don't see how that could impact user-namespace issues.

mheon commented 5 years ago

Migrate is actually unrelated to locks. It does configuration rewrites as necessary to fix database issues from containers created with older versions of Podman. As of now, that only includes one single issue.

However, it does have the side effect of forcing a full restart of all containers including the rootless init process (for safety). That is probably what is fixing the issue.


ankon commented 4 years ago

For the record and future googlers:

I had the same issue (there might not be enough IDs available in the namespace (requested 0:42 for /etc/shadow): lchown /etc/shadow: invalid argument). In my case I had /etc/subuid configured for my user (echo ${LOGNAME}:100000:65536 > /etc/subuid), but had failed to do the same for /etc/subgid. A warning pointing to /etc/subgid was shown on podman build. The problem persisted after fixing that though, and podman unshare cat /proc/self/uid_map showed:

$ podman unshare cat /proc/self/uid_map
         0       1000          1

Unfortunately I couldn't find what it should show, so in a moment of desperation I also executed podman system migrate. That didn't say anything, but afterwards things started to work!

$ podman unshare cat /proc/self/uid_map
         0       1000          1
         1     100000      65536

So, if you can:

  1. Please add a pointer to this somewhere in the documentation, including the expected outputs
  2. Please make podman system migrate say something

djmattyg007 commented 4 years ago

I had the same experience as @ankon on a fresh install of Arch Linux. I'd configured /etc/subuid and /etc/subgid appropriately, but it simply did not work until I ran podman system migrate. I had the same output from podman unshare cat /proc/self/uid_map, and after running the migrate command it magically started working.

bolind commented 4 years ago

I got similar errors, even with correctly configured /etc/subuid and /etc/subgid. It turns out there's a known issue/bug when your home directory is on NFS. Try something like:

mkdir /tmp/foo && podman --root=/tmp/foo --runroot=/tmp/foo run alpine uname -a
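
(To make that permanent, the storage locations can be pointed at local disk via the per-user storage config; the paths below are illustrative:)

# ~/.config/containers/storage.conf
[storage]
driver = "vfs"
graphroot = "/var/tmp/containers/storage"
runroot = "/var/tmp/containers/run"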

rhatdan commented 4 years ago

NFS homedirs are covered in the troubleshooting guide.