Closed mlbiam closed 5 years ago
@giuseppe PTAL, I believe this is related to the buildah-in-a-container blog issue that I ran into over the holiday break.
This particular error is about the /etc/subuid and /etc/subgid ranges not being fully populated, which is going to cause an issue. CentOS 7 does not have a new enough version of shadow-utils to make this work. RHEL 7.7 will have the update in the summer.
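One way to check whether a user has subordinate ID ranges before trying rootless buildah is to look the user up in /etc/subuid and /etc/subgid. The sketch below assumes the subuid(5) format "name:start:count". The helper name subid_ranges is made up for illustration and is not part of buildah or shadow-utils.

```shell
# Hypothetical helper (not part of buildah or shadow-utils).
# Prints "start count" for every range assigned to USER in FILE, where FILE
# uses the subuid(5)/subgid(5) format "name:start:count".
# Returns non-zero if FILE is unreadable or has no entry for USER.
subid_ranges() {
    _user=$1
    _file=$2
    [ -r "$_file" ] || return 1
    awk -F: -v u="$_user" '
        $1 == u { print $2, $3; found = 1 }
        END     { exit !found }
    ' "$_file"
}

# Report on the current user; either outcome is only informational here.
for f in /etc/subuid /etc/subgid; do
    if subid_ranges "$(id -un)" "$f"; then
        echo "$f: ranges found"
    else
        echo "$f: no ranges for $(id -un)"
    fi
done
```

A "no ranges" result for either file corresponds to the "error reading subuid mappings" failures reported in this thread.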
I really think we need to fully document different ways of doing this, with an extensive blog. Perhaps @TomSweeneyRedHat you and I can work on this next week.
@mlbiam Let's examine the use case you are trying to emulate.
First, would you consider doing this without Docker, but with Podman?
Could we attempt to do this with "--isolation chroot"?
We already had a similar issue with Podman:
https://github.com/containers/libpod/issues/1092
If you are controlling the Docker container, though, you can tweak its capabilities and add CAP_SETUID/CAP_SETGID. That should be enough to configure the user namespace with a new enough version of shadow-utils. Older versions required CAP_SYS_ADMIN.
An image based on centos7 won't have shadow-utils. Is it possible to use Fedora?
Otherwise, when you are in a container, you probably don't need to create another container inside, and --isolation chroot should be enough.
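To see whether a container actually received CAP_SETUID and CAP_SETGID, one option is to decode the effective capability mask from /proc/self/status. This is a minimal sketch: the helper has_cap is hypothetical, and the bit numbers (CAP_SETGID = 6, CAP_SETUID = 7, CAP_SYS_ADMIN = 21) are the ones defined in linux/capability.h.

```shell
# Hypothetical helper: succeed if bit number BIT is set in the hex mask MASK
# (MASK is the value of a Cap* line in /proc/PID/status, without "0x").
has_cap() {
    _mask=$1
    _bit=$2
    [ $(( (0x$_mask >> $_bit) & 1 )) -eq 1 ]
}

# Effective capabilities of the current shell (Linux only).
capeff=$(awk '/^CapEff:/ { print $2 }' /proc/self/status 2>/dev/null)

if [ -n "$capeff" ]; then
    for entry in 6:CAP_SETGID 7:CAP_SETUID 21:CAP_SYS_ADMIN; do
        num=${entry%%:*}
        name=${entry#*:}
        if has_cap "$capeff" "$num"; then
            echo "$name: present"
        else
            echo "$name: missing"
        fi
    done
else
    echo "CapEff not readable (not Linux?)"
fi
```

Run inside the container, this shows whether the capabilities discussed above made it through the runtime's configuration.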
My use case is to automate the creation of my RHEL s2i builder image via the same process I automate my other builds. Right now we use the Jenkins that comes with OKD 3.11 and the maven agents for Java with Google's JIB to generate our non centos/rhel images and just a webhook call to dockerhub for our centos s2i image. So my thought was to use buildah from inside an agent to give us the same type of capability.
It turns out I was making my life more difficult than I needed to because I didn't realize I could create a BuildConfig that uses a Docker strategy rather than an s2i strategy. That said, I think this would be a great use case for "upstream" k8s implementations.
@mlbiam could this issue be closed?
I'm seeing a similar issue running buildah 1.7.1 rootless on other distros that don't have the various subuid/subgid files etc.
ERRO[0000] error reading allowed ID mappings: error reading subuid mappings for user "jugs" and subgid mappings for group "jugs": open /etc/subuid: no such file or directory
What's the proposed solution @giuseppe?
RHEL 7 and CentOS 7 will have the appropriate shadow-utils in the RHEL 7.7 release. I think @vbatts has an updated shadow-utils package somewhere for testing.
What's the proposed solution @giuseppe?
There is no solution to that. We need an updated shadow-utils that allows configuring a user namespace with multiple IDs.
Until Shadow is rebased for rhel and centos, you can test with https://copr.fedorainfracloud.org/coprs/vbatts/shadow-utils-newxidmap/
Since so many people are trying this, we need a blog describing how to do it, and then need a page in github explaining it.
A guide on this would be extremely useful.
Hi, I also tried to run buildah inside an unprivileged container, without success so far. A guide would be useful indeed. FWIW, here are my notes:
On a fresh f29 cloud instance:
[fedora@unpriv-buildah ~]$ uname -a
Linux unpriv-buildah.rdocloud 4.20.14-200.fc29.x86_64 #1 SMP Tue Mar 5 19:55:32 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[fedora@unpriv-buildah ~]$ buildah version
Version: 1.7
Go Version: go1.11.5
Image Spec: 1.0.0
Runtime Spec: 1.0.0
CNI Spec: 0.4.0
libcni Version:
Git Commit:
Built: Thu Jan 1 00:00:00 1970
OS/Arch: linux/amd64
[fedora@unpriv-buildah ~]$ podman version
Version: 1.1.2
RemoteAPI Version: 1
Go Version: go1.11.5
Git Commit: a95a49d3038462d033f84ac314ec8a3064a99cff
Built: Tue Mar 5 18:10:31 2019
OS/Arch: linux/amd64
Using this image (built with "buildah bud -t builder ."):
FROM fedora:rawhide
RUN dnf -y install buildah
Results in:
[fedora@unpriv-buildah ~]$ podman run -it builder
[root@27b68ab596e7 /]# buildah from centos
ERRO[0000] 'overlay' is not supported over <unknown> at "/var/lib/containers/storage/overlay"
kernel does not support overlay fs: 'overlay' is not supported over <unknown> at "/var/lib/containers/storage/overlay": backing file system is unsupported for this graph driver
ERRO[0000] exit status 1
After setting the default storage.conf to use fuse-overlayfs:
[root@27b68ab596e7 /]# sed -e 's/#mount_program/mount_program/' -i /etc/containers/storage.conf
[root@27b68ab596e7 /]# buildah from centos
permission denied
ERRO[0000] exit status 1
Ptracing buildah process from outside the container shows that it tries to:
[pid 5998] mount("/var/lib/containers/storage/overlay", "/var/lib/containers/storage/overlay", 0xc00014d398, MS_BIND, NULL <unfinished ...>
[pid 5999] <... nanosleep resumed> NULL) = 0
[pid 5998] <... mount resumed> ) = -1 EACCES (Permission denied)
Disabling SELinux makes the "buildah from" action succeed, however it fails to run:
[root@27b68ab596e7 /]# buildah from centos
Getting image source signatures
Copying blob a02a4930cb5d [======================================] 71.7MiB / 71.7MiB
Copying config 1e1148e4cc [======================================] 2.1KiB / 2.1KiB
Writing manifest to image destination
Storing signatures
centos-working-container
[root@27b68ab596e7 /]# buildah run centos-working-container yum install -y httpd
ERRO[0000] error unmounting /var/lib/containers/storage/overlay/e7fdbc6d613d8861e064ca8114ec934197f41c35b7f6448d2eb824c8f19af09d/merged: invalid argument
error mounting container "2e509e80e516a5fe852e09c6d6ba8a273aa90b250d742b3670cbc15c41baabe9": error mounting build container "2e509e80e516a5fe852e09c6d6ba8a273aa90b250d742b3670cbc15c41baabe9": error creating overlay mount to /var/lib/containers/storage/overlay/e7fdbc6d613d8861e064ca8114ec934197f41c35b7f6448d2eb824c8f19af09d/merged: exit status 1
ERRO[0000] exit status 1
[root@27b68ab596e7 /]# buildah delete centos-working-container
2e509e80e516a5fe852e09c6d6ba8a273aa90b250d742b3670cbc15c41baabe9
Using BUILDAH_ISOLATION=chroot results in the same errors.
Adding --device /dev/fuse doesn't work either:
[fedora@unpriv-buildah ~]$ podman run --device /dev/fuse -it builder
[root@d0c7996d84e8 /]# ls -l /dev
ls: cannot access '/dev/fuse': Permission denied
total 0
crw--w----. 1 root tty 136, 0 Mar 14 04:56 console
lrwxrwxrwx. 1 root root 11 Mar 14 04:55 core -> /proc/kcore
lrwxrwxrwx. 1 root root 13 Mar 14 04:55 fd -> /proc/self/fd
crw-rw-rw-. 1 nobody nobody 1, 7 Mar 14 04:37 full
-?????????? ? ? ? ? ? fuse
Using --volume /dev/fuse:/dev/fuse instead still results in an "error mounting container".
I also tried to use a non-root image (built with "buildah bud -t unpriv-builder ."):
FROM builder
RUN useradd -m fedora
USER fedora
Which resulted in (with and without SELinux enforcing):
[fedora@unpriv-buildah ~]$ podman run -it unpriv-builder
[fedora@692ecf8aa01f /]$ buildah from centos
Error: error running newgidmap: exit status 1: newgidmap: write to gid_map failed: Operation not permitted
ERRO[0000] exit status 1
Ptracing buildah shows:
...
[pid 13101] execve("/usr/bin/newgidmap", ["newgidmap", "40", "0", "1000", "1", "1", "100000", "65536"], 0xc000532d80 /* 15 vars */ <unfinished ...>
...
[pid 13101] openat(3</proc/40>, "gid_map", O_WRONLY) = 5</proc/40/gid_map>
[pid 13101] write(5</proc/40/gid_map>, "0 1000 1\n1 100000 65536\n", 24) = -1 EPERM (Operation not permitted)
Which seems to be expected according to user_namespaces(7): "The data written to uid_map (gid_map) must consist of a single line"
Adding --cap-add SETUID --cap-add SETGID and updating the image to do RUN chmod 4755 /usr/bin/newuidmap /usr/bin/newgidmap doesn't fix the "newgidmap: write to gid_map failed" error.
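For reference, the arguments in the execve line above are the target PID followed by triples of (first ID inside the namespace, first ID outside, count). The sketch below rebuilds the text that such a call tries to write and counts its lines. The helpers idmap_text and idmap_extents are hypothetical names used only for illustration. As far as user_namespaces(7) describes it, the single-line restriction applies when the writer lacks CAP_SETGID (or CAP_SETUID for uid_map) in the parent user namespace, so a two-line mapping like this one needs the privileged path.

```shell
# Hypothetical helpers, for illustration only.
# idmap_text: turn newuidmap/newgidmap-style triples
#   (inside-id outside-id count ...) into uid_map/gid_map text,
#   one extent per line. The leading PID argument is NOT included here.
idmap_text() {
    while [ $# -ge 3 ]; do
        printf '%s %s %s\n' "$1" "$2" "$3"
        shift 3
    done
}

# idmap_extents: number of lines (extents) the mapping would contain.
idmap_extents() {
    idmap_text "$@" | wc -l | tr -d ' '
}

# Triples taken from the traced call: newgidmap 40 0 1000 1 1 100000 65536
idmap_text 0 1000 1 1 100000 65536
echo "extents: $(idmap_extents 0 1000 1 1 100000 65536)"
```

A mapping with more than one extent cannot be written by an unprivileged process on its own, which matches the EPERM in the trace.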
Then using latest master version on both the host and the image resulted in the same error:
[fedora@unpriv-buildah bin]$ ./buildah version
Version: 1.8-dev
Go Version: go1.11.5
Image Spec: 1.0.0
Runtime Spec: 1.0.0
CNI Spec: 0.4.0
libcni Version: v0.7.0-alpha1
Git Commit: 3b497ff1
Built: Thu Mar 14 05:18:16 2019
OS/Arch: linux/amd64
[fedora@unpriv-buildah bin]$ ./podman version
Version: 1.2.0-dev
RemoteAPI Version: 1
Go Version: go1.11.5
Git Commit: 7426d4fbbeaf5ebd3d55576add89b99cd3f3f760
Built: Thu Mar 14 05:19:30 2019
OS/Arch: linux/amd64
FWIW, I'll try to get some notes together for you, @TristanCacqueray, which will get you closer, but unfortunately not to the finish line. @rhatdan and I looked into this a bit this morning. More work is to come, as we had trouble with the volumes too that we couldn't get around.
I have a blog post in progress showing how it works while running as root. Once we figure out the rootless end, we'll publish one for that too.
Well I think it works well except when launched in a User Namespace.
On Fedora 29 I am able to run buildah as an unprivileged user in a container that is started as root with the --privileged flag, but with the --user flag set:
sudo podman run -it --privileged --user 1000 buildahimage
This then allows me to run buildah within the image as an unprivileged user.
While this works on F29, it's not working on EL7 with the latest buildah/podman and @vbatts's uid rpms.
I think in RHEL7 you have to turn on the User Namespace via a sysctl as well. If you have done this, what other issues are you seeing?
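A quick check for this can be sketched as follows. It assumes the sysctl being referred to is user.max_user_namespaces, exposed as /proc/sys/user/max_user_namespaces, which the comment above does not name explicitly. The helper userns_allowed is hypothetical.

```shell
# Hypothetical helper: succeed if FILE (normally
# /proc/sys/user/max_user_namespaces) holds a number greater than zero.
userns_allowed() {
    _limit=$(cat "$1" 2>/dev/null) || return 1
    [ "${_limit:-0}" -gt 0 ] 2>/dev/null
}

if userns_allowed /proc/sys/user/max_user_namespaces; then
    echo "user namespaces: enabled"
else
    echo "user namespaces: disabled or not supported by this kernel"
fi
```

If the limit is 0, a root user on the host would have to raise it with sysctl before rootless buildah can create a user namespace.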
You can now run quay.io/buildah/stable within a locked-down container.
podman run --device /dev/fuse ...
@rhatdan I must be missing something (or more likely have a knowledge gap). I can't seem to get the buildah image you mentioned to work as I expected.
Some context: I have podman working for my CICD use-case via Jenkins and K8s. I am mounting a host path to /var/lib/containers in the container and setting securityContext.privileged to true as many have suggested in past GH issues. That is all great; however, I am looking to switch to unprivileged CICD builds due to security policies where I work.
The output below is from within the buildah image. For reference, I ran kubectl exec -it buildah -- /bin/bash to enter the container. Lastly, I noticed that buildah version is 1.8.2, which doesn't match the image tag. Thanks for your help!
[root@buildah ~]# ls
anaconda-ks.cfg anaconda-post.log Dockerfile original-ks.cfg
[root@buildah ~]# cat Dockerfile
FROM alpine
RUN date
[root@buildah ~]# buildah version
Version: 1.8.2
Go Version: go1.11.7
Image Spec: 1.0.0
Runtime Spec: 1.0.0
CNI Spec: 0.4.0
libcni Version:
Git Commit:
Built: Thu Jan 1 00:00:00 1970
OS/Arch: linux/amd64
[root@buildah ~]# buildah bud -t testing:latest .
permission denied
ERRO[0000] exit status 1
Documents/Projects/cluster1
➜ cat buildah.yaml
apiVersion: v1
kind: Pod
metadata:
  name: buildah
  namespace: default
spec:
  containers:
    - name: podman
      image: quay.io/buildah/stable:v1.8.3
      command:
        - cat
      tty: true
You have to add the /dev/fuse device to make buildah run with fuse-overlayfs.
Is there a way in the YAML file to add a device? If not, you could add the device via cri-o.conf on the host.
@rhatdan I should have been more clear: one of the security policies in place on the clusters at work prevents the use of hostPath volumes. Even if I can have that policy modified for a particular namespace or cluster, I am unsure if /dev/fuse will exist on the hosts/nodes we use for k8s (present and future). I'm assuming the vfs storage driver isn't worth my time?
Is there a way in the Yaml file to add a device?
Example buildah.yaml with hostPath for fuse-overlayfs:
apiVersion: v1
kind: Pod
metadata:
  name: buildah
  namespace: default
spec:
  containers:
    - name: podman
      image: quay.io/buildah/stable:v1.8.3
      command:
        - cat
      tty: true
      volumeMounts:
        - name: fuse
          mountPath: /dev/fuse
  volumes:
    - name: fuse
      hostPath:
        path: /dev/fuse
If not you could add the device via cri-o.conf on the host.
I'm not a cluster admin, so I don't have access to make changes via the kube-apiserver or directly on the hosts.
I would prefer not to volume mount in /dev/fuse, but actually inject it into the container, similar to a
podman run --device=/dev/fuse
Volume mounting it in might work, but SELinux policy might prevent the use of the device.
@rhatdan I looked into this a bit more and I am still confused. While looking over the docs for buildahimage I noticed that in addition to the --device argument that you mentioned, the sample usage also shows a volume being mounted (-v /var/lib/mycontainer) and a security setting being tweaked (seccomp=unconfined). Both of these options tend to be considered no good when running containers on k8s.
While I don't expect an explanation that will resolve my confusion (you have already been very helpful in answering my dumb questions), I am curious what examples you were referring to here: https://github.com/containers/buildah/issues/518#issuecomment-410058670
Thanks.
I don't believe seccomp=unconfined is required any longer; we have fixed up the seccomp rules and buildah to allow buildah to run within a container.
The volume mount of /var/lib/containers is all about overlayfs on top of overlayfs not being supported in the kernel. So running podman/crio on an overlayfs graph driver means that buildah cannot use overlayfs on /var/lib/containers within the container, unless it is volume mounted in from a non-overlayfs file system.
BTW, I did find a bug in buildah that is causing increased SELinux privs.
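Whether /var/lib/containers already sits on overlayfs can be checked from inside the container. The sketch below assumes GNU stat, whose "stat -f -c %T" prints the file system type name. The helper is_overlay_fs is hypothetical and only classifies that name.

```shell
# Hypothetical helper: succeed if the given file system type name
# (as printed by GNU "stat -f -c %T") denotes overlayfs.
is_overlay_fs() {
    case $1 in
        overlayfs|overlay) return 0 ;;
        *)                 return 1 ;;
    esac
}

dir=/var/lib/containers
# Fall back to "unknown" if the directory is missing or stat is not GNU stat.
fstype=$(stat -f -c %T "$dir" 2>/dev/null) || fstype=unknown

if is_overlay_fs "$fstype"; then
    echo "$dir is on overlayfs: the overlay driver cannot be stacked on it"
else
    echo "$dir is on $fstype"
fi
```

A non-overlay result does not guarantee that the overlay driver will work, since other backing file systems can be rejected too, but an overlayfs result explains the failure described above.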
Has anyone been able to run Buildah as unprivileged on Kubernetes/OpenShift yet? I tried the quay.io/buildah/stable image with the following deployment, where I even granted it all of the Linux capabilities:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: buildah
  labels:
    app: buildah
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: buildah
    spec:
      volumes:
        - name: volume
      containers:
        - name: buildahtest
          image: quay.io/buildah/stable
          command: ["tail", "-f", "/dev/null"]
          imagePullPolicy: "Always"
          securityContext:
            capabilities:
              add:
                - ALL
          volumeMounts:
            - mountPath: /var/lib/containers
              name: volume
But I get
[root@buildah-release-7b9d7fc877-8z2vv goproj]# buildah bud -t goproj .
ERRO[0000] 'overlay' is not supported over xfs at "/var/lib/containers/storage/overlay"
kernel does not support overlay fs: 'overlay' is not supported over xfs at "/var/lib/containers/storage/overlay": backing file system is unsupported for this graph driver
I made progress with --storage-driver vfs, but it failed on a later step:
[root@buildah-release-7b9d7fc877-8z2vv goproj]# buildah bud -t goproj --storage-driver vfs .
STEP 1: FROM golang:latest
Getting image source signatures
Copying blob 6f2f362378c5 done
Copying blob 0658c6765517 done
Copying blob 372744b62d49 done
Copying blob fc2529ce2b56 done
Copying blob 494c27a8a6b8 done
Copying blob 7596bb83081b done
Copying blob c9a1ca7e4a49 done
Copying config 9fe4cdc1f1 done
Writing manifest to image destination
Storing signatures
STEP 2: RUN mkdir /app
container_linux.go:336: starting container process caused "process_linux.go:293: applying cgroup configuration for process caused \"mkdir /sys/fs/cgroup/cpuset/buildah-buildah405416464: read-only file system\""
error running container: error creating container for [/bin/sh -c mkdir /app]: : exit status 1
error building at STEP "RUN mkdir /app": error while running runtime: exit status 1
The other issue, it seems, is that there's no equivalent of the --device flag in Kube currently, so the /dev/fuse device cannot be passed in, as was suggested.
I should note that if I set privileged: true in the deployment, the errors go away and buildah works fine, but that's exactly what I'm trying to avoid.
Hi all,
I saw this issue is closed without any solution.
I also have this problem now. Is there any solution or workaround to solve it?
Thanks a lot!
@zhangtbj does https://github.com/containers/buildah/issues/1335#issuecomment-463627933 work for you?
Hi @giuseppe ,
No, it doesn't work.... :(
I removed privileged from my pod security context and added --isolation=chroot to the buildah bud command.
But it still reports an error:
{"level":"info","ts":1585649915.6939785,"caller":"git/git.go:133","msg":"Successfully initialized and updated submodules in path /workspace/source"}
Error during unshare(CLONE_NEWUSER): Operation not permitted
level=error msg="error parsing PID \"\": strconv.Atoi: parsing \"\": invalid syntax"
level=error msg="(unable to determine exit status)"
2020/03/31 10:18:37 Skipping step because a previous step failed
2020/03/31 10:18:38 Skipping step because a previous step failed
{"level":"info","ts":1585649901.456626,"caller":"creds-init/main.go:44","msg":"Credentials initialized."}
I also found this unshare(CLONE_NEWUSER): Operation not permitted error on Google, but I didn't see any solution for it.
Do you have any idea or workaround for it?
Thanks a lot!
The unshare is being blocked by seccomp.json. Are you launching from Docker or Podman? Docker's seccomp.json blocks some syscalls that we allow by default.
You could try with docker run --security-opt seccomp=/usr/share/containers/seccomp.json ... Or just use podman...
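Whether a seccomp profile is in effect can be confirmed from inside the container by reading the Seccomp field of /proc/self/status, where 0 means disabled, 1 strict mode and 2 filter mode. The sketch below only reports the mode and cannot tell which profile was loaded. The helper seccomp_mode is hypothetical.

```shell
# Hypothetical helper: print the seccomp mode recorded in a
# /proc/PID/status style file; non-zero exit if the field is missing.
seccomp_mode() {
    awk '/^Seccomp:/ { print $2; found = 1 } END { exit !found }' "$1" 2>/dev/null
}

case $(seccomp_mode /proc/self/status) in
    0) echo "seccomp: disabled" ;;
    1) echo "seccomp: strict mode" ;;
    2) echo "seccomp: filter mode (a profile is applied)" ;;
    *) echo "seccomp: unknown" ;;
esac
```

Filter mode combined with the "unshare(CLONE_NEWUSER): Operation not permitted" error is consistent with the profile blocking the call, though other restrictions can produce the same message.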
Hi @rhatdan ,
I am running on Kubernetes, not on standalone Docker or Podman, and we cannot configure seccomp ourselves.... :(
Well unless you customize the yaml or control the environment, there is not much we can do.
I have similar issues on OpenShift 4.3 where I want to run Buildah in an unprivileged container (directly, not via S2I). I mounted /var/lib/containers on an emptyDir volume and then called
buildah bud --storage-driver vfs --isolation chroot --format=oci --tls-verify=true --layers -f Dockerfile -t "index.docker.io/$DOCKER_USER/buildah-test" .
within a Pod using quay.io/buildah/stable:v1.11.0 as image. I also set $HOME to a writable directory below /tmp but then got this error:
ERRO Error while applying layer: ApplyLayer exit status 1 stdout: stderr: there might not be enough IDs available in the namespace (requested 0:42 for /etc/shadow): lchown /etc/shadow: invalid argument
When playing around with --userns-uid-map and --userns-gid-map it turns into this error:
STEP 1: FROM alpine
Getting image source signatures
Copying blob aad63a933944 done
Copying config a187dde48c done
Writing manifest to image destination
Storing signatures
ERRO Error while applying layer: ApplyLayer exit status 1 stdout: stderr: Container ID 0 cannot be mapped to a host ID
Is there any way to run buildah within a Pod on OpenShift 4.3 under an arbitrary non-root UID ?
I followed https://docs.openshift.com/container-platform/4.3/builds/custom-builds-buildah.html but this documentation does not tell us how to use buildah without a custom S2I buildconfig.
Is there any way to run buildah within a Pod on OpenShift 4.3 under an arbitrary non-root UID ?
You need to add CAP_SETUID and CAP_SETGID to the container.
Also make sure newuidmap/newgidmap work inside the container; otherwise only one user can be mapped inside the new user namespace that buildah creates.
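The effect of a single-ID mapping can be illustrated by testing whether a given ID is covered by a uid_map or gid_map file. The sketch below assumes the /proc/PID/uid_map format of "first-inside first-outside count" per line. The helper id_is_mapped is hypothetical, and the IDs 0 and 42 are the ones from the "requested 0:42 for /etc/shadow" error earlier in the thread.

```shell
# Hypothetical helper: succeed if ID falls inside one of the extents in
# MAPFILE (format per line: <first id inside> <first id outside> <count>).
id_is_mapped() {
    awk -v id="$1" '
        id >= $1 && id < $1 + $3 { ok = 1 }
        END { exit !ok }
    ' "$2" 2>/dev/null
}

# Check the IDs from the error message against the current namespace.
for pair in 0:/proc/self/uid_map 42:/proc/self/gid_map; do
    want=${pair%%:*}
    map=${pair#*:}
    if id_is_mapped "$want" "$map"; then
        echo "$map: ID $want is mapped"
    else
        echo "$map: ID $want is NOT mapped"
    fi
done
```

Run inside the namespace that buildah sets up (for example under buildah unshare), an unmapped GID 42 would explain the lchown failure on /etc/shadow.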
you need to add CAP_SETUID and CAP_SETGID to the container.
I guess this is nothing a regular (non-admin) OpenShift user can do. But maybe a subset of instructions could work ootb without having to change the setup? (e.g. everything except RUN, no COPY --chown ... etc.)
Any package install is going to require privilege if that package contains more than 1 UID.
RUN dnf -y install foobar
Hi,
I added CAP_SETUID and CAP_SETGID to the container, without privileged permission, and ran as root. But it still reports an error and cannot execute the build .... :(
I documented the error and my steps in the issue: https://github.com/containers/buildah/issues/2262
Any package install is going to require privilege if that package contains more than 1 UID.
RUN dnf -y install foobar
I see, but what about builds that do not require any other UID than the current one ? I think about building applications on top of a base image that already includes all tools (like npm for node or Maven for Java). Could this work ootb ?
It's a severe restriction, true, but for an S2I-like system that would be very useful.
If the image has more than one UID, no. When the image is pulled by Buildah within the container, it will attempt to create UIDs other than "root", which will not be allowed without a user namespace.
Description
Trying to create a jenkins-agent using buildah to create containers in an unprivileged container. When I run
buildah bud -t local/ous2i .
I get the error:
ERRO[0000] error reading allowed ID mappings: error reading subuid mappings for user "default" and subgid mappings for group "default": No subuid ranges found for user "default" in /etc/subuid
Here's my container's Dockerfile:
Steps to reproduce the issue:
docker build --no-cache --tag local/jb .
docker run -ti --name jb local/jb bash
git clone https://github.com/TremoloSecurity/OpenUnisonS2IDocker.git
cd OpenUnisonS2IDocker/
buildah bud -t local/ous2i .
Describe the results you received:
ERRO[0000] error reading allowed ID mappings: error reading subuid mappings for user "default" and subgid mappings for group "default": No subuid ranges found for user "default" in /etc/subuid
Describe the results you expected:
a built container image
Output of rpm -q buildah or apt list buildah:
Output of buildah version:
Output of podman version if reporting a podman build issue:
Output of cat /etc/release:
Output of uname -a:
Output of cat /etc/containers/storage.conf: