containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

"error: Error -1 running transaction" while running Dockerfile as unprivileged user #9733

Closed ffromani closed 3 years ago

ffromani commented 3 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

I routinely build container images for my operator (https://github.com/openshift-kni/performance-addon-operators/) as an unprivileged user on my Fedora 33 machine (stock podman/containers packages). One of these containers (must-gather) installs RPMs into a UBI (https://www.redhat.com/en/blog/introducing-red-hat-universal-base-image) base image. Lately the build fails with:

podman build --no-cache -f openshift-ci/Dockerfile.must-gather -t """quay.io"/"openshift-kni"/"performance-addon-operator-must-gather":"4.8-snapshot""" --build-arg BIN_DIR="build/_output/bin/" .
STEP 1: FROM quay.io/openshift/origin-must-gather:4.7.0 AS builder
--> e9ca204d2bb
STEP 2: FROM registry.access.redhat.com/ubi8/ubi-minimal:latest
STEP 3: RUN microdnf install -y shadow-utils

(microdnf:1): librhsm-WARNING **: 08:13:36.435: Found 0 entitlement certificates

(microdnf:1): librhsm-WARNING **: 08:13:36.436: Found 0 entitlement certificates
Downloading metadata...
Downloading metadata...
Downloading metadata...
Package                           Repository       Size
Installing:                                            
 libsemanage-2.9-3.el8.x86_64     ubi-8-baseos 168.6 kB
 shadow-utils-2:4.6-11.el8.x86_64 ubi-8-baseos   1.3 MB
Transaction Summary:
 Installing:        2 packages
 Reinstalling:      0 packages
 Upgrading:         0 packages
 Removing:          0 packages
 Downgrading:       0 packages
Downloading packages...
Running transaction test...
Installing: libsemanage;2.9-3.el8;x86_64;ubi-8-baseos
Installing: shadow-utils;2:4.6-11.el8;x86_64;ubi-8-baseos
error: Error -1 running transaction
Error: error building at STEP "RUN microdnf install -y shadow-utils": error while running runtime: exit status 1

I can't pinpoint the failure to a specific update. The Dockerfile has not changed:

FROM quay.io/openshift/origin-must-gather:4.7.0 AS builder

FROM registry.access.redhat.com/ubi8/ubi-minimal:latest
RUN microdnf install -y shadow-utils
RUN microdnf install -y pciutils util-linux hostname rsync tar

# Copy must-gather required binaries
COPY --from=builder /usr/bin/openshift-must-gather /usr/bin/openshift-must-gather
COPY --from=builder /usr/bin/oc /usr/bin/oc

# Save original gather script
COPY --from=builder /usr/bin/gather* /usr/bin/
RUN mv /usr/bin/gather /usr/bin/gather_original

ARG BIN_DIR=
ARG COLLECTION_SCRIPTS_DIR=must-gather/collection-scripts
ARG NODE_GATHER_MANIFESTS_DIR=must-gather/node-gather

COPY ${COLLECTION_SCRIPTS_DIR}/* /usr/bin/
COPY ${NODE_GATHER_MANIFESTS_DIR} /etc/node-gather
# rename to be consistent with all other must-gather helper
COPY ${BIN_DIR}gather-sysinfo /usr/bin/gather_sysinfo

ENTRYPOINT /usr/bin/gather

The user config has not changed either.

Steps to reproduce the issue:

  1. Run the following on an up-to-date Fedora 33 with all the container stack packages from the official repos
  2. Fetch a copy of https://github.com/openshift-kni/performance-addon-operators
  3. [OPTIONALLY] checkout this PR to further narrow down the problem https://github.com/openshift-kni/performance-addon-operators/pull/582
  4. As an unprivileged user, run make && make must-gather-container

Describe the results you received: the container build fails as shown above

Describe the results you expected: the container build succeeds, or at the very least the error message helps me understand the actual issue.

Additional information you deem important (e.g. issue happens only occasionally):

  1. Disabling SELinux (setenforce 0) has NO effect (the build still fails).
  2. Running as root works as expected!
  3. If I run podman run -ti registry.access.redhat.com/ubi8/ubi-minimal:latest -- /bin/bash and then microdnf install -y shadow-utils inside it, everything works as expected.

Output of podman version:

host:
  arch: amd64
  buildahVersion: 1.19.4
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.26-1.fc33.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.26, commit: 777074ecdb5e883b9bec233f3630c5e7fa37d521'
  cpus: 4
  distribution:
    distribution: fedora
    version: "33"
  eventLogger: journald
  hostname: musashi2.rokugan.lan
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.10.22-200.fc33.x86_64
  linkmode: dynamic
  memFree: 3838980096
  memTotal: 16649846784
  ociRuntime:
    name: crun
    package: crun-0.18-1.fc33.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.18
      commit: 808420efe3dc2b44d6db9f1a3fac8361dde42a95
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    selinuxEnabled: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.9-1.fc33.x86_64
    version: |-
      slirp4netns version 1.1.9
      commit: 4e37ea557562e0d7a64dc636eff156f64927335e
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 21473779712
  swapTotal: 21474828288
  uptime: 2h 1m 58.31s (Approximately 0.08 days)
registries:
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888:
    Blocked: false
    Insecure: true
    Location: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888
  musashi2.rokugan.lan:5000:
    Blocked: false
    Insecure: true
    Location: musashi2.rokugan.lan:5000
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: musashi2.rokugan.lan:5000
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  configFile: /home/fromani/.config/containers/storage.conf
  containerStore:
    number: 5
    paused: 0
    running: 0
    stopped: 5
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.4.0-1.fc33.x86_64
      Version: |-
        fusermount3 version: 3.9.3
        fuse-overlayfs: version 1.4
        FUSE library version 3.9.3
        using FUSE kernel interface version 7.31
  graphRoot: /home/fromani/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 41
  runRoot: /run/user/1000
  volumePath: /home/fromani/.local/share/containers/storage/volumes
version:
  APIVersion: 3.0.0
  Built: 1613753777
  BuiltTime: Fri Feb 19 17:56:17 2021
  GitCommit: ""
  GoVersion: go1.15.8
  OsArch: linux/amd64
  Version: 3.0.1

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.19.4
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.26-1.fc33.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.26, commit: 777074ecdb5e883b9bec233f3630c5e7fa37d521'
  cpus: 4
  distribution:
    distribution: fedora
    version: "33"
  eventLogger: journald
  hostname: musashi2.rokugan.lan
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.10.22-200.fc33.x86_64
  linkmode: dynamic
  memFree: 3819450368
  memTotal: 16649846784
  ociRuntime:
    name: crun
    package: crun-0.18-1.fc33.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.18
      commit: 808420efe3dc2b44d6db9f1a3fac8361dde42a95
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    selinuxEnabled: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.9-1.fc33.x86_64
    version: |-
      slirp4netns version 1.1.9
      commit: 4e37ea557562e0d7a64dc636eff156f64927335e
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 21473779712
  swapTotal: 21474828288
  uptime: 2h 2m 19.55s (Approximately 0.08 days)
registries:
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888:
    Blocked: false
    Insecure: true
    Location: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888
  musashi2.rokugan.lan:5000:
    Blocked: false
    Insecure: true
    Location: musashi2.rokugan.lan:5000
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: musashi2.rokugan.lan:5000
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  configFile: /home/fromani/.config/containers/storage.conf
  containerStore:
    number: 5
    paused: 0
    running: 0
    stopped: 5
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.4.0-1.fc33.x86_64
      Version: |-
        fusermount3 version: 3.9.3
        fuse-overlayfs: version 1.4
        FUSE library version 3.9.3
        using FUSE kernel interface version 7.31
  graphRoot: /home/fromani/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 41
  runRoot: /run/user/1000
  volumePath: /home/fromani/.local/share/containers/storage/volumes
version:
  APIVersion: 3.0.0
  Built: 1613753777
  BuiltTime: Fri Feb 19 17:56:17 2021
  GitCommit: ""
  GoVersion: go1.15.8
  OsArch: linux/amd64
  Version: 3.0.1

Package info (e.g. output of rpm -q podman or apt list podman):

$ rpm -qa | grep -E 'podman|buildah|crun|runc|fuse'
fuse3-3.9.4-1.fc33.x86_64
fuse3-libs-3.9.4-1.fc33.x86_64
fuse-common-3.9.4-1.fc33.x86_64
fuse-devel-2.9.9-10.fc33.x86_64
fuse-2.9.9-10.fc33.x86_64
fuse-libs-2.9.9-10.fc33.x86_64
gvfs-fuse-1.46.2-1.fc33.x86_64
fuse-overlayfs-1.4.0-1.fc33.x86_64
podman-plugins-3.0.1-1.fc33.x86_64
podman-3.0.1-1.fc33.x86_64
podman-remote-3.0.1-1.fc33.x86_64
glusterfs-fuse-8.4-1.fc33.x86_64
crun-0.18-1.fc33.x86_64
buildah-1.19.6-2.fc33.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Latest version: tested with the latest packaged version.
Troubleshooting guide: yes. Not sure if uidmap is related; I didn't change a thing. podman system migrate didn't help.

Additional environment details (AWS, VirtualBox, physical, etc.): this happens on an up-to-date Fedora 33

giuseppe commented 3 years ago

shadow-utils uses xattrs; I think this could be a kernel regression.

Could you try the reproducer here: https://github.com/containers/buildah/issues/3071#issuecomment-796070354 ?

ffromani commented 3 years ago

> shadow-utils uses xattrs; I think this could be a kernel regression.
>
> Could you try the reproducer here: containers/buildah#3071 (comment) ?

$ uname -a
Linux musashi2.rokugan.lan 5.10.22-200.fc33.x86_64 #1 SMP Tue Mar 9 22:05:08 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ unshare -r unshare -r sh -c 'touch /tmp/setxattr-test; setcap "cap_setuid=ep" /tmp/setxattr-test' && echo ok
Failed to set capabilities on file `/tmp/setxattr-test' (Invalid argument)
usage: setcap [-h] [-q] [-v] [-n <rootid>] (-r|-|<caps>) <filename> [ ... (-r|-|<capsN>) <filenameN> ]

 Note <filename> must be a regular (non-symlink) file.
 -r          remove capability from file
 -           read capability text from stdin
 <capsN>     cap_from_text(3) formatted file capability

 -h          this message and exit status 0
 -q          quietly
 -v          validate supplied capability matches file
 -n <rootid> write a user namespace limited capability
 --license   display the license info

So IIUC it fails the same way as on F34, so it does seem to be a regression, or at least a breaking change. Not sure if it's relevant: both my /home and /tmp filesystems are ext4.
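For anyone who wants to rerun the check above without reading the setcap usage text, here is a minimal sketch that wraps the same one-liner. The function name check_nested_setcap and the "skipped" result are made up for illustration. It assumes unshare (util-linux) and setcap (libcap) are on PATH, and it uses a temporary directory instead of a fixed /tmp path.

```shell
#!/bin/sh
# Hypothetical helper: runs the nested-user-namespace setcap reproducer and
# prints "ok", "failed", or "skipped" (when the check cannot be run at all).
check_nested_setcap() {
  for tool in unshare setcap mktemp; do
    command -v "$tool" >/dev/null 2>&1 || { echo skipped; return 0; }
  done

  # Nested user namespaces may be unavailable (for example inside a sandbox).
  unshare -r unshare -r true >/dev/null 2>&1 || { echo skipped; return 0; }

  probe_dir=$(mktemp -d) || { echo skipped; return 0; }

  # Same steps as the reproducer: create a file as "root" inside two nested
  # user namespaces, then try to write a file capability (an xattr) on it.
  if unshare -r unshare -r sh -c 'touch "$1" && setcap "cap_setuid=ep" "$1"' \
      sh "$probe_dir/setxattr-test" >/dev/null 2>&1; then
    result=ok
  else
    result=failed
  fi

  rm -rf "$probe_dir"
  echo "$result"
}

check_nested_setcap
```

A "failed" result matches the "Invalid argument" output shown above. "ok" means the kernel accepted the capability xattr, and "skipped" means the tools or nested namespaces were not available.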

giuseppe commented 3 years ago

yes, it is a regression in the kernel and we cannot do anything about it.

The only workaround I am aware of is to specify a different mapping:

podman build --userns-uid-map 0:1:65535 --userns-gid-map 0:1:65535 ....

Does it work for you?

I am closing the issue because it is not a bug in Podman/Buildah that we can address, but we can discuss the problem further here
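To make the effect of that mapping concrete, here is a small sketch that resolves a container ID to a host ID by chaining two mappings. The triples are read as container_id:host_id:size, the same fields that appear under idMappings in the podman info output earlier in this issue. It assumes that, in rootless mode, the "host" side of --userns-uid-map refers to IDs inside the rootless user namespace rather than to real host IDs. The function names are hypothetical.

```shell
#!/bin/sh
# Translate one ID through a single "container_id:host_id:size" triple.
# Prints the mapped ID, or returns 1 when the ID is outside the range.
map_id() {
  mi_id=$1
  mi_inside=${2%%:*}
  mi_rest=${2#*:}
  mi_outside=${mi_rest%%:*}
  mi_size=${mi_rest#*:}
  if [ "$mi_id" -ge "$mi_inside" ] && [ "$mi_id" -lt $((mi_inside + mi_size)) ]; then
    echo $((mi_outside + mi_id - mi_inside))
  else
    return 1
  fi
}

# Translate one ID through a list of triples; the first matching range wins.
map_through() {
  mt_id=$1
  shift
  for mt_triple in "$@"; do
    map_id "$mt_id" "$mt_triple" && return 0
  done
  return 1
}

# Rootless mapping reported by `podman info` in this issue.
through_rootless_map() { map_through "$1" "0:1000:1" "1:100000:65536"; }

# Workaround mapping from the comment above (ASSUMPTION: relative to the
# rootless user namespace, not to real host IDs).
through_workaround_map() { map_through "$1" "0:1:65535"; }

# Chain both mappings: build container -> rootless namespace -> host.
resolve_host_id() {
  rh_mid=$(through_workaround_map "$1") || return 1
  through_rootless_map "$rh_mid"
}

echo "default rootless build: container 0 -> host $(through_rootless_map 0)"
echo "with the workaround:    container 0 -> host $(resolve_host_id 0)"
```

Under these assumptions, root in the build container stops being the invoking user (host ID 1000) and becomes the first subordinate ID (host ID 100000). The size is 65535 rather than 65536 because shifting the range by one leaves one fewer ID to hand out.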

ffromani commented 3 years ago

thanks @giuseppe , makes sense and I agree to close this issue.

TomSweeneyRedHat commented 3 years ago

@fatherlinux FYI issue running ubi image

MarSik commented 3 years ago

If this is a kernel regression, do you maybe have a link to a proper Fedora bug? I would prefer if this blocked the F34 release :)

giuseppe commented 3 years ago

> If this is a kernel regression, do you maybe have a link to a proper Fedora bug? I would prefer if this blocked the F34 release :)

we only had a discussion here: https://github.com/containers/buildah/issues/3071

I am not aware of any bugzilla to track it