containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Intermittent build error when copying files from another image: permission denied #18093

Open akanix42 opened 1 year ago

akanix42 commented 1 year ago

Issue Description

My multistage build Dockerfile has several statements that copy files in from other images, using rsync to merge them into existing subdirectories. Here's an example:

RUN --mount=type=bind,target=/tmp/mysql,from=mysql-builder,source=/tmp/mysql/install \
  --mount=type=cache,target=/var/cache/apk \
  apk add --update rsync \
  && rsync -avh /tmp/mysql/ /\
  && apk del rsync

This worked great with Docker, but with Podman these statements fail most of the time with the following error:

[16/18] STEP 4/18: RUN --mount=type=bind,target=/tmp/mysql,from=mysql-builder,source=/tmp/mysql/install   --mount=type=cache,target=/var/cache/apk   apk add --update rsync   && rsync -avh /tmp/mysql/ /  && apk del rsync
sending incremental file list
rsync: [sender] change_dir "/tmp/mysql" failed: Permission denied (13)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1336) [sender=3.2.7]

This has been plaguing me for weeks. I just got the idea to try building with --jobs 1 instead of --jobs 0, which seems to have resolved it for now, but of course I'd prefer to have parallel builds. With --jobs 0, I have to keep rerunning the build in a loop until it eventually gets lucky and succeeds, so it seems to be some sort of race condition.
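
The two workarounds mentioned above (rerunning the build until it succeeds, or serializing the stages with --jobs 1) could be sketched roughly as follows. This is a minimal sketch, not the exact script used here: `retry_build` is a hypothetical helper name and the image tag `myimage` is a placeholder.

```shell
#!/usr/bin/env bash
# Retry a command until it succeeds or the attempt limit is reached.
# retry_build is a hypothetical helper, not part of Podman.
retry_build() {
  local max_attempts="$1"
  shift
  local attempt=1
  while true; do
    # Run the given command; stop as soon as it exits with status 0.
    if "$@"; then
      echo "succeeded on attempt ${attempt}"
      return 0
    fi
    if [ "${attempt}" -ge "${max_attempts}" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
  done
}

# Example usage (the tag "myimage" is a placeholder):
#   retry_build 10 podman build --jobs 0 -t myimage .
# Or serialize the stages instead of retrying:
#   podman build --jobs 1 -t myimage .
```

The retry loop only masks the failure; --jobs 1 avoids it at the cost of building the stages one at a time.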

Steps to reproduce the issue

I haven't yet had the time to create a minimal reproducer, but I wanted to report the bug anyway in case something can be done about it before I get to that. My guess is that it has to do with having multiple RUN --mount commands running at the same time.
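
To make that guess concrete, a reproducer might be shaped something like the untested sketch below: two sibling stages that bind-mount from the same builder stage, built with --jobs 0. Everything here is an assumption for illustration (the helper name `write_repro_containerfile`, the stage names, the paths), and it has not been confirmed to trigger the error.

```shell
#!/usr/bin/env bash
# Write a hypothetical minimal Containerfile for testing the guess above.
# write_repro_containerfile, the stage names, and the paths are all
# illustrative assumptions; this has not been confirmed to reproduce the bug.
write_repro_containerfile() {
  local dir="$1"
  mkdir -p "${dir}"
  # The quoted heredoc delimiter keeps the backslashes literal.
  cat > "${dir}/Containerfile" <<'EOF'
FROM alpine AS builder
RUN mkdir -p /tmp/payload/install && echo hello > /tmp/payload/install/file

# Two sibling stages that bind-mount from the same builder stage, so that
# a parallel build is free to run both steps at the same time.
FROM alpine AS consumer-a
RUN --mount=type=bind,target=/tmp/payload,from=builder,source=/tmp/payload/install \
  mkdir -p /opt/a && cp -r /tmp/payload/. /opt/a/

FROM alpine AS consumer-b
RUN --mount=type=bind,target=/tmp/payload,from=builder,source=/tmp/payload/install \
  mkdir -p /opt/b && cp -r /tmp/payload/. /opt/b/

# The final stage depends on both consumers so neither is skipped.
FROM alpine
COPY --from=consumer-a /opt/a /opt/a
COPY --from=consumer-b /opt/b /opt/b
EOF
}
```

It could then be tried with something like `write_repro_containerfile ./repro && podman build --jobs 0 -f ./repro/Containerfile ./repro`, repeated several times since the failure is intermittent.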

Describe the results you received

The build fails intermittently with a permission error.

Describe the results you expected

Successful build, no errors.

podman info output

host:
  arch: arm64
  buildahVersion: 1.29.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-2.fc37.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 52.01
    systemPercent: 20.29
    userPercent: 27.7
  cpus: 1
  distribution:
    distribution: fedora
    variant: coreos
    version: "37"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 502
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 6.2.8-200.fc37.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 39862272
  memTotal: 2049007616
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.8.3-2.fc37.aarch64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.3
      commit: 59f2beb7efb0d35611d5818fd0311883676f6f7e
      rundir: /run/user/502/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/502/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-8.fc37.aarch64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 0h 23m 31.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 106769133568
  graphRootUsed: 42972188672
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1163
  runRoot: /run/user/502/containers
  transientStore: false
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 4.4.2
  Built: 1677669759
  BuiltTime: Wed Mar  1 05:22:39 2023
  GitCommit: ""
  GoVersion: go1.19.6
  Os: linux
  OsArch: linux/arm64
  Version: 4.4.2

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

Running on an M1 MacBook with macOS 13.2.1.

Additional information

The build fails the majority of the time but eventually succeeds when retried.

github-actions[bot] commented 1 year ago

A friendly reminder that this issue had no activity for 30 days.

vrothberg commented 1 year ago

Thanks for reaching out, @akanix42! And sorry for the silence. This issue must have slipped under the radar.

@flouthoc can you take a look?

flouthoc commented 1 year ago

Seems like a race, let me take a look. Thanks.

flouthoc commented 1 year ago

@akanix42 Does this happen in a single multi-stage build, or with multiple builds running in parallel?

akanix42 commented 1 year ago

@flouthoc It's a single multi-stage build; there are no other builds running concurrently.