moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit
https://github.com/moby/moby/issues/34227
Apache License 2.0

buildkit rootless 0.17.0 with fuse-overlayfs giving setxattr user.overlay.impure operation not permitted errors #5478

Open chadlwilson opened 3 weeks ago

chadlwilson commented 3 weeks ago

When building via buildx, our builds started giving errors like the one below after an implicit upgrade to the moby/buildkit:v0.17.0-rootless image (previously v0.16.0-rootless).

#9 ERROR: mount callback failed on /run/user/1000/containerd-mount2444245211: failed to setxattr "/run/user/1000/containerd-mount2444245211/etc" for key "user.overlay.impure": operation not permitted

Environment is

Host OS: Linux 5.10.219-208.866.amzn2.x86_64 amd64 (Amazon Linux 2)
Host Docker: 20.10.27 (yeah, I know it's EOL - long story)
DIND image OS: Centos Stream 9 (if it matters)
DIND image Docker: 27.3.1
DIND image Docker buildx plugin: 0.17.1

Fuller log

$ docker buildx version
github.com/docker/buildx v0.17.1 257815a
$ docker buildx create --use --name gocd-builder --driver-opt image=moby/buildkit:rootless
#1 [internal] booting buildkit
Initializing docker buildx builder [gocd-builder]...
#1 pulling image moby/buildkit:rootless
gocd-builder
#1 pulling image moby/buildkit:rootless 3.7s done
#1 creating container buildx_buildkit_gocd-builder0
Name:          gocd-builder
#1 creating container buildx_buildkit_gocd-builder0 0.5s done
Driver:        docker-container
Last Activity: 2024-11-01 02:57:53 +0000 UTC
#1 DONE 4.2s

Nodes:
Name:                  gocd-builder0
Endpoint:              unix:///var/run/docker.sock
Driver Options:        image="moby/buildkit:rootless"
Status:                running
BuildKit daemon flags: --allow-insecure-entitlement=network.host
BuildKit version:      v0.17.0
Platforms:             linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/amd64/v4, linux/386
Labels:
 org.mobyproject.buildkit.worker.executor:         oci
 org.mobyproject.buildkit.worker.hostname:         bff47ac416df
 org.mobyproject.buildkit.worker.network:          host
 org.mobyproject.buildkit.worker.oci.process-mode: sandbox
 org.mobyproject.buildkit.worker.selinux.enabled:  false
 org.mobyproject.buildkit.worker.snapshotter:      fuse-overlayfs
GC Policy rule#0:
 All:           false
 Filters:       type==source.local,type==exec.cachemount,type==source.git.checkout
 Keep Duration: 48h0m0s
GC Policy rule#1:
 All:           false
 Keep Duration: 1440h0m0s
 Keep Bytes:    4.657GiB
GC Policy rule#2:
 All:        false
 Keep Bytes: 4.657GiB
GC Policy rule#3:
 All:        true
 Keep Bytes: 4.657GiB

> Task :docker:gocd-server:wolfi-latest:docker

Building wolfi image for [x64, aarch64]. (Current build architecture is x64).

$ docker buildx build --pull --platform linux/amd64,linux/arm64 --output type=oci,dest=wolfi-latest.tar . --tag wolfi-latest

#0 building with "gocd-builder" instance using docker-container driver
#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 4.03kB done
#1 DONE 0.0s

#2 [linux/arm64 internal] load metadata for docker.io/curlimages/curl:latest
#2 ...

#3 [linux/amd64 internal] load metadata for docker.io/curlimages/curl:latest
#3 DONE 2.0s

#2 [linux/arm64 internal] load metadata for docker.io/curlimages/curl:latest
#2 DONE 2.0s

#4 [linux/arm64 internal] load metadata for cgr.dev/chainguard/wolfi-base:latest
#4 DONE 2.1s

#5 [linux/amd64 internal] load metadata for cgr.dev/chainguard/wolfi-base:latest
#5 ...

#6 [internal] load .dockerignore
#6 transferring context: 2B done
#6 DONE 0.0s

#5 [linux/amd64 internal] load metadata for cgr.dev/chainguard/wolfi-base:latest
#5 DONE 3.3s

#7 [linux/amd64 gocd-server-unzip 1/3] FROM docker.io/curlimages/curl:latest@sha256:d9b4541e214bcd85196d6e92e2753ac6d0ea699f0af5741f8c6cccbfcf00ef4b
#7 resolve docker.io/curlimages/curl:latest@sha256:d9b4541e214bcd85196d6e92e2753ac6d0ea699f0af5741f8c6cccbfcf00ef4b done
#7 sha256:4ca545ee6d5db5c1170386eeb39b2ffe3bd46e5d4a73a9acbebc805f19607eb3 42B / 42B 0.1s done
#7 sha256:b68d62cb323c5e6dbfa1dc8c990a0d1dba4690da661d8eae9af00943074770c0 5.79MB / 5.79MB 0.2s done
#7 sha256:43c4264eed91be63b206e17d93e75256a6097070ce643c5e8f0379998b44f170 0B / 3.62MB 0.2s
#7 sha256:43c4264eed91be63b206e17d93e75256a6097070ce643c5e8f0379998b44f170 3.62MB / 3.62MB 0.3s done
#7 extracting sha256:43c4264eed91be63b206e17d93e75256a6097070ce643c5e8f0379998b44f170 0.0s done
#7 DONE 0.4s

#8 [linux/arm64 gocd-server-unzip 1/3] FROM docker.io/curlimages/curl:latest@sha256:d9b4541e214bcd85196d6e92e2753ac6d0ea699f0af5741f8c6cccbfcf00ef4b
#8 resolve docker.io/curlimages/curl:latest@sha256:d9b4541e214bcd85196d6e92e2753ac6d0ea699f0af5741f8c6cccbfcf00ef4b done
#8 sha256:cf04c63912e16506c4413937c7f4579018e4bb25c272d989789cfba77b12f951 4.09MB / 4.09MB 0.2s done
#8 extracting sha256:cf04c63912e16506c4413937c7f4579018e4bb25c272d989789cfba77b12f951 0.1s done
#8 sha256:dfaa665a104a4eec724084693d3e01fde629574b283665180614c60be0365fd1 5.83MB / 5.83MB 0.2s done
#8 extracting sha256:dfaa665a104a4eec724084693d3e01fde629574b283665180614c60be0365fd1 done

#8 DONE 0.4s
> Task :docker:gocd-server:wolfi-latest:docker FAILED

#9 [linux/arm64 gocd-server-unzip 2/3] COPY go-server-24.4.0-19650.zip /tmp/go-server-24.4.0-19650.zip
#9 ERROR: mount callback failed on /run/user/1000/containerd-mount2444245211: failed to setxattr "/run/user/1000/containerd-mount2444245211/etc" for key "user.overlay.impure": operation not permitted

...

Dockerfile:24
--------------------
  22 |     ARG TARGETARCH
  23 |     ARG UID=1000
  24 | >>> COPY go-server-24.4.0-19650.zip /tmp/go-server-24.4.0-19650.zip
  25 |     RUN \
  26 |         unzip -q /tmp/go-server-24.4.0-19650.zip -d / && \
--------------------
ERROR: failed to solve: failed to compute cache key: mount callback failed on /run/user/1000/containerd-mount2444245211: failed to setxattr "/run/user/1000/containerd-mount2444245211/etc" for key "user.overlay.impure": operation not permitted

If this is related to the outdated host Docker version and is expected on this configuration, feel free to close/ignore and let me know.

Perhaps related to https://github.com/moby/moby/pull/47605 and/or https://github.com/moby/moby/issues/43626 and use of native overlay?

AkihiroSuda commented 2 weeks ago

Please post a (minimized) complete reproducer.

Is this issue specific to Dockerfiles that refer to curlimages/curl?

chadlwilson commented 2 weeks ago

I'll see what I can do to get a reproducer. Is there any reason to believe this would be a problem with 0.17.0 but not 0.16.0 and seemingly only on arm64 builds? (edit: also failing for amd64 actually)

Nothing else seemed to change and rolling back the buildkit image works fine. I couldn't see anything in the 0.17.0 changelog that might obviously imply a change here, other than perhaps the go lib updates to various things.

chadlwilson commented 2 weeks ago

There's a relatively minimal reproducer at https://github.com/chadlwilson/buildkit-rootless-issue incl GHA workflow that shows the problem with an Ubuntu 24.04 host.

Actions run that shows error: https://github.com/chadlwilson/buildkit-rootless-issue/actions/runs/11667854159/job/32486137082

Diff of docker info on docker:27-dind run vs gocddev/gocd-dev-build:centos-9-v3.19.5 (our tooling image)

$ diff official-docker.txt centos-docker.txt
1c1
< Client:
---
> Client: Docker Engine - Community
8,12c8
<     Path:     /usr/local/libexec/docker/cli-plugins/docker-buildx
<   compose: Docker Compose (Docker Inc.)
<     Version:  v2.30.1
<     Path:     /usr/local/libexec/docker/cli-plugins/docker-compose
<
---
>     Path:     /usr/libexec/docker/cli-plugins/docker-buildx
20,25c16
<  Storage Driver: overlay2
<   Backing Filesystem: extfs
<   Supports d_type: true
<   Using metacopy: false
<   Native Overlay Diff: false
<   userxattr: false
---
>  Storage Driver: vfs
45c36
<  Operating System: Alpine Linux v3.20
---
>  Operating System: CentOS Stream 9
50,51c41,42
<  Name: 77e260121212
<  ID: 65e06b61-35b9-489b-8042-db38e2593de5
---
>  Name: f807bea0881d
>  ID: 5d1f9b14-1f29-422e-a6e8-9701ca8b4bd6

Diff of buildkit config on docker:27-dind run vs gocddev/gocd-dev-build:centos-9-v3.19.5 (our tooling image)

3,4c3
< Last Activity: 2024-11-04 15:29:21 +0000 UTC
<
---
> Last Activity: 2024-11-04 15:43:50 +0000 UTC
15c14
<  org.mobyproject.buildkit.worker.hostname:         fdc4515ccde9
---
>  org.mobyproject.buildkit.worker.hostname:         c226a50182b5
19c18
<  org.mobyproject.buildkit.worker.snapshotter:      overlayfs
---
>  org.mobyproject.buildkit.worker.snapshotter:      fuse-overlayfs

Not quite sure what to conclude about these differences.

chadlwilson commented 1 week ago

v0.17.1 has the same problem (and perhaps surprisingly was marked as stable)

AkihiroSuda commented 1 week ago

Can you try if this works?

diff --git a/vendor/github.com/containerd/containerd/archive/tar.go b/vendor/github.com/containerd/containerd/archive/tar.go
index c61f89ec8..4c16ee810 100644
--- a/vendor/github.com/containerd/containerd/archive/tar.go
+++ b/vendor/github.com/containerd/containerd/archive/tar.go
@@ -408,6 +408,12 @@ func createTarFile(ctx context.Context, path, extractDir string, hdr *tar.Header
                        key = key[len(paxSchilyXattr):]
                        if err := setxattr(path, key, value); err != nil {
                                if errors.Is(err, syscall.EPERM) && strings.HasPrefix(key, userXattrPrefix) {
+                                       if key == "user.overlay.impure" {
+                                               // Only occurs with images built with Red Hat's buildah?
+                                               // https://github.com/moby/buildkit/issues/5478
+                                               log.G(ctx).WithError(err).Debugf("ignored xattr %s in archive", key)
+                                               continue
+                                       }
                                        // In the user.* namespace, only regular files and directories can have extended attributes.
                                        // See https://man7.org/linux/man-pages/man7/xattr.7.html for details.
                                        if fi, err := os.Lstat(path); err == nil && (!fi.Mode().IsRegular() && !fi.Mode().IsDir()) {

chadlwilson commented 1 week ago

I don't have the ability to build from source right now, never done that before. Not sure how easy it is.

FWIW, problem was introduced between rc1 and rc2 so somewhere in https://github.com/moby/buildkit/compare/v0.17.0-rc1...v0.17.0-rc2

AkihiroSuda commented 1 week ago

I don't have the ability to build from source right now, never done that before.

go build ./cmd/buildkitd

AkihiroSuda commented 1 week ago

somewhere in https://github.com/moby/buildkit/compare/v0.17.0-rc1...v0.17.0-rc2

Could you try git bisect?

chadlwilson commented 1 week ago

somewhere in v0.17.0-rc1...v0.17.0-rc2

Could you try git bisect?

Sure, if I have a build env set up. :) I'd also need to create a container image to give to buildx, and do so consistently with how the images here are built.

I don't have a local env to replicate this right now, so have to iterate on cloud infra which is slow.

chadlwilson commented 1 week ago

somewhere in v0.17.0-rc1...v0.17.0-rc2

Could you try git bisect?

Tried bisecting, and can't find a good commit. Rootless images rebuilt now from old source also fail, presumably because the image builds aren't reproducible. This is a clue :-)

$ docker scout compare moby/buildkit:v0.17.0-rc1-rootless --to moby/buildkit:v0.17.0-rc2-rootless --ignore-unchanged
    i New version 1.15.0 available (installed version is 1.14.0) at https://github.com/docker/scout-cli
          ! 'docker scout compare' is experimental and its behaviour might change in the future
    ✓ SBOM of image already cached, 443 packages indexed
    ✓ SBOM of image already cached, 442 packages indexed

  ## Overview

                      │                       Analyzed Image                       │                      Comparison Image
  ────────────────────┼────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────
    Target            │  moby/buildkit:v0.17.0-rc1-rootless                        │  moby/buildkit:v0.17.0-rc2-rootless
      digest          │  21066164dc5c                                              │  2ab2f8bf0811
      tag             │  v0.17.0-rc1-rootless                                      │  v0.17.0-rc2-rootless
      platform        │ linux/arm64                                                │ linux/arm64
      provenance      │ https://github.com/moby/buildkit.git#refs/tags/v0.17.0-rc1 │ https://github.com/moby/buildkit.git#refs/tags/v0.17.0-rc2
                      │  62bda5c1caae9935a2051e96443d554f7ab7ef2d                  │  d09c1e2960a87448e2b8b7e2e9e7509671225cee
      vulnerabilities │    0C     4H     6M     1L                                 │    0C     4H     4M     1L
                      │                  +2                                        │
      size            │ 96 MB (-376 kB)                                            │ 97 MB
      packages        │ 443 (+1)                                                   │ 442
                      │                                                            │
    Base image        │  alpine:3.20                                               │  alpine:3.20
      tags            │ also known as                                              │ also known as
                      │   • 3                                                      │   • 3
                      │   • 3.20.3                                                 │   • 3.20.3
                      │   • latest                                                 │   • latest
      vulnerabilities │    0C     0H     1M     0L                                 │    0C     0H     1M     0L

  ## Environment Variables

      BUILDKIT_HOST=unix:///run/user/1000/buildkit/buildkitd.sock
      HOME=/home/user
      PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
      TMPDIR=/home/user/.local/tmp
      USER=user
      XDG_RUNTIME_DIR=/run/user/1000

  ## Packages and Vulnerabilities

    +    1 packages added
    ⎌   20 packages changed (↑ 0 upgraded, ↓ 20 downgraded)
       171 packages unchanged

    + 2 vulnerabilities added

     Package                                                     Type    Version                            Compared Version

  ↓  fuse-overlayfs                                              apk     1.13-r0                            1.14-r0
  ↓  github.com/azuread/microsoft-authentication-library-for-go  golang  0.6.0                              1.2.2
  ↓  github.com/containerd/continuity                            golang  0.4.3                              0.4.4
  ↓  github.com/cpuguy83/go-md2man/v2                            golang  2.0.4                              2.0.5
  ↓  github.com/golang-jwt/jwt/v4                                golang  4.5.0                              5.2.1
  ↓  github.com/insomniacslk/dhcp                                golang  0.0.0-20230516061539-49801966e6cb  0.0.0-20240812123929-b105c29bd1b5
  +  github.com/josharian/native                                 golang  1.1.0
  ↓  github.com/klauspost/compress                               golang  1.17.9                             1.17.11
  ↓  github.com/moby/buildkit                                    golang  0.17.0-rc1                         0.17.0-rc2
  ↓  github.com/opencontainers/runc                              golang  0.0.0-20240903011541-2c9f5602f0ba  0.0.0-20241004193727-bc20cb4497af
  ↓  github.com/pierrec/lz4/v4                                   golang  4.1.17                             4.1.21
  ↓  github.com/rootless-containers/rootlesskit/v2               golang  0.0.0-20240305214756-9e7dd3380db2  0.0.0-20240817192505-fcc67feacd7d
  ↓  github.com/tonistiigi/fsutil                                golang  0.0.0-20241003195857-3f140a1299b0  0.0.0-20241028165955-397af5306b5c
  ↓  github.com/u-root/uio                                       golang  0.0.0-20230305220412-3e8cd9d6bf63  0.0.0-20240224005618-d2acac8f3701
  ↓  github.com/urfave/cli                                       golang  1.22.15                            1.22.16
  ↓  github.com/xrash/smetrics                                   golang  0.0.0-20201216005158-039620a65673  0.0.0-20240521201337-686a1a2994c1
  ↓  golang.org/x/crypto                                         golang  0.26.0                             0.27.0
  ↓  golang.org/x/net                                            golang  0.28.0                             0.29.0
  ↓  golang.org/x/sys                                            golang  0.24.0                             0.26.0
  ↓  golang.org/x/text                                           golang  0.17.0                             0.18.0
  ↓  google.golang.org/protobuf                                  golang  1.34.1                             1.35.1

The change here that appears to have caused the problem is fuse-overlayfs being upgraded to 1.14 on Alpine 3.20 at some point (9th October possibly?).

It's a bit messy since apks on Alpine can't be downgraded, but running the latest buildkit code on Alpine 3.19 brings in the older fuse-overlayfs 1.13, which seems to work OK. That implies the change in behaviour comes from something within https://github.com/containers/fuse-overlayfs/compare/v1.13...v1.14 (though that might not be the root cause of the problem).

Can you try if this works?

This slight adjustment seemed to work with the newer fuse-overlayfs 1.14... (adding user.overlay.origin) and using make images to run with moby/buildkit:local-rootless (BuildKit version: v0.17.0-30-gc9a17ff81.m)

diff --git a/vendor/github.com/containerd/containerd/archive/tar.go b/vendor/github.com/containerd/containerd/archive/tar.go
index c61f89ec8..2dc53fb1d 100644
--- a/vendor/github.com/containerd/containerd/archive/tar.go
+++ b/vendor/github.com/containerd/containerd/archive/tar.go
@@ -408,6 +408,12 @@ func createTarFile(ctx context.Context, path, extractDir string, hdr *tar.Header
            key = key[len(paxSchilyXattr):]
            if err := setxattr(path, key, value); err != nil {
                if errors.Is(err, syscall.EPERM) && strings.HasPrefix(key, userXattrPrefix) {
+                                       if key == "user.overlay.impure" || key == "user.overlay.origin" {
+                                               // Only occurs with images built with Red Hat's buildah?
+                                               // https://github.com/moby/buildkit/issues/5478
+                                               log.G(ctx).WithError(err).Debugf("ignored xattr %s in archive", key)
+                                               continue
+                                       }
                    // In the user.* namespace, only regular files and directories can have extended attributes.
                    // See https://man7.org/linux/man-pages/man7/xattr.7.html for details.
                    if fi, err := os.Lstat(path); err == nil && (!fi.Mode().IsRegular() && !fi.Mode().IsDir()) {

chadlwilson commented 1 week ago

So anyway, the curl image (in this case) does seem to be the problem, as it's built with buildah 1.23.1 on the default GitHub Ubuntu 22.04 runners. Even Ubuntu 24.04 doesn't have a buildah with the required fix (it only has 1.33.7). The buildah releases would need https://github.com/containers/storage/commit/eadc620 via something like https://github.com/containers/storage/pull/1847#issuecomment-1973845554

So I don't know. Perhaps this is a "won't fix" from the perspective of Docker, moby, fuse-overlayfs etc. But what a PITA. :-)