Open chadlwilson opened 3 weeks ago
Please post a (minimized) complete reproducer.
Is this issue specific to Dockerfiles that refer to curlimages/curl?
I'll see what I can do to get a reproducer. Is there any reason to believe this would be a problem with 0.17.0 but not 0.16.0 and seemingly only on arm64 builds? (edit: also failing for amd64 actually)
Nothing else seemed to change and rolling back the buildkit image works fine. I couldn't see anything in the 0.17.0 changelog that might obviously imply a change here, other than perhaps the go lib updates to various things.
There's a relatively minimal reproducer at https://github.com/chadlwilson/buildkit-rootless-issue incl GHA workflow that shows the problem with an Ubuntu 24.04 host.
0.17.0 (OK on 0.16.0)
docker:27-dind image but NOT our CentOS Stream based one with a basic RPM-installed Docker.
linux/arm64 on an amd64/x64 host
Actions run that shows error: https://github.com/chadlwilson/buildkit-rootless-issue/actions/runs/11667854159/job/32486137082
Diff of docker info for docker:27-dind run vs our gocddev/gocd-dev-build:centos-9-v3.19.5 (our tooling image)
$ diff official-docker.txt centos-docker.txt
1c1
< Client:
---
> Client: Docker Engine - Community
8,12c8
< Path: /usr/local/libexec/docker/cli-plugins/docker-buildx
< compose: Docker Compose (Docker Inc.)
< Version: v2.30.1
< Path: /usr/local/libexec/docker/cli-plugins/docker-compose
<
---
> Path: /usr/libexec/docker/cli-plugins/docker-buildx
20,25c16
< Storage Driver: overlay2
< Backing Filesystem: extfs
< Supports d_type: true
< Using metacopy: false
< Native Overlay Diff: false
< userxattr: false
---
> Storage Driver: vfs
45c36
< Operating System: Alpine Linux v3.20
---
> Operating System: CentOS Stream 9
50,51c41,42
< Name: 77e260121212
< ID: 65e06b61-35b9-489b-8042-db38e2593de5
---
> Name: f807bea0881d
> ID: 5d1f9b14-1f29-422e-a6e8-9701ca8b4bd6
Diff of buildkit config on docker:27-dind run vs gocddev/gocd-dev-build:centos-9-v3.19.5 (our tooling image)
3,4c3
< Last Activity: 2024-11-04 15:29:21 +0000 UTC
<
---
> Last Activity: 2024-11-04 15:43:50 +0000 UTC
15c14
< org.mobyproject.buildkit.worker.hostname: fdc4515ccde9
---
> org.mobyproject.buildkit.worker.hostname: c226a50182b5
19c18
< org.mobyproject.buildkit.worker.snapshotter: overlayfs
---
> org.mobyproject.buildkit.worker.snapshotter: fuse-overlayfs
Not quite sure what to conclude about these differences:
overlay2 vs vfs for the docker DIND image?
overlayfs vs fuse-overlayfs for buildkit snapshotter?
v0.17.1 has the same problem (and perhaps surprisingly was marked as stable)
Can you try if this works?
diff --git a/vendor/github.com/containerd/containerd/archive/tar.go b/vendor/github.com/containerd/containerd/archive/tar.go
index c61f89ec8..4c16ee810 100644
--- a/vendor/github.com/containerd/containerd/archive/tar.go
+++ b/vendor/github.com/containerd/containerd/archive/tar.go
@@ -408,6 +408,12 @@ func createTarFile(ctx context.Context, path, extractDir string, hdr *tar.Header
 			key = key[len(paxSchilyXattr):]
 			if err := setxattr(path, key, value); err != nil {
 				if errors.Is(err, syscall.EPERM) && strings.HasPrefix(key, userXattrPrefix) {
+					if key == "user.overlay.impure" {
+						// Only occurs with images built with Red Hat's buildah?
+						// https://github.com/moby/buildkit/issues/5478
+						log.G(ctx).WithError(err).Debugf("ignored xattr %s in archive", key)
+						continue
+					}
 					// In the user.* namespace, only regular files and directories can have extended attributes.
 					// See https://man7.org/linux/man-pages/man7/xattr.7.html for details.
 					if fi, err := os.Lstat(path); err == nil && (!fi.Mode().IsRegular() && !fi.Mode().IsDir()) {
I don't have the ability to build from source right now, never done that before. Not sure how easy it is.
FWIW, problem was introduced between rc1 and rc2 so somewhere in https://github.com/moby/buildkit/compare/v0.17.0-rc1...v0.17.0-rc2
> I don't have the ability to build from source right now, never done that before.
go build ./cmd/buildkitd
> somewhere in https://github.com/moby/buildkit/compare/v0.17.0-rc1...v0.17.0-rc2
Could you try git bisect?
> > somewhere in v0.17.0-rc1...v0.17.0-rc2
> Could you try git bisect?
Sure, if I have a build env set up. :) I'd also need to create a container image to give to buildx, and do so consistently with how the images here are built.
I don't have a local env to replicate this right now, so have to iterate on cloud infra which is slow.
> > somewhere in v0.17.0-rc1...v0.17.0-rc2
> Could you try git bisect?
Tried bisecting, and can't replicate the issue. rootless images rebuilt now with old source also fail, presumably due to not being reproducible. This is a clue :-)
$ docker scout compare moby/buildkit:v0.17.0-rc1-rootless --to moby/buildkit:v0.17.0-rc2-rootless --ignore-unchanged
i New version 1.15.0 available (installed version is 1.14.0) at https://github.com/docker/scout-cli
! 'docker scout compare' is experimental and its behaviour might change in the future
✓ SBOM of image already cached, 443 packages indexed
✓ SBOM of image already cached, 442 packages indexed
## Overview
│ Analyzed Image │ Comparison Image
────────────────────┼────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────
Target │ moby/buildkit:v0.17.0-rc1-rootless │ moby/buildkit:v0.17.0-rc2-rootless
digest │ 21066164dc5c │ 2ab2f8bf0811
tag │ v0.17.0-rc1-rootless │ v0.17.0-rc2-rootless
platform │ linux/arm64 │ linux/arm64
provenance │ https://github.com/moby/buildkit.git#refs/tags/v0.17.0-rc1 │ https://github.com/moby/buildkit.git#refs/tags/v0.17.0-rc2
│ 62bda5c1caae9935a2051e96443d554f7ab7ef2d │ d09c1e2960a87448e2b8b7e2e9e7509671225cee
vulnerabilities │ 0C 4H 6M 1L │ 0C 4H 4M 1L
│ +2 │
size │ 96 MB (-376 kB) │ 97 MB
packages │ 443 (+1) │ 442
│ │
Base image │ alpine:3.20 │ alpine:3.20
tags │ also known as │ also known as
│ • 3 │ • 3
│ • 3.20.3 │ • 3.20.3
│ • latest │ • latest
vulnerabilities │ 0C 0H 1M 0L │ 0C 0H 1M 0L
## Environment Variables
BUILDKIT_HOST=unix:///run/user/1000/buildkit/buildkitd.sock
HOME=/home/user
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
TMPDIR=/home/user/.local/tmp
USER=user
XDG_RUNTIME_DIR=/run/user/1000
## Packages and Vulnerabilities
+ 1 packages added
⎌ 20 packages changed (↑ 0 upgraded, ↓ 20 downgraded)
171 packages unchanged
+ 2 vulnerabilities added
Package Type Version Compared Version
↓ fuse-overlayfs apk 1.13-r0 1.14-r0
↓ github.com/azuread/microsoft-authentication-library-for-go golang 0.6.0 1.2.2
↓ github.com/containerd/continuity golang 0.4.3 0.4.4
↓ github.com/cpuguy83/go-md2man/v2 golang 2.0.4 2.0.5
↓ github.com/golang-jwt/jwt/v4 golang 4.5.0 5.2.1
↓ github.com/insomniacslk/dhcp golang 0.0.0-20230516061539-49801966e6cb 0.0.0-20240812123929-b105c29bd1b5
+ github.com/josharian/native golang 1.1.0
↓ github.com/klauspost/compress golang 1.17.9 1.17.11
↓ github.com/moby/buildkit golang 0.17.0-rc1 0.17.0-rc2
↓ github.com/opencontainers/runc golang 0.0.0-20240903011541-2c9f5602f0ba 0.0.0-20241004193727-bc20cb4497af
↓ github.com/pierrec/lz4/v4 golang 4.1.17 4.1.21
↓ github.com/rootless-containers/rootlesskit/v2 golang 0.0.0-20240305214756-9e7dd3380db2 0.0.0-20240817192505-fcc67feacd7d
↓ github.com/tonistiigi/fsutil golang 0.0.0-20241003195857-3f140a1299b0 0.0.0-20241028165955-397af5306b5c
↓ github.com/u-root/uio golang 0.0.0-20230305220412-3e8cd9d6bf63 0.0.0-20240224005618-d2acac8f3701
↓ github.com/urfave/cli golang 1.22.15 1.22.16
↓ github.com/xrash/smetrics golang 0.0.0-20201216005158-039620a65673 0.0.0-20240521201337-686a1a2994c1
↓ golang.org/x/crypto golang 0.26.0 0.27.0
↓ golang.org/x/net golang 0.28.0 0.29.0
↓ golang.org/x/sys golang 0.24.0 0.26.0
↓ golang.org/x/text golang 0.17.0 0.18.0
↓ google.golang.org/protobuf golang 1.34.1 1.35.1
The change here that appears to have caused the problem seems to be fuse-overlayfs being upgraded to 1.14 on Alpine 3.20 at some point (9th October possibly?).
While it's a bit messy since apks on Alpine can't be downgraded, running on latest buildkit code and downgrading to Alpine 3.19 brings in the older fuse-overlayfs version 1.13 which seems to work OK... implying the real root cause of the change in behaviour is something within https://github.com/containers/fuse-overlayfs/compare/v1.13...v1.14 (but might not be the root cause of the problem)
> Can you try if this works?
This slight adjustment seemed to work with the newer fuse-overlayfs 1.14... (adding user.overlay.origin) and using make images to run with moby/buildkit:local-rootless (BuildKit version: v0.17.0-30-gc9a17ff81.m)
diff --git a/vendor/github.com/containerd/containerd/archive/tar.go b/vendor/github.com/containerd/containerd/archive/tar.go
index c61f89ec8..2dc53fb1d 100644
--- a/vendor/github.com/containerd/containerd/archive/tar.go
+++ b/vendor/github.com/containerd/containerd/archive/tar.go
@@ -408,6 +408,12 @@ func createTarFile(ctx context.Context, path, extractDir string, hdr *tar.Header
 			key = key[len(paxSchilyXattr):]
 			if err := setxattr(path, key, value); err != nil {
 				if errors.Is(err, syscall.EPERM) && strings.HasPrefix(key, userXattrPrefix) {
+					if key == "user.overlay.impure" || key == "user.overlay.origin" {
+						// Only occurs with images built with Red Hat's buildah?
+						// https://github.com/moby/buildkit/issues/5478
+						log.G(ctx).WithError(err).Debugf("ignored xattr %s in archive", key)
+						continue
+					}
 					// In the user.* namespace, only regular files and directories can have extended attributes.
 					// See https://man7.org/linux/man-pages/man7/xattr.7.html for details.
 					if fi, err := os.Lstat(path); err == nil && (!fi.Mode().IsRegular() && !fi.Mode().IsDir()) {
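For anyone who wants to reason about the patch above outside the vendored tree, here is a minimal standalone sketch of the same decision. It is not the containerd code: the helper names isIgnoredOverlayXattr and shouldSkipXattr are hypothetical, and the key list only contains the two xattrs mentioned in this thread.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// ignoredOverlayXattrs lists the two keys discussed in this thread.
// Assumption: no other user.overlay.* keys need the same treatment.
var ignoredOverlayXattrs = map[string]bool{
	"user.overlay.impure": true,
	"user.overlay.origin": true,
}

// isIgnoredOverlayXattr reports whether key is one of the overlay
// bookkeeping xattrs that the patch skips. (Hypothetical helper.)
func isIgnoredOverlayXattr(key string) bool {
	return ignoredOverlayXattrs[key]
}

// shouldSkipXattr mirrors the condition in the patch: skip only when
// setxattr failed with EPERM and the key is on the ignore list.
func shouldSkipXattr(key string, err error) bool {
	return errors.Is(err, syscall.EPERM) && isIgnoredOverlayXattr(key)
}

func main() {
	keys := []string{"user.overlay.impure", "user.overlay.origin", "user.comment"}
	for _, key := range keys {
		// Simulate the EPERM that setxattr returned in the failing builds.
		fmt.Printf("%s: skip on EPERM = %v\n", key, shouldSkipXattr(key, syscall.EPERM))
	}
}
```

Matching exact keys, as the patch does, keeps the change narrow: any other user.* xattr that fails with EPERM still goes through the existing handling.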
So anyway, the curl image (in this case) does seem to be the problem, as it's built with buildah 1.23.1 on default GitHub Ubuntu 22.04 runners. Even Ubuntu 24.04 doesn't have a buildah with the required fix (only has 1.33.7) - the buildah releases would need https://github.com/containers/storage/commit/eadc620 via something like https://github.com/containers/storage/pull/1847#issuecomment-1973845554
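To check whether some other base image carries the same leftover xattrs, something like the following sketch could scan a layer tarball for them. Assumptions: the function names are hypothetical, the sample layer is synthetic and built in memory purely for demonstration, and the xattrs are stored as PAX records under the SCHILY.xattr. prefix (the same prefix the patch strips via paxSchilyXattr).

```go
package main

import (
	"archive/tar"
	"bytes"
	"errors"
	"fmt"
	"io"
	"sort"
	"strings"
)

const paxXattrPrefix = "SCHILY.xattr."

// findOverlayXattrs returns one "path: xattr" string for every tar entry
// that carries a user.overlay.* xattr in its PAX records.
func findOverlayXattrs(r io.Reader) ([]string, error) {
	var found []string
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			sort.Strings(found) // map iteration order is random, so sort
			return found, nil
		}
		if err != nil {
			return nil, err
		}
		for rec := range hdr.PAXRecords {
			if strings.HasPrefix(rec, paxXattrPrefix+"user.overlay.") {
				found = append(found, hdr.Name+": "+strings.TrimPrefix(rec, paxXattrPrefix))
			}
		}
	}
}

// sampleLayer builds a tiny synthetic layer: one directory with the
// xattr from the error message in this issue, and one without.
func sampleLayer() (*bytes.Buffer, error) {
	buf := new(bytes.Buffer)
	tw := tar.NewWriter(buf)
	headers := []*tar.Header{
		{Name: "etc/", Typeflag: tar.TypeDir, Mode: 0o755,
			PAXRecords: map[string]string{paxXattrPrefix + "user.overlay.impure": "y"}},
		{Name: "usr/", Typeflag: tar.TypeDir, Mode: 0o755},
	}
	for _, h := range headers {
		if err := tw.WriteHeader(h); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf, nil
}

func main() {
	layer, err := sampleLayer()
	if err != nil {
		fmt.Println("building sample layer failed:", err)
		return
	}
	found, err := findOverlayXattrs(layer)
	if err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	for _, f := range found {
		fmt.Println(f)
	}
}
```

Against a real image, the reader passed to findOverlayXattrs would be an extracted layer blob rather than the synthetic buffer; compressed layers would need to be decompressed first.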
So I don't know. Perhaps this is a "won't fix" from the perspective of Docker, moby, fuse-overlayfs etc. But what a PITA. :-)
When building via buildx our builds started giving errors like the below after an implicit upgrade to use the moby/buildkit:v0.17.0-rootless image (previously v0.16.0-rootless).
#9 ERROR: mount callback failed on /run/user/1000/containerd-mount2444245211: failed to setxattr "/run/user/1000/containerd-mount2444245211/etc" for key "user.overlay.impure": operation not permitted
Environment is
Host OS: Linux 5.10.219-208.866.amzn2.x86_64 amd64 (Amazon Linux 2)
Host Docker: 20.10.27 (yeah, I know it's EOL - long story)
DIND image OS: CentOS Stream 9 (if it matters)
DIND image Docker: 27.3.1
DIND image Docker buildx plugin: 0.17.1
If this is related to the outdated host Docker version and this is expected on this configuration, feel free to close/ignore and let me know.
Perhaps related to https://github.com/moby/moby/pull/47605 and/or https://github.com/moby/moby/issues/43626 and use of native overlay?