moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit
https://github.com/moby/moby/issues/34227
Apache License 2.0

docker build fails with buildx: buildkit: kubernetes driver; only with ARM #1929

Open matthewhembree opened 3 years ago

matthewhembree commented 3 years ago

I am having an issue installing Debian packages during an image build.

This happens when targeting a linux/arm64 or linux/arm/v7 platform. I do not have the same issue when targeting linux/amd64.

Some more background:

This only happens in a Docker buildx kubernetes driver environment.

When in a Docker buildx docker driver environment, builds work fine.

With plain docker run, everything works fine on both linux/arm64 and linux/amd64.

The build fails in a similar way on both Debian Buster and Stretch in my tests.

Scenarios:

docker:
     linux/amd64:
         build: ok
         run: ok
     linux/arm64:
         build: fail
         run: ok
     linux/arm/v7 (armhf):
         build: fail
         run: ok
     other platforms: unknown

Dockerfile:

FROM debian:buster-slim
#FROM debian:stretch-slim

#ARG TARGETARCH
#ARG DEBIAN_FRONTEND=noninteractive

#SHELL ["/bin/bash", "-c"]

RUN env
RUN set
RUN uname -a
RUN apt-get update \
    && apt-get install --no-install-recommends -y \
        netbase

Reproduction:

Build commands:

Kubernetes driver:
    Setup (Image tags tried: nightly, master, buildx-stable-1(latest)):
        ```
        docker buildx create --use --name ${USER}-builder --node ${USER}-builder-0 --driver kubernetes --driver-opt namespace=default,replicas=2,image=moby/buildkit:nightly --platform linux/amd64,linux/arm64
        ```
    Fail:
        ```
        docker buildx build -f Dockerfile -t debian:dpkg-testing --platform linux/arm64 --progress plain --load --no-cache .
        ```
    Succeed:
        ```
        docker buildx build -f Dockerfile -t debian:dpkg-testing --platform linux/amd64 --progress plain --load --no-cache .
        ```

Docker driver (Docker Desktop for Mac):
    Succeed:
        ```
        # linux/arm64:
        docker buildx build -f Dockerfile -t debian:dpkg-testing --platform linux/arm64 --progress plain --load --no-cache .
        # linux/amd64:
        docker buildx build -f Dockerfile -t debian:dpkg-testing --platform linux/amd64 --progress plain --load --no-cache .
        ```

Run commands:

Succeed:
    ```
    # linux/arm64:
    docker run --platform linux/arm64 -it debian:buster-slim bash # running commands interactively
    docker run --platform linux/arm64 -it debian:buster-slim sh # running commands interactively
    docker run --platform linux/arm64 debian:buster-slim sh -c 'export DEBIAN_FRONTEND=noninteractive && apt-get update && apt-get install --no-install-recommends -y netbase'
    docker run --platform linux/arm64 debian:buster-slim sh -c 'apt-get update && apt-get install --no-install-recommends -y netbase'
    # linux/amd64:
    docker run --platform linux/amd64 -it debian:buster-slim bash # running commands interactively
    docker run --platform linux/amd64 -it debian:buster-slim sh # running commands interactively
    ```

Logs:

kubernetes driver:

docker buildx build -f Dockerfile -t debian:dpkg-testing --platform linux/arm64 --progress plain --no-cache .
WARN[0000] No output specified for kubernetes driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
#1 [internal] load build definition from Dockerfile.dpkg-testing
#1 sha256:dbccc1991ff1047298dcdf39f73359149a65e3d3e25b23cc94ebdbd41b37a4b5
#1 transferring dockerfile: 43B 0.1s
#1 ...

#2 [internal] load .dockerignore
#2 sha256:30ea6c56106ebe169ad19c26478197acc7b7cc239024fff7539365dfbf6b7771
#2 transferring context: 34B 0.3s done
#2 DONE 0.3s

#1 [internal] load build definition from Dockerfile.dpkg-testing
#1 sha256:dbccc1991ff1047298dcdf39f73359149a65e3d3e25b23cc94ebdbd41b37a4b5
#1 transferring dockerfile: 346B 0.4s done
#1 DONE 0.5s

#3 [internal] load metadata for docker.io/library/debian:buster-slim
#3 sha256:33678b575aeed32981d29cc62f8ac2b3a01ee77f5da143c4ad022934891ce693
#3 ...

#4 [auth] library/debian:pull token for registry-1.docker.io
#4 sha256:cfbe86b7c0f89d7ebf1daaaa28c9a997aa586293283d1d635b82e6f4bdbcd06d
#4 DONE 0.0s

#3 [internal] load metadata for docker.io/library/debian:buster-slim
#3 sha256:33678b575aeed32981d29cc62f8ac2b3a01ee77f5da143c4ad022934891ce693
#3 DONE 1.4s

#5 [1/5] FROM docker.io/library/debian:buster-slim@sha256:240f770008bdc538fecc8d3fa7a32a533eac55c14cbc56a9a8a6f7d741b47e33
#5 sha256:ab731873b2ba03c214280bacdb26440395dbd6dbbc4d5f4e25716a98c299034d
#5 resolve docker.io/library/debian:buster-slim@sha256:240f770008bdc538fecc8d3fa7a32a533eac55c14cbc56a9a8a6f7d741b47e33 0.0s done
#5 CACHED

#6 [2/5] RUN env
#6 sha256:83ff2c71bbb5eb936b568de4d65e83b81ad69fc4aadc074bd17a26b1f28564b6
#6 0.174 PWD=/
#6 0.174 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
#6 0.174 HOME=/root
#6 DONE 0.3s

#7 [3/5] RUN set
#7 sha256:9670a0eb3cb0e64b616bc7b3cc89a00c0e04e0fc56c7227a986e071f28701798
#7 0.195 HOME='/root'
#7 0.195 IFS='
#7 0.195 '
#7 0.195 OPTIND='1'
#7 0.195 PATH='/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'
#7 0.195 PPID='0'
#7 0.195 PS1='# '
#7 0.195 PS2='> '
#7 0.195 PS4='+ '
#7 0.195 PWD='/'
#7 DONE 0.3s

#8 [4/5] RUN uname -a
#8 sha256:cb1fcd8cb5bbe1e67cfacf54c01895430f63f2943f80130426fea84b58952df9
#8 0.224 Linux buildkitsandbox 4.14.198-152.320.amzn2.x86_64 #1 SMP Wed Sep 23 23:57:28 UTC 2020 aarch64 GNU/Linux
#8 DONE 0.3s

#9 [5/5] RUN apt-get update     && apt-get install --no-install-recommends -y         netbase
#9 sha256:c73eab51d0b7d56e9559e55c1ea6927b85a3808c175eec84317e3f4e69aef8e9
#9 0.553 Get:1 http://security.debian.org/debian-security buster/updates InRelease [65.4 kB]
#9 0.646 Get:2 http://deb.debian.org/debian buster InRelease [121 kB]
#9 0.647 Get:3 http://deb.debian.org/debian buster-updates InRelease [51.9 kB]
#9 2.436 Get:4 http://security.debian.org/debian-security buster/updates/main arm64 Packages [252 kB]
#9 4.182 Get:5 http://deb.debian.org/debian buster/main arm64 Packages [7737 kB]
#9 6.027 Get:6 http://deb.debian.org/debian buster-updates/main arm64 Packages [7848 B]
#9 7.384 Fetched 8236 kB in 7s (1202 kB/s)
#9 7.384 Reading package lists...
#9 10.54 Reading package lists...
#9 13.55 Building dependency tree...
#9 14.09 Reading state information...
#9 14.63 The following NEW packages will be installed:
#9 14.63   netbase
#9 14.79 0 upgraded, 1 newly installed, 0 to remove and 1 not upgraded.
#9 14.79 Need to get 19.4 kB of archives.
#9 14.79 After this operation, 45.1 kB of additional disk space will be used.
#9 14.79 Get:1 http://deb.debian.org/debian buster/main arm64 netbase all 5.6 [19.4 kB]
#9 15.40 debconf: delaying package configuration, since apt-utils is not installed
#9 15.50 Fetched 19.4 kB in 0s (222 kB/s)
#9 15.56 Error while loading /usr/sbin/dpkg-split: No such file or directory
#9 15.56 Error while loading /usr/sbin/dpkg-deb: No such file or directory
#9 15.56 dpkg: error processing archive /var/cache/apt/archives/netbase_5.6_all.deb (--unpack):
#9 15.56  dpkg-deb --control subprocess returned error exit status 1
#9 15.57 Errors were encountered while processing:
#9 15.57  /var/cache/apt/archives/netbase_5.6_all.deb
#9 15.64 E: Sub-process /usr/bin/dpkg returned an error code (1)
#9 ERROR: executor failed running [/dev/.buildkit_qemu_emulator /bin/sh -c apt-get update     && apt-get install --no-install-recommends -y         netbase]: exit code: 100
------
 > [5/5] RUN apt-get update     && apt-get install --no-install-recommends -y         netbase:
------
Dockerfile.dpkg-testing:12
--------------------
  11 |     RUN uname -a
  12 | >>> RUN apt-get update \
  13 | >>>     && apt-get install --no-install-recommends -y \
  14 | >>>         netbase
  15 |
--------------------
error: failed to solve: rpc error: code = Unknown desc = executor failed running [/dev/.buildkit_qemu_emulator /bin/sh -c apt-get update     && apt-get install --no-install-recommends -y         netbase]: exit code: 100

The following lines are interesting:

#9 15.56 Error while loading /usr/sbin/dpkg-split: No such file or directory
#9 15.56 Error while loading /usr/sbin/dpkg-deb: No such file or directory

They vary when running FROM debian:stretch-slim:

#9 16.00 Error while loading /usr/local/sbin/dpkg-split: No such file or directory
#9 16.00 Error while loading /usr/local/sbin/dpkg-deb: No such file or directory

I don't understand why this fails with the kubernetes driver (on linux/arm*), or why the paths in the error differ between buster and stretch. 🤷

Environment:

Local environment:

I am using the latest buildx plugin.

docker system info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.1)
  scan: Docker Scan (Docker Inc., v0.5.0)

Server:
 Containers: 36
  Running: 0
  Paused: 0
  Stopped: 36
 Images: 100
 Server Version: 20.10.0
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc version: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.19.121-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 3.847GiB
 Name: docker-desktop
 ID: 4NYL:QO3L:7DYW:H5MG:PIJW:7VBR:ECZC:DATV:IL3L:ILA6:ZPCZ:QGA5
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: gateway.docker.internal:3128
 HTTPS Proxy: gateway.docker.internal:3129
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  https://mirror.gcr.io/
 Live Restore Enabled: false
 Product License: Community Engine

Buildx environment:

docker buildx ls
NAME/NODE                                   DRIVER/ENDPOINT STATUS  PLATFORMS
username-builder *                   kubernetes
  username-builder-0-c8d85646d-pgzxh                 running linux/amd64*, linux/arm64*, linux/386
  username-builder-0-c8d85646d-w9cgr                 running linux/amd64*, linux/arm64*, linux/386
default                                     docker
  default                                   default         running linux/amd64, linux/arm64, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/arm/v7, linux/arm/v6

Thanks!

tonistiigi commented 3 years ago

Are these arm builds running through qemu? Do you have binfmt set up properly on these k8s nodes? E.g. https://github.com/tonistiigi/binfmt#installing-emulators
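One way to answer this on a given node is to look at what is registered with the kernel's binfmt_misc. The following is a minimal sketch, not an official check: the function name `list_qemu_handlers` is hypothetical, and it assumes binfmt_misc is mounted at /proc/sys/fs/binfmt_misc and that emulators are registered under names starting with `qemu-` (other installers may use different names).

```shell
# List qemu-* handlers registered with binfmt_misc.
# $1: directory to inspect (defaults to the usual mount point; assumption).
list_qemu_handlers() {
    dir="${1:-/proc/sys/fs/binfmt_misc}"
    for f in "$dir"/qemu-*; do
        # If the glob matched nothing, $f is the literal pattern; skip it.
        [ -e "$f" ] || continue
        basename "$f"
    done
}
```

Run `list_qemu_handlers` on the node itself (or in a privileged pod scheduled on it). Empty output suggests no emulators are registered there; for arm64 builds one would expect an entry such as `qemu-aarch64`.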

tonistiigi commented 3 years ago

https://github.com/docker/buildx/issues/495 looks related

matthewhembree commented 3 years ago

> Are these arm builds running through qemu. Do you have binfmt setup properly on these k8s nodes? Eg. tonistiigi/binfmt#installing-emulators

Yes, they should be. They (the arm64 builds) are not native.

I'm just wondering what needs to be done for multiarch qemu builds with the k8s driver. Maybe it's something that needs to be done in the kubernetes driver setup https://github.com/docker/buildx/tree/master/driver/kubernetes. Or does something need to be integrated into https://github.com/moby/buildkit?

Or could it be something as simple as updating the buildx README?

tonistiigi commented 3 years ago

You need to install qemu on the nodes, e.g. with the image from https://github.com/tonistiigi/binfmt#installing-emulators (distro packages usually do not work properly). In buildx ls you should see all emulated platforms, as you do for the default instance. A * means you just set the platform manually on buildx create, and it will not work unless the node is properly set up.
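As a rough illustration of the two points above: the binfmt README installs emulators with a privileged container, and the starred entries in `docker buildx ls` can be pulled out with a small filter. This is only a sketch. The function name `starred_platforms` is hypothetical, the filter assumes the column layout shown earlier in this issue, and how the install command gets run on each Kubernetes node depends on the cluster (nodes may not have a Docker CLI at all).

```shell
# Install step from the binfmt README, to be run on each Kubernetes node
# (not on the local machine); shown as a comment so nothing runs on paste:
#   docker run --privileged --rm tonistiigi/binfmt --install all

# Print each platform that `docker buildx ls` marks with a trailing "*",
# i.e. one that was set manually on `buildx create`.
starred_platforms() {
    # Split on commas, spaces and tabs, then keep tokens like "linux/arm64*".
    tr ', \t' '\n\n\n' | grep -E '^linux/[^*]+\*$' | sed 's/\*$//' | sort -u
}
```

Usage: `docker buildx ls | starred_platforms`. Any platform this prints was declared by hand rather than detected, so it is worth confirming that the node really has an emulator for it.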

@AkihiroSuda Maybe you could provide example k8s yaml for https://github.com/tonistiigi/binfmt that we could link to.

matthewhembree commented 3 years ago

It seems like this would best be done as a privileged init container in https://github.com/docker/buildx/tree/master/driver/kubernetes, since you would want this applied to /proc/sys/fs/binfmt_misc on whichever node the buildx worker pod runs on. That should work with rootless buildx pods. It would also need good documentation in the README, so that people understand the requirements. I believe it would be safe to set binfmt for all architectures, but it could take platform parameters from buildx create.
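Until something like that exists in the driver, the same idea can be approximated outside buildx. The sketch below is an assumption-heavy illustration, not what the kubernetes driver does: it prints a standalone DaemonSet whose privileged init container runs the tonistiigi/binfmt image on every node. The function name `binfmt_daemonset`, the resource names, the platform list and the pause image are all hypothetical choices.

```shell
# Print a DaemonSet manifest that registers QEMU handlers on every node.
# Hypothetical sketch: names, platform list and pause image are assumptions.
binfmt_daemonset() {
    cat <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: binfmt-install
spec:
  selector:
    matchLabels:
      app: binfmt-install
  template:
    metadata:
      labels:
        app: binfmt-install
    spec:
      initContainers:
        # Privileged so it can write to the node's binfmt_misc.
        - name: binfmt
          image: tonistiigi/binfmt
          args: ["--install", "arm64,arm"]
          securityContext:
            privileged: true
      containers:
        # Keeps the pod alive after the init container has finished.
        - name: pause
          image: registry.k8s.io/pause:3.9
EOF
}
```

Usage: `binfmt_daemonset | kubectl apply -f -`. Because binfmt_misc registrations are kernel-wide, the unprivileged (or rootless) buildx worker pods on the same node should then be able to use the emulators, which is the property the init-container proposal relies on.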