docker / buildx

Docker CLI plugin for extended build capabilities with BuildKit
Apache License 2.0

segmentation fault cross-building target=amd64 host=arm64 #2028

Open MikeSpreitzer opened 10 months ago

MikeSpreitzer commented 10 months ago

Description

I have an Apple silicon (M2) laptop running macOS 13.5.1. On it I installed UTM 4.3.5. I made a VM running Ubuntu 22.04.3 as the guest. In that guest I installed Docker Engine 24.0.5. In that guest, docker buildx build --platform linux/arm64,linux/ppc64le succeeds, but asking for linux/amd64 makes the Go compiler crash with a segmentation fault.

Expected behaviour

Cross-platform builds work.

Actual behaviour

The Go compiler crashes multiple times with a segmentation fault.

Buildx version

github.com/docker/buildx v0.11.2 9872040

Docker info

Client: Docker Engine - Community
 Version:    24.0.5
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.2
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.20.2
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 3
  Running: 2
  Paused: 0
  Stopped: 1
 Images: 6
 Server Version: 24.0.5
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 8165feabfdfe38c65b599c4993d227328c231fca
 runc version: v1.1.8-0-g82f18fe
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.0-82-generic
 Operating System: Ubuntu 22.04.3 LTS
 OSType: linux
 Architecture: aarch64
 CPUs: 6
 Total Memory: 15.59GiB
 Name: ubu22b
 ID: a0e3aa8f-77da-4014-8f91-f112c4b84b14
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: mspreitz
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Builders list

NAME/NODE      DRIVER/ENDPOINT             STATUS  BUILDKIT             PLATFORMS
kubestellar    docker-container                                         
  kubestellar0 unix:///var/run/docker.sock running v0.12.1              linux/arm64
nice_allen *   docker-container                                         
  nice_allen0  unix:///var/run/docker.sock running v0.12.1              linux/arm64
default        docker                                                   
  default      default                     running v0.11.6+0a15675913b7 linux/arm64

Configuration

FROM golang:1.20 as builder

RUN mkdir -p /go/src/github.com/openshift && \
    cd /go/src/github.com/openshift && \
    git clone https://github.com/openshift/eventrouter

RUN cd /go/src/github.com/openshift/eventrouter && go mod tidy && CGO_ENABLED=0 go build .

FROM alpine:3.14 as run
WORKDIR /app
RUN apk update --no-cache && apk add ca-certificates
COPY --from=builder /go/src/github.com/openshift/eventrouter/eventrouter /app/eventrouter
USER nobody:nobody
CMD ["/bin/sh", "-c", "/app/eventrouter -v 3 -logtostderr"]

I built with the following command.

docker buildx build -t quay.io/mspreitz/eventrouter:latest --platform linux/amd64,linux/arm64,linux/ppc64le --push .

Build logs

That buildx command produced the following output.

[+] Building 231.5s (27/29)                                                 docker-container:nice_allen
 => [internal] load build definition from Dockerfile                                               0.0s
 => => transferring dockerfile: 738B                                                               0.0s
 => [linux/ppc64le internal] load metadata for docker.io/library/alpine:3.14                       0.2s
 => [linux/amd64 internal] load metadata for docker.io/library/alpine:3.14                         0.2s
 => [linux/amd64 internal] load metadata for docker.io/library/golang:1.20                         0.4s
 => [linux/ppc64le internal] load metadata for docker.io/library/golang:1.20                       0.2s
 => [linux/arm64 internal] load metadata for docker.io/library/alpine:3.14                         0.2s
 => [linux/arm64 internal] load metadata for docker.io/library/golang:1.20                         0.4s
 => [internal] load .dockerignore                                                                  0.0s
 => => transferring context: 2B                                                                    0.0s
 => CACHED [linux/ppc64le builder 1/3] FROM docker.io/library/golang:1.20@sha256:741d6f9bcab77844  0.0s
 => => resolve docker.io/library/golang:1.20@sha256:741d6f9bcab778441efe05c8e4369d4f8ff56c9a635a9  0.0s
 => [linux/ppc64le run 1/4] FROM docker.io/library/alpine:3.14@sha256:0f2d5c38dd7a4f4f733e688e3a6  0.0s
 => => resolve docker.io/library/alpine:3.14@sha256:0f2d5c38dd7a4f4f733e688e3a6733cb5ab1ac6e3cb46  0.0s
 => CACHED [linux/amd64 builder 1/3] FROM docker.io/library/golang:1.20@sha256:741d6f9bcab778441e  0.0s
 => => resolve docker.io/library/golang:1.20@sha256:741d6f9bcab778441efe05c8e4369d4f8ff56c9a635a9  0.0s
 => [linux/arm64 run 1/4] FROM docker.io/library/alpine:3.14@sha256:0f2d5c38dd7a4f4f733e688e3a673  0.0s
 => => resolve docker.io/library/alpine:3.14@sha256:0f2d5c38dd7a4f4f733e688e3a6733cb5ab1ac6e3cb46  0.0s
 => CACHED [linux/arm64 builder 1/3] FROM docker.io/library/golang:1.20@sha256:741d6f9bcab778441e  0.0s
 => => resolve docker.io/library/golang:1.20@sha256:741d6f9bcab778441efe05c8e4369d4f8ff56c9a635a9  0.0s
 => [linux/amd64 run 1/4] FROM docker.io/library/alpine:3.14@sha256:0f2d5c38dd7a4f4f733e688e3a673  0.0s
 => => resolve docker.io/library/alpine:3.14@sha256:0f2d5c38dd7a4f4f733e688e3a6733cb5ab1ac6e3cb46  0.0s
 => CACHED [linux/arm64 run 2/4] WORKDIR /app                                                      0.0s
 => CACHED [linux/arm64 run 3/4] RUN apk update --no-cache && apk add ca-certificates              0.0s
 => [linux/ppc64le builder 2/3] RUN mkdir -p /go/src/github.com/openshift &&     cd /go/src/gith  11.4s
 => CACHED [linux/ppc64le run 2/4] WORKDIR /app                                                    0.0s
 => CACHED [linux/ppc64le run 3/4] RUN apk update --no-cache && apk add ca-certificates            0.0s
 => [linux/arm64 builder 2/3] RUN mkdir -p /go/src/github.com/openshift &&     cd /go/src/github  11.7s
 => CACHED [linux/amd64 run 2/4] WORKDIR /app                                                      0.0s
 => CACHED [linux/amd64 run 3/4] RUN apk update --no-cache && apk add ca-certificates              0.0s
 => [linux/amd64 builder 2/3] RUN mkdir -p /go/src/github.com/openshift &&     cd /go/src/github  10.7s
 => ERROR [linux/amd64 builder 3/3] RUN cd /go/src/github.com/openshift/eventrouter && go mod t  220.2s
 => CANCELED [linux/ppc64le builder 3/3] RUN cd /go/src/github.com/openshift/eventrouter && go   219.6s
 => [linux/arm64 builder 3/3] RUN cd /go/src/github.com/openshift/eventrouter && go mod tidy &&   64.1s
 => [linux/arm64 run 4/4] COPY --from=builder /go/src/github.com/openshift/eventrouter/eventroute  0.1s
------
 > [linux/amd64 builder 3/3] RUN cd /go/src/github.com/openshift/eventrouter && go mod tidy && CGO_ENABLED=0 go build .:
0.179 go: downloading github.com/crewjam/rfc5424 v0.0.0-20180723152949-c25bdd3a0ba2
0.189 go: downloading github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b
0.240 go: downloading github.com/prometheus/client_golang v1.12.1
0.241 go: downloading k8s.io/api v0.27.1
0.242 go: downloading github.com/spf13/viper v1.4.0
0.559 go: downloading k8s.io/apimachinery v0.27.1
0.848 go: downloading k8s.io/client-go v0.27.1
0.925 go: downloading github.com/Shopify/sarama v1.23.1
0.927 go: downloading github.com/eapache/channels v1.1.0
0.960 go: downloading github.com/sethgrid/pester v0.0.0-20190127155807-68a33a018ad0
2.896 go: downloading github.com/prometheus/client_model v0.2.0
2.905 go: downloading github.com/prometheus/common v0.32.1
2.919 go: downloading github.com/beorn7/perks v1.0.1
2.922 go: downloading github.com/cespare/xxhash/v2 v2.1.2
2.957 go: downloading github.com/golang/protobuf v1.5.3
3.023 go: downloading github.com/prometheus/procfs v0.7.3
3.038 go: downloading golang.org/x/sys v0.6.0
3.096 go: downloading google.golang.org/protobuf v1.28.1
3.159 go: downloading github.com/fsnotify/fsnotify v1.4.9
3.166 go: downloading github.com/hashicorp/hcl v1.0.0
3.451 go: downloading github.com/magiconair/properties v1.8.0
3.499 go: downloading github.com/mitchellh/mapstructure v1.1.2
3.500 go: downloading github.com/pelletier/go-toml v1.2.0
3.557 go: downloading github.com/spf13/afero v1.2.2
3.578 go: downloading github.com/spf13/cast v1.3.0
3.616 go: downloading github.com/spf13/jwalterweatherman v1.0.0
3.634 go: downloading github.com/spf13/pflag v1.0.5
3.649 go: downloading gopkg.in/yaml.v2 v2.4.0
3.759 go: downloading github.com/eapache/queue v1.1.0
3.833 go: downloading k8s.io/klog/v2 v2.90.1
3.949 go: downloading github.com/DataDog/zstd v1.3.6-0.20190409195224-796139022798
4.124 go: downloading github.com/davecgh/go-spew v1.1.1
4.176 go: downloading github.com/eapache/go-resiliency v1.1.0
4.238 go: downloading github.com/eapache/go-xerial-snappy v0.0.0-20180814174437-776d5712da21
4.275 go: downloading github.com/jcmturner/gofork v0.0.0-20190328161633-dc7c13fece03
4.321 go: downloading github.com/pierrec/lz4 v0.0.0-20190327172049-315a67e90e41
4.348 go: downloading github.com/rcrowley/go-metrics v0.0.0-20181016184325-3113b8401b8a
4.396 go: downloading golang.org/x/net v0.8.0
4.435 go: downloading gopkg.in/jcmturner/gokrb5.v7 v7.2.3
9.354 go: downloading gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c
9.354 go: downloading github.com/stretchr/testify v1.8.1
23.32 go: downloading github.com/Shopify/toxiproxy v2.1.4+incompatible
23.32 go: downloading github.com/gogo/protobuf v1.3.2
23.32 go: downloading github.com/imdario/mergo v0.3.7
23.32 go: downloading golang.org/x/term v0.6.0
23.33 go: downloading sigs.k8s.io/yaml v1.3.0
23.39 go: downloading k8s.io/utils v0.0.0-20230209194617-a36077c30491
23.40 go: downloading github.com/google/go-cmp v0.5.9
23.72 go: downloading github.com/google/gofuzz v1.1.0
23.76 go: downloading github.com/matttproud/golang_protobuf_extensions v1.0.1
23.78 go: downloading github.com/BurntSushi/toml v0.3.1
23.79 go: downloading golang.org/x/text v0.8.0
23.80 go: downloading github.com/go-logr/logr v1.2.3
23.83 go: downloading github.com/golang/snappy v0.0.1
24.03 go: downloading github.com/kr/pretty v0.3.0
24.03 go: downloading github.com/pmezard/go-difflib v1.0.0
24.58 go: downloading gopkg.in/yaml.v3 v3.0.1
24.58 go: downloading github.com/hashicorp/go-uuid v1.0.1
24.71 go: downloading gopkg.in/jcmturner/dnsutils.v1 v1.0.1
24.84 go: downloading gopkg.in/jcmturner/goidentity.v3 v3.0.0
25.21 go: downloading gopkg.in/inf.v0 v0.9.1
25.29 go: downloading sigs.k8s.io/structured-merge-diff/v4 v4.2.3
25.47 go: downloading github.com/google/gnostic v0.5.7-v3refs
25.49 go: downloading k8s.io/kube-openapi v0.0.0-20230308215209-15aac26d736a
25.69 go: downloading github.com/google/uuid v1.3.0
26.18 go: downloading golang.org/x/time v0.0.0-20220210224613-90d013bbcef8
26.18 go: downloading golang.org/x/oauth2 v0.0.0-20220411215720-9780585627b5
26.66 go: downloading sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd
26.85 go: downloading github.com/kr/text v0.2.0
26.85 go: downloading github.com/rogpeppe/go-internal v1.10.0
26.89 go: downloading golang.org/x/crypto v0.0.0-20220315160706-3147a52a75dd
27.01 go: downloading gopkg.in/jcmturner/rpc.v1 v1.1.0
27.18 go: downloading github.com/json-iterator/go v1.1.12
27.59 go: downloading gopkg.in/jcmturner/aescts.v1 v1.0.1
27.59 go: downloading github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd
27.61 go: downloading github.com/modern-go/reflect2 v1.0.2
27.62 go: downloading github.com/go-openapi/swag v0.22.3
27.64 go: downloading github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822
27.66 go: downloading google.golang.org/appengine v1.6.7
27.67 go: downloading github.com/onsi/ginkgo/v2 v2.9.1
27.89 go: downloading github.com/onsi/gomega v1.27.4
27.99 go: downloading github.com/go-openapi/jsonreference v0.20.1
27.99 go: downloading github.com/emicklei/go-restful/v3 v3.9.0
27.99 go: downloading github.com/mailru/easyjson v0.7.7
28.02 go: downloading github.com/go-openapi/jsonpointer v0.19.6
28.07 go: downloading github.com/josharian/intern v1.0.0
28.15 go: downloading github.com/go-task/slim-sprig v0.0.0-20210107165309-348f09dbbbc0
28.15 go: downloading golang.org/x/tools v0.7.0
28.15 go: downloading github.com/google/pprof v0.0.0-20210720184732-4bb14d4b1be1
48.30 runtime/metrics: /usr/local/go/pkg/tool/linux_amd64/compile: signal: segmentation fault (core dumped)
135.3 k8s.io/apimachinery/pkg/util/yaml: /usr/local/go/pkg/tool/linux_amd64/compile: signal: segmentation fault (core dumped)
------
Dockerfile:10
--------------------
   8 |         git clone https://github.com/openshift/eventrouter
   9 |     
  10 | >>> RUN cd /go/src/github.com/openshift/eventrouter && go mod tidy && CGO_ENABLED=0 go build .
  11 |     
  12 |     FROM alpine:3.14 as run
--------------------
ERROR: failed to solve: process "/dev/.buildkit_qemu_emulator /bin/sh -c cd /go/src/github.com/openshift/eventrouter && go mod tidy && CGO_ENABLED=0 go build ." did not complete successfully: exit code: 1

Additional info

This is the same symptom as issue #314 and my Dockerfile here is an updated version of @carlosedp's one in that issue.

MikeSpreitzer commented 10 months ago

I also tried @benchonaut's Dockerfile from the opening comment of #314, but now that Dockerfile looks buggy: dpkg complains that "package libc-bin is already installed and configured" (how did that ever work?).

MikeSpreitzer commented 10 months ago

https://github.com/utmapp/UTM/releases/tag/v4.3.5 says, in the Changes section, "Rollback QEMU version to 7.2.0".

tonistiigi commented 9 months ago

Looks like a qemu error. You can try running the same process through strace in the docker container and it may show more precisely where it is breaking.

MikeSpreitzer commented 9 months ago

I do not think I can use strace inside a Dockerfile when building for another platform while inside a VM. It looks like QEMU does not implement ptrace.

mspreitz@ubu22b:~/test4$ cat Dockerfile 
# Build with:
# docker buildx build -t carlosedp/eventrouter:latest --platform linux/amd64,linux/arm,linux/arm64,linux/arm,linux/ppc64le  -f Dockerfile-eventrouter --push .
#
FROM golang:1.21.1 as builder

RUN apt update && apt install -y strace
RUN mkdir -p /go/src/github.com/openshift && \
    cd /go/src/github.com/openshift && \
    git clone https://github.com/openshift/eventrouter

RUN strace -f -tt date

RUN cd /go/src/github.com/openshift/eventrouter && go mod tidy && CGO_ENABLED=0 strace -f -tt go build .

FROM alpine:3.14 as run
WORKDIR /app
RUN apk update --no-cache && apk add ca-certificates
COPY --from=builder /go/src/github.com/openshift/eventrouter/eventrouter /app/eventrouter
USER nobody:nobody
CMD ["/bin/sh", "-c", "/app/eventrouter -v 3 -logtostderr"]

mspreitz@ubu22b:~/test4$ docker buildx build --security-opt seccomp=unconfined -t quay.io/mspreitz/eventrouter:latest --platform linux/amd64 --push .
WARNING: security-opt flag is deprecated. "RUN --security=insecure" should be used with BuildKit.
[+] Building 0.5s (13/15)                           docker-container:nice_allen
 => [internal] load build definition from Dockerfile                       0.0s
 => => transferring dockerfile: 818B                                       0.0s
 => [internal] load metadata for docker.io/library/alpine:3.14             0.4s
 => [internal] load metadata for docker.io/library/golang:1.21.1           0.4s
 => [auth] library/alpine:pull token for registry-1.docker.io              0.0s
 => [auth] library/golang:pull token for registry-1.docker.io              0.0s
 => [internal] load .dockerignore                                          0.0s
 => => transferring context: 2B                                            0.0s
 => [builder 1/5] FROM docker.io/library/golang:1.21.1@sha256:cffaba795c3  0.0s
 => => resolve docker.io/library/golang:1.21.1@sha256:cffaba795c36f07e372  0.0s
 => [run 1/4] FROM docker.io/library/alpine:3.14@sha256:0f2d5c38dd7a4f4f7  0.0s
 => => resolve docker.io/library/alpine:3.14@sha256:0f2d5c38dd7a4f4f733e6  0.0s
 => CACHED [builder 2/5] RUN apt update && apt install -y strace           0.0s
 => CACHED [builder 3/5] RUN mkdir -p /go/src/github.com/openshift &&      0.0s
 => ERROR [builder 4/5] RUN strace -f -tt date                             0.1s
 => CACHED [run 2/4] WORKDIR /app                                          0.0s
 => CACHED [run 3/4] RUN apk update --no-cache && apk add ca-certificates  0.0s
------
 > [builder 4/5] RUN strace -f -tt date:
0.081 strace: test_ptrace_get_syscall_info: PTRACE_TRACEME: Function not implemented
0.084 strace: ptrace(PTRACE_TRACEME, ...): Function not implemented
0.087 strace: PTRACE_SETOPTIONS: Function not implemented
0.087 strace: detach: waitpid(14): No child processes
0.087 strace: Process 14 detached
------
Dockerfile:11
--------------------
   9 |         git clone https://github.com/openshift/eventrouter
  10 |     
  11 | >>> RUN strace -f -tt date
  12 |     
  13 |     RUN cd /go/src/github.com/openshift/eventrouter && go mod tidy && CGO_ENABLED=0 strace -f -tt go build .
--------------------
ERROR: failed to solve: process "/dev/.buildkit_qemu_emulator /bin/sh -c strace -f -tt date" did not complete successfully: exit code: 1
mspreitz@ubu22b:~/test4$

francostellari commented 9 months ago

@MikeSpreitzer my buildx also fails on a JetsonNano arm64

MikeSpreitzer commented 9 months ago

@francostellari: so, to be clear, the Linux that you are using is installed directly on your Jetson Nano hardware, so the QEMU that I am using to run Linux on my Mac is not involved in your scenario?

MikeSpreitzer commented 9 months ago

@tonistiigi : when you said it looks like a QEMU error, were you thinking of the QEMU that I am using to run Linux on my Mac, or is there another QEMU here somewhere?

francostellari commented 9 months ago

@MikeSpreitzer yes, Ubuntu arm64 is the host OS, running directly on the Jetson Nano. Docker is failing to build amd64.

MikeSpreitzer commented 9 months ago

@tonistiigi: is there a QEMU involved in @francostellari 's scenario? Should he apply strace somewhere to help debug?

MikeSpreitzer commented 9 months ago

I did a little RTFM and found https://docs.docker.com/build/building/multi-platform/#qemu , so yes, I think there is a QEMU in @francostellari 's scenario.

MikeSpreitzer commented 9 months ago

@tonistiigi: how could @francostellari get more than 2MB of log from the docker builder if he applies strace to the go build command?
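
One possible way around the 2MB clipping, sketched here under assumptions: BuildKit reads the environment variables BUILDKIT_STEP_LOG_MAX_SIZE and BUILDKIT_STEP_LOG_MAX_SPEED, and the docker-container driver can pass them with --driver-opt env.<key>=<value>. The function name, the builder name, the 10 MiB default, and the DOCKER override (only there to allow a dry run) are all made up for illustration. Whether the BuildKit version in use honors these variables should be checked before relying on this.

```sh
# Create a docker-container builder whose BuildKit keeps more log output per step.
# $1: builder name; $2: limit in bytes (default 10 MiB, an arbitrary choice).
# DOCKER is a hypothetical override: DOCKER=echo prints the command instead of running it.
create_verbose_builder() {
  name=$1
  limit=${2:-10485760}
  ${DOCKER:-docker} buildx create --name "$name" \
    --driver docker-container \
    --driver-opt "env.BUILDKIT_STEP_LOG_MAX_SIZE=$limit" \
    --driver-opt "env.BUILDKIT_STEP_LOG_MAX_SPEED=$limit"
}
```

The settings apply to the builder's BuildKit container, so a fresh builder is created rather than reusing an existing one; the build would then be run with --builder pointing at the new name.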

@francostellari: Here is a Dockerfile that applies strace to the go build:

FROM golang:1.21.1 as builder

RUN apt update -y && apt install -y strace

RUN mkdir -p /go/src/github.com/openshift && \
    cd /go/src/github.com/openshift && \
    git clone https://github.com/openshift/eventrouter

RUN cd /go/src/github.com/openshift/eventrouter && go mod tidy && CGO_ENABLED=0 strace -f -tt go build .

Here is how I created a builder that I used to test:

docker buildx create --name test4 --platform linux/arm64,linux/amd64,linux/s390x

Here I am examining it:

mspreitz@ubu22b:~/test4$ docker buildx ls
NAME/NODE DRIVER/ENDPOINT             STATUS  BUILDKIT             PLATFORMS
test4     docker-container                                         
  test40  unix:///var/run/docker.sock running v0.12.2              linux/arm64*, linux/amd64*, linux/s390x*
default * docker                                                   
  default default                     running v0.11.6+616c3f613b54 linux/arm64

Here is a command that you could try, but it will only get you the first 2MB of log from the build.

docker buildx --builder test4 build -t quay.io/francostellari/test4:latest --platform linux/amd64 --push .

francostellari commented 9 months ago

@MikeSpreitzer running Docker on a host OS of Ubuntu arm64 (Ubuntu 20.04.5 LTS aarch64), buildx cross-platform builds (specifically for linux/amd64) succeed or fail depending on the base image.

Given docker buildx create --name builder, the command docker buildx build --platform "linux/amd64" --no-cache -t test .

Fails on RUN for Dockerfile:

FROM docker.io/redhat/ubi9
RUN ls

Passes for Dockerfile:

FROM docker.io/golang:1.21.1
RUN ls

or Dockerfile:

FROM docker.io/ubuntu
RUN ls

mheese commented 9 months ago

Just ran into this on Arch Linux today. My docker/buildkit configuration is the following:

$ docker version
Client:
 Version:           24.0.5
 API version:       1.43
 Go version:        go1.20.6
 Git commit:        ced0996600
 Built:             Wed Jul 26 21:44:58 2023
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Engine:
  Version:          24.0.5
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.6
  Git commit:       a61e2b4c9c
  Built:            Wed Jul 26 21:44:58 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.7.6
  GitCommit:        091922f03c2762540fd057fba91260237ff86acb.m
 runc:
  Version:          1.1.9
  GitCommit:        
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

$ docker buildx version
github.com/docker/buildx 0.11.2 9872040b6626fb7d87ef7296fd5b832e8cc2ad17

$ docker buildx ls
NAME/NODE DRIVER/ENDPOINT STATUS  BUILDKIT             PLATFORMS
default * docker                                       
  default default         running v0.11.6+0a15675913b7 linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/arm64, linux/riscv64, linux/ppc64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6

The base image that I was working with was ubuntu:22.04 (sha256:a85dde69d848ee95ed04866aa225cbc435845abf14b1dbb42cdbe55fc72dc137) for arm64.

I tried moving both the docker and buildx versions up and down, but that did not resolve anything. I then downgraded the following two qemu packages, which are used by buildx, from 8.1.1 to 8.0.0, and then things started working again.

qemu-user-static-8.0.0-1-x86_64.pkg.tar.zst
qemu-user-static-binfmt-8.0.0-1-x86_64.pkg.tar.zst

I hope that helps.

MikeSpreitzer commented 9 months ago

@francostellari: you have gotten side-tracked onto a different issue. It would be clearer if you showed what goes wrong, but there are a couple of sharp edges in the neighborhood. One seems to be about getting docker buildx build to use QEMU with the right ISA and a base image for the right ISA. Sometimes I seem to need to explicitly prepare a builder for my intended platforms, and sometimes I do not. I tried to avoid that quagmire by explicitly preparing a builder for my postings in this issue. The other is that RedHat's ubi9 x86 images have upped their ISA requirement so that the baseline is no longer good enough; the bundle commonly known as "x86-64-v2" is good enough.

You will note that in my report above I showed how I created the builder involved and showed a docker buildx ls listing. I explicitly requested that particular builder in my docker buildx --builder $THERIGHTONE build command. If I recall correctly, if you omit --builder then the one that will be used depends on some state that you did not display; docker buildx ls would display that state.
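
As a small illustration of reading that state, here is a sketch that picks the selected builder out of a docker buildx ls listing. It assumes the listing format shown in this thread (buildx v0.11.x), where builder rows start in the first column and the selected one carries a separate "*" field; other buildx versions may print the marker differently. The function name is made up.

```sh
# Read `docker buildx ls` output on stdin and print the builder marked "*".
# Builder rows start in column 1; node rows are indented and are skipped.
current_builder() {
  awk 'NR > 1 && $0 !~ /^[[:space:]]/ && $2 == "*" { print $1; exit }'
}
```

It would be used as docker buildx ls | current_builder. Running docker buildx inspect without a builder name should also describe the currently selected builder.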

To succeed with redhat/ubi9 you need to build for the Docker "platform" linux/amd64/v2 (or higher).
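
For reference, the x86-64-v2 level adds CMPXCHG16B, LAHF/SAHF, POPCNT, SSE3, SSE4.1, SSE4.2 and SSSE3 on top of the baseline. The sketch below checks a list of CPU flags for the corresponding /proc/cpuinfo names. It is only meaningful for flags taken from a real x86-64 Linux machine; it says nothing about which features an emulated CPU under QEMU provides. The function name is made up.

```sh
# Return 0 if the space-separated CPU flag list in $1 contains every
# /proc/cpuinfo flag that corresponds to the x86-64-v2 feature level.
has_x86_64_v2() {
  flags=" $1 "
  for flag in cx16 lahf_lm popcnt pni sse4_1 sse4_2 ssse3; do
    case "$flags" in
      *" $flag "*) ;;        # flag present, keep checking
      *) return 1 ;;         # flag missing, not a v2-capable CPU
    esac
  done
  return 0
}
```

On an x86-64 Linux host it could be called as has_x86_64_v2 "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)" && echo "x86-64-v2 ok".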

Once you get past that, the question in this issue is why the go compiler crashes when RUN in the Dockerfile when doing a cross-platform build, building for Intel while the physical hardware is ARM.

MikeSpreitzer commented 9 months ago

@mheese : given the confusion just introduced here, I am not sure which issue you meant when you said "Just ran into this".

mheese commented 9 months ago

@MikeSpreitzer I might not have read this thread carefully enough, but it helped me solve my issue, so I wrote something that I thought would be helpful. I'm just realizing that my answer is off-track, so I apologize :)

In a nutshell: I also ran into a segmentation fault with the Go compiler (and other tools) while cross-compiling.

However, my host is amd64, the target arm64. As you can see above I'm not preparing a specific builder, but I'm just using qemu-static on my host system. I noticed that I had to downgrade said qemu-static to 8.0.0 from 8.1.1 in order for my Ubuntu 22.04 docker containers not to segfault.
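
On a host set up this way, the emulator binary that the kernel will launch for each foreign architecture is recorded under binfmt_misc. A minimal sketch for listing those registrations follows; it assumes the entries are named qemu-* and live in the usual /proc/sys/fs/binfmt_misc directory. The function name is made up.

```sh
# Print "<entry> <interpreter path>" for each qemu-* binfmt_misc registration.
# $1 is the binfmt_misc directory (defaults to the usual mount point).
list_qemu_interpreters() {
  dir=${1:-/proc/sys/fs/binfmt_misc}
  for entry in "$dir"/qemu-*; do
    [ -f "$entry" ] || continue   # the glob matched nothing
    awk -v name="${entry##*/}" '$1 == "interpreter" { print name, $2 }' "$entry"
  done
}
```

Each listed interpreter can then be asked for its version, for example /usr/bin/qemu-x86_64 --version, which is one way to see which QEMU release the distro package installed.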

If it's worth something to anybody, I'm happy to run some other commands or try some other combinations, but it looks like I'm off track anyway. Either way, happy to help if I can.

MikeSpreitzer commented 9 months ago

@mheese: thanks for the clarification. BTW, how does one control the QEMU version that gets used by docker buildx ?

MikeSpreitzer commented 9 months ago

FYI, regarding the image vs. ISA problem, the following is what I see when I use docker buildx inside my UTM VM on my Apple Silicon MacBook, guest OS Ubuntu 22.04.3. I am testing in a directory that holds nothing but the first Dockerfile from Franco's latest comment above (https://github.com/docker/buildx/issues/2028#issuecomment-1733724300) .

mspreitz@ubu22b:~/test-fs1$ docker buildx ls
NAME/NODE      DRIVER/ENDPOINT             STATUS  BUILDKIT             PLATFORMS
...
default *      docker                                                   
  default      default                     running v0.11.6+616c3f613b54 linux/arm64

mspreitz@ubu22b:~/test-fs1$ docker buildx build -t test --platform linux/amd64 .
[+] Building 10.8s (5/5) FINISHED                                                  docker:default
 => [internal] load build definition from Dockerfile                                         0.0s
 => => transferring dockerfile: 71B                                                          0.0s
 => [internal] load .dockerignore                                                            0.0s
 => => transferring context: 2B                                                              0.0s
 => [internal] load metadata for docker.io/redhat/ubi9:latest                                0.4s
 => [1/2] FROM docker.io/redhat/ubi9@sha256:351ed8b24d440c348486efd99587046e88bb966890a920  10.0s
 => => resolve docker.io/redhat/ubi9@sha256:351ed8b24d440c348486efd99587046e88bb966890a9207  0.0s
 => => sha256:351ed8b24d440c348486efd99587046e88bb966890a9207a5851d3a34a4dd 1.47kB / 1.47kB  0.0s
 => => sha256:bd30f546dfb78ef0fb7789376afd22671319007af473f03370dafab34302c857 429B / 429B   0.0s
 => => sha256:9f43f297e77bc6937de12d4e90ed5dc679bd3b9c7a481068d2a840f5244d4 6.42kB / 6.42kB  0.0s
 => => sha256:3b7adf049118244599c2f433c32bb40ea46462b457d9ca01ab066462c5f 78.05MB / 78.05MB  5.6s
 => => extracting sha256:3b7adf049118244599c2f433c32bb40ea46462b457d9ca01ab066462c5f38561    4.3s
 => ERROR [2/2] RUN ls                                                                       0.3s
------                                                                                            
 > [2/2] RUN ls:
0.152 exec /bin/sh: exec format error
------
Dockerfile:2
--------------------
   1 |     FROM docker.io/redhat/ubi9
   2 | >>> RUN ls
   3 |     
--------------------
ERROR: failed to solve: process "/bin/sh -c ls" did not complete successfully: exit code: 1

I tried adding /v2 to the build command, along with --no-cache, but it hit the cache anyway:

mspreitz@ubu22b:~/test-fs1$ docker buildx build -t test --platform linux/amd64/v2 --no-cache .
[+] Building 0.4s (5/5) FINISHED                                                   docker:default
 => [internal] load .dockerignore                                                            0.0s
 => => transferring context: 2B                                                              0.0s
 => [internal] load build definition from Dockerfile                                         0.0s
 => => transferring dockerfile: 71B                                                          0.0s
 => [internal] load metadata for docker.io/redhat/ubi9:latest                                0.2s
 => CACHED [1/2] FROM docker.io/redhat/ubi9@sha256:351ed8b24d440c348486efd99587046e88bb9668  0.0s
 => => resolve docker.io/redhat/ubi9@sha256:351ed8b24d440c348486efd99587046e88bb966890a9207  0.0s
 => ERROR [2/2] RUN ls                                                                       0.2s
------
 > [2/2] RUN ls:
0.157 exec /bin/sh: exec format error
------
Dockerfile:2
--------------------
   1 |     FROM docker.io/redhat/ubi9
   2 | >>> RUN ls
   3 |     
--------------------
ERROR: failed to solve: process "/bin/sh -c ls" did not complete successfully: exit code: 1

So I made a new builder, and put /v2 in the new builder's platforms, and voila!

mspreitz@ubu22b:~/test-fs1$ docker buildx create --name franco --platform linux/amd64/v2
franco

mspreitz@ubu22b:~/test-fs1$ docker buildx --builder franco build -t test --platform linux/amd64/v2 --no-cache .
[+] Building 9.4s (7/7) FINISHED                                          docker-container:franco
 => [internal] booting buildkit                                                              1.2s
 => => pulling image moby/buildkit:buildx-stable-1                                           0.8s
 => => creating container buildx_buildkit_franco0                                            0.5s
 => [internal] load build definition from Dockerfile                                         0.0s
 => => transferring dockerfile: 71B                                                          0.0s
 => [internal] load metadata for docker.io/redhat/ubi9:latest                                0.7s
 => [auth] redhat/ubi9:pull token for registry-1.docker.io                                   0.0s
 => [internal] load .dockerignore                                                            0.0s
 => => transferring context: 2B                                                              0.0s
 => [1/2] FROM docker.io/redhat/ubi9:latest@sha256:351ed8b24d440c348486efd99587046e88bb9668  7.0s
 => => resolve docker.io/redhat/ubi9:latest@sha256:351ed8b24d440c348486efd99587046e88bb9668  0.0s
 => => sha256:3b7adf049118244599c2f433c32bb40ea46462b457d9ca01ab066462c5f 78.05MB / 78.05MB  5.6s
 => => extracting sha256:3b7adf049118244599c2f433c32bb40ea46462b457d9ca01ab066462c5f38561    1.4s
 => [2/2] RUN ls                                                                             0.2s
WARNING: No output specified with docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load  

But doing the same thing for the Dockerfile I care about does not get past the go compiler crashing.

francostellari commented 9 months ago

@MikeSpreitzer I followed your suggestion and adding /v2 to the buildx build worked from arm64. I do not need to specify the --platform in the buildx create command. It's enough to add /v2 to the buildx build --platform command. I have updated https://github.com/francostellari/kubestellar/blob/user/user/container/Makefile#L18 for https://github.com/kubestellar/kubestellar/pull/1037

MikeSpreitzer commented 9 months ago

@francostellari: thank you. Can you please display a success for us, including Dockerfile, docker buildx ls, and the docker buildx build command and output?

MikeSpreitzer commented 9 months ago

@tonistiigi , @mheese : how would I query or control which QEMU versions get used?

I just did the following inside my Ubuntu VM on MacOS, but it is not that specific.

mspreitz@ubu22b:~$ docker run --privileged --rm tonistiigi/binfmt
Unable to find image 'tonistiigi/binfmt:latest' locally
latest: Pulling from tonistiigi/binfmt
6dda554f4baf: Pull complete 
2b0720d7a501: Pull complete 
Digest: sha256:66e11bea77a5ea9d6f0fe79b57cd2b189b5d15b93a2bdb925be22949232e4e55
Status: Downloaded newer image for tonistiigi/binfmt:latest
{
  "supported": [
    "linux/arm64"
  ],
  "emulators": [
    "python3.10"
  ]
}

@francostellari : what does the above get you?

I tried adding amd64/v2, but no luck:

mspreitz@ubu22b:~$ docker run --privileged --rm tonistiigi/binfmt --install amd64/v2
installing: v2 unsupported architecture: v2
{
  "supported": [
    "linux/arm64"
  ],
  "emulators": [
    "python3.10"
  ]
}

MikeSpreitzer commented 9 months ago

Some more clues from inside my Ubuntu VM on my MacBook are below; @francostellari what do you see? @tonistiigi, what should I see in order to succeed?

root@ubu22b:/proc/sys/fs# ls -l binfmt_misc/
total 0
-rw-r--r-- 1 root root 0 Sep 26 16:29 python3.10
-rw-r--r-- 1 root root 0 Sep 26 16:36 qemu-x86_64
--w------- 1 root root 0 Sep 26 16:29 register
-rw-r--r-- 1 root root 0 Sep 26 16:29 status

root@ubu22b:/proc/sys/fs# cat binfmt_misc/status
enabled
root@ubu22b:/proc/sys/fs# cat binfmt_misc/qemu-x86_64 
enabled
interpreter /usr/bin/qemu-x86_64
flags: POCF
offset 0
magic 7f454c4602010100000000000000000002003e00
mask fffffffffffefe00fffffffffffffffffeffffff
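
For comparing entries like the one above across machines, a small sketch that reduces a binfmt_misc entry to its interpreter and flags may help. The function name binfmt_summary is made up for illustration; it just reads the entry text on stdin.

```shell
# Summarise a binfmt_misc entry (read on stdin) as "interpreter=... flags=...".
binfmt_summary() {
  awk '$1 == "interpreter" { interp = $2 }
       $1 == "flags:"      { flags = $2 }
       END { printf "interpreter=%s flags=%s\n", interp, flags }'
}
```

Usage would be binfmt_summary < /proc/sys/fs/binfmt_misc/qemu-x86_64. Note that with the F flag the kernel opens the interpreter when the entry is registered, so the file currently at that path (if any) is not necessarily the binary that runs.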

BTW, while my docker buildx build is running the go compiler that will eventually crash, my Ubuntu ps axlwww reports lines like the following.

4     0    7173    7166  20   0 1406248 78628 -     Sl   ?          0:03 /usr/bin/qemu-x86_64 /usr/local/go/bin/go go build .
0     0    8465    7173  20   0 1410912 95740 -     Rl   ?          0:00 /usr/bin/qemu-x86_64 /usr/local/go/pkg/tool/linux_amd64/compile /usr/local/go/pkg/tool/linux_amd64/compile -o /tmp/go-build2862751252/b035/_pkg_.a -trimpath /tmp/go-build2862751252/b035=> -p os -std -buildid TrC-6nR2tjEuD16bEyDK/TrC-6nR2tjEuD16bEyDK -goversion go1.21.1 ...

MikeSpreitzer commented 9 months ago

Looking at https://github.com/multiarch/qemu-user-static#multiarchqemu-user-static-images and the available tags on DockerHub, I would expect the following to work, but it does not. Also: why are all the images on DockerHub for linux/amd64 --- does multiarch/qemu-user-static only apply when the host is x86_64?

mspreitz@ubu22b:~/test4$ docker run --rm --privileged multiarch/qemu-user-static:arm-x86_64 --reset -p yes
Unable to find image 'multiarch/qemu-user-static:arm-x86_64' locally
docker: Error response from daemon: manifest for multiarch/qemu-user-static:arm-x86_64 not found: manifest unknown: manifest unknown.

MikeSpreitzer commented 9 months ago

It only goes one way?

mspreitz@ubu22b:~/test4$ docker pull docker.io/multiarch/qemu-user-static:arm-x86_64
Error response from daemon: manifest for multiarch/qemu-user-static:arm-x86_64 not found: manifest unknown: manifest unknown

mspreitz@ubu22b:~/test4$ docker pull docker.io/multiarch/qemu-user-static:x86_64-arm
x86_64-arm: Pulling from multiarch/qemu-user-static
ba17f2cc646e: Pull complete 
Digest: sha256:339b9e5822de7ed6412c985fee69d578af03262e223971e6fe0adaa31386df31
Status: Downloaded newer image for multiarch/qemu-user-static:x86_64-arm
docker.io/multiarch/qemu-user-static:x86_64-arm

Oh, duh, I found https://github.com/multiarch/qemu-user-static#supported-host-architectures .

So that path is out. But docker buildx build should be able to succeed with its use of plain QEMU, right?

francostellari commented 9 months ago

@MikeSpreitzer here is my experience on an Ubuntu 20.04.5 LTS aarch64 host OS with Docker 24.0.6 and buildx github.com/docker/buildx v0.10.2 00ed17df6d20f3ca4553d45789264cdb78506e5f:

The test Dockerfile is:

FROM docker.io/redhat/ubi9
RUN ls

If I create a buildx builder with linux/amd64 only:

docker buildx create --name amd64 --platform linux/amd64 --use

Then this fails:

$ docker buildx build --platform "linux/amd64" --no-cache -t test .
WARNING: No output specified with docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
[+] Building 2.7s (5/5) FINISHED
 => [internal] load build definition from Dockerfile                                                                0.1s
 => => transferring dockerfile: 71B                                                                                 0.0s
 => [internal] load metadata for docker.io/redhat/ubi9:latest                                                       1.0s
 => [internal] load .dockerignore                                                                                   0.1s
 => => transferring context: 2B                                                                                     0.0s
 => CACHED [1/2] FROM docker.io/redhat/ubi9:latest@sha256:351ed8b24d440c348486efd99587046e88bb966890a9207a5851d3a3  0.3s
 => => resolve docker.io/redhat/ubi9:latest@sha256:351ed8b24d440c348486efd99587046e88bb966890a9207a5851d3a34a4dd34  0.3s
 => ERROR [2/2] RUN ls                                                                                              0.7s
------
 > [2/2] RUN ls:
#0 0.456 Fatal glibc error: CPU does not support x86-64-v2
------
Dockerfile:2
--------------------
   1 |     FROM docker.io/redhat/ubi9
   2 | >>> RUN ls
   3 |
--------------------
ERROR: failed to solve: process "/bin/sh -c ls" did not complete successfully: exit code: 127

But this succeeds:

$ docker buildx build --platform "linux/amd64/v2" --no-cache -t test .
WARNING: No output specified with docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
[+] Building 3.0s (5/5) FINISHED
 => [internal] load build definition from Dockerfile                                                                0.0s
 => => transferring dockerfile: 71B                                                                                 0.0s
 => [internal] load metadata for docker.io/redhat/ubi9:latest                                                       0.6s
 => [internal] load .dockerignore                                                                                   0.0s
 => => transferring context: 2B                                                                                     0.0s
 => CACHED [1/2] FROM docker.io/redhat/ubi9:latest@sha256:351ed8b24d440c348486efd99587046e88bb966890a9207a5851d3a3  0.3s
 => => resolve docker.io/redhat/ubi9:latest@sha256:351ed8b24d440c348486efd99587046e88bb966890a9207a5851d3a34a4dd34  0.3s
 => [2/2] RUN ls 

If I create a buildx builder with both linux/amd64 and linux/amd64/v2:

$ docker buildx create --name amd64 --platform "linux/amd64,linux/amd64/v2" --use
amd64

The situation is unchanged:

$ docker buildx build --platform "linux/amd64" --no-cache -t test .
WARNING: No output specified with docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
[+] Building 2.5s (5/5) FINISHED
 => [internal] load build definition from Dockerfile                                                                0.1s
 => => transferring dockerfile: 71B                                                                                 0.0s
 => [internal] load metadata for docker.io/redhat/ubi9:latest                                                       0.7s
 => [internal] load .dockerignore                                                                                   0.1s
 => => transferring context: 2B                                                                                     0.0s
 => CACHED [1/2] FROM docker.io/redhat/ubi9:latest@sha256:351ed8b24d440c348486efd99587046e88bb966890a9207a5851d3a3  0.3s
 => => resolve docker.io/redhat/ubi9:latest@sha256:351ed8b24d440c348486efd99587046e88bb966890a9207a5851d3a34a4dd34  0.3s
 => ERROR [2/2] RUN ls                                                                                              0.8s
------
 > [2/2] RUN ls:
#0 0.600 Fatal glibc error: CPU does not support x86-64-v2
------
Dockerfile:2
--------------------
   1 |     FROM docker.io/redhat/ubi9
   2 | >>> RUN ls
   3 |
--------------------
ERROR: failed to solve: process "/bin/sh -c ls" did not complete successfully: exit code: 127

$ docker buildx build --platform "linux/amd64/v2" --no-cache -t test .
WARNING: No output specified with docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
[+] Building 2.5s (5/5) FINISHED
 => [internal] load build definition from Dockerfile                                                                0.1s
 => => transferring dockerfile: 71B                                                                                 0.0s
 => [internal] load metadata for docker.io/redhat/ubi9:latest                                                       0.6s
 => [internal] load .dockerignore                                                                                   0.0s
 => => transferring context: 2B                                                                                     0.0s
 => CACHED [1/2] FROM docker.io/redhat/ubi9:latest@sha256:351ed8b24d440c348486efd99587046e88bb966890a9207a5851d3a3  0.2s
 => => resolve docker.io/redhat/ubi9:latest@sha256:351ed8b24d440c348486efd99587046e88bb966890a9207a5851d3a34a4dd34  0.2s
 => [2/2] RUN ls    
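
The failure mode in these runs is easy to miss in a long log, so here is a minimal sketch for spotting it. It assumes the build output was saved to a file; the function name hit_v2_baseline_error is hypothetical.

```shell
# Succeeds (exit 0) if the build log on stdin contains the glibc
# x86-64-v2 baseline failure seen in the failing runs.
hit_v2_baseline_error() {
  grep -q 'Fatal glibc error: CPU does not support x86-64-v2'
}
```

For example, after docker buildx build --progress=plain --platform linux/amd64 . 2>&1 | tee build.log, running hit_v2_baseline_error < build.log tells you whether retrying with --platform linux/amd64/v2 is worth a try.
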

AlekSi commented 9 months ago

Here is a simple reproducer:

FROM golang
RUN go install -v std

docker buildx create --name=test --driver=docker-container
docker buildx build --builder=test --platform=linux/amd64,linux/arm64 .

Go compiler crashes with:

go: error obtaining buildID for go tool compile: signal: segmentation fault

or

internal/testpty: /usr/local/go/pkg/tool/linux_amd64/cgo: signal: segmentation fault

or some other similar error.

That started to happen only recently, after some Docker Desktop updates, I think.

ChinaskiJr commented 8 months ago

Same here when compiling PHP extensions with pecl. I get segmentation faults on random lines during the build for amd64; arm builds fine. (I'm on an M1 MacBook.) The command that fails:

/dev/.buildkit_qemu_emulator /bin/sh -c docker-php-ext-install pdo_pgsql     && docker-php-ext-install intl bcmath     && docker-php-ext-configure gd --with-jpeg=/usr/include/ --with-freetype=/usr/include/freetype2 && docker-php-ext-install gd     && pecl install xdebug-${XDEBUG_VERSION}     && pecl install pcov-${PCOV_VERSION} && docker-php-ext-enable pcov     && docker-php-ext-install zip

pjar commented 7 months ago

Getting the same. Any way to use downgraded buildx as workaround for now?

n0rt0nthec4t commented 7 months ago

and a "me too" trying to cross-compile ffmpeg/x264 code on docker desktop running on MacBook M1 to both arm64 and amd64

tonistiigi commented 7 months ago

Any way to use downgraded buildx as workaround for now?

If you uninstall+install emulators with https://github.com/tonistiigi/binfmt#installing-emulators you can pick any version you like from https://hub.docker.com/r/tonistiigi/binfmt/tags , including newer versions than the default in the current release.
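
As a rough sketch of that uninstall+install sequence: the helper below only prints the two commands so they can be reviewed before running. The name binfmt_swap_cmds is hypothetical, the emulator name qemu-x86_64 is taken from the binfmt_misc listing earlier in this thread, and the --uninstall/--install flags are assumed to behave as described in the linked README. The tag argument is whatever you pick from the tags page.

```shell
# Print, without running, the two commands that swap the registered x86_64
# emulator for the one shipped in a chosen tonistiigi/binfmt image tag.
binfmt_swap_cmds() {
  tag=$1
  echo "docker run --privileged --rm tonistiigi/binfmt --uninstall qemu-x86_64"
  echo "docker run --privileged --rm tonistiigi/binfmt:${tag} --install amd64"
}
```

Printing rather than executing keeps the privileged docker run calls a deliberate step: check the output, then paste it or pipe it to sh.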

If you find a case where qemu functionality has regressed in between versions then make a new issue demonstrating that case so we can track it properly.

tonistiigi commented 7 months ago

Here is a command that you could try, but it will only get you the first 2MB of log from the build.

You can send strace output to a file

ENV QEMU_STRACE=1
RUN .. 2> /strace.log

or use tee to get both. Or, if you just want to see the last logs quickly, you can let the command error (e.g. add && stop), which will print the last 200K of logs.
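
A minimal sketch of the tee variant, written as a plain shell function so it can be tried outside a Dockerfile first. The name run_traced is made up; the only assumption about QEMU is the one already stated above, that the trace is written to stderr when QEMU_STRACE=1 is set.

```shell
# Run a command with QEMU_STRACE=1, showing its output on the terminal and
# saving a copy (stdout and stderr merged) to the given log file.
run_traced() {
  log=$1
  shift
  QEMU_STRACE=1 "$@" 2>&1 | tee "$log"
}
```

For example, run_traced /strace.log go build . keeps the trace in /strace.log while still printing it. One caveat of this design: the pipeline's exit status is that of tee, so a crashing command will not fail the step unless the shell has pipefail enabled.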

MikeSpreitzer commented 7 months ago

@tonistiigi : QEMU does not support strace

clubanderson commented 7 months ago

happening to me too

tonistiigi commented 7 months ago

@MikeSpreitzer Sorry, I mixed it up. You need to use the QEMU builtin tracing, not strace, for this. Define ENV QEMU_STRACE=1 in your Dockerfile before running the process.

pjar commented 7 months ago

Today's Docker desktop for Mac update seems to fix the issue for me.

Edit: my current version of Docker Desktop for Mac: 4.26.0 (130397)

n0rt0nthec4t commented 7 months ago

Today's Docker desktop for Mac update seems to fix the issue for me.

Yea, seems to be ok now. Shame linux arm v7 builds take ageeeeeeeeeeeeeeeeeees