rancher-sandbox / rancher-desktop

Container Management and Kubernetes on the Desktop
https://rancherdesktop.io
Apache License 2.0

Cross compiling applications in linux/amd64 container on M1 Macbooks is broken #5755

Open TheFriendlyCoder opened 11 months ago

TheFriendlyCoder commented 11 months ago

Actual Behavior

In our case, when we build a Go-based app inside a linux/amd64 Docker container and compile it for the amd64 architecture, we get regular segfaults. It doesn't happen on every build, but the larger and more complex the build, the more often the segfaults occur.

Steps to Reproduce

I've reproduced the problem with a simple Dockerfile:

FROM golang:1.19 as builder
COPY hello.go .
RUN go build hello.go

and a simple hello.go sample script:

package main

import "fmt"

func main() {
    fmt.Println("hello world")
}

And then run the following command:

DOCKER_DEFAULT_PLATFORM=linux/amd64 docker build --no-cache .

or the equivalent

docker build --platform=linux/amd64 --no-cache .

NOTE: Sometimes you may have to run the build command several times to see the error, but usually it'll appear at least once every 2-5 times.
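
Because the failure is intermittent, it can help to count failures over a fixed number of runs rather than re-running by hand. The following is a minimal sketch, not part of the original report: the helper name `count_failures` is hypothetical, and it assumes the Dockerfile and hello.go above are in the current directory.

```shell
#!/bin/sh
# Run a command a fixed number of times and print how many runs failed.
# Usage: count_failures <runs> <command> [args...]
# The function body runs in a subshell so its variables stay local.
count_failures() (
  runs=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Discard build output; only the exit status matters here.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures"
)

# Example invocation (left commented out so that sourcing this file has no side effects):
# count_failures 10 docker build --platform=linux/amd64 --no-cache .
```

Given the failure rate described above (roughly once every 2-5 builds), ten runs would normally be expected to show at least a couple of failures on an affected machine.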

Result

We get errors that look like this:

 > [3/3] RUN go build hello.go:
2.355 runtime/internal/sys: /usr/local/go/pkg/tool/linux_amd64/compile: signal: segmentation fault (core dumped)
------
Dockerfile:3
--------------------
   8 |     #RUN go build hello.go
   9 |     #RUN go build hello.go
  10 | >>> RUN go build hello.go
--------------------
ERROR: failed to solve: process "/bin/sh -c go build hello.go" did not complete successfully: exit code: 1

Expected Behavior

Trivial builds like this should succeed without error.

Additional Information

These problems don't exist when we perform the exact same steps using the Docker Desktop application - only Rancher Desktop.

Also, to speed up testing and ensure no cached files interfere with the test builds, I wrote the following bash script, which deletes all intermediate files produced by RD or Docker. After running it you can reinstall the Rancher Desktop app with all default settings (except that I have the K8s subsystem disabled, since we don't need it).

#!/bin/bash
docker context use default
docker buildx use default
rdctl factory-reset --remove-kubernetes-cache
sleep 5
sudo rm -rf /Applications/Rancher\ Desktop.app/
rm -rf ~/.docker
sudo rm -rf ~/Library/Application\ Support/rancher-desktop
sudo rm -rf /var/run/docker.sock

Rancher Desktop Version

1.10.0

Rancher Desktop K8s Version

n/a

Which container engine are you using?

moby (docker cli)

What operating system are you using?

macOS

Operating System / Build Version

macOS Sonoma Version 14.0 (23A344)

What CPU architecture are you using?

arm64 (Apple Silicon)

Linux only: what package format did you use to install Rancher Desktop?

None

Windows User Only

No response

TheFriendlyCoder commented 11 months ago

Here's a similar report from last year: https://github.com/rancher-sandbox/rancher-desktop/issues/2109

TheFriendlyCoder commented 11 months ago

I should also mention that I tried toggling several different settings in the Rancher Desktop app and got the same results. Some of my test cases included:

TheFriendlyCoder commented 11 months ago

There's an anecdote on this Reddit thread suggesting that this problem may be fixed in a newer version of Docker Server (v24.0), but I noticed that RD is currently shipping with the latest version of moby (v23.x), so maybe the fix isn't available yet.

TheFriendlyCoder commented 11 months ago

I also found a similar / the same defect reported in the docker/buildx repository here as well.

TheFriendlyCoder commented 11 months ago

Oh, and for good measure I tried installing the x86 build of RD to see if the problem persisted there, but the RD backend service refuses to launch on an M1 Mac, so I was only able to test the aarch64 build.

segevfiner commented 10 months ago

Are the qemu binaries used in Rancher Desktop up-to-date? This smells like a qemu bug. I think this also used to happen with Docker Desktop. Not sure if it still happens there.

TheFriendlyCoder commented 10 months ago

I have the latest Rancher Desktop updates, plus the problem is reproducible using the Apple VZ virtualization as well as QEMU.

segevfiner commented 10 months ago

Docker's --platform flag uses qemu-user binaries inside the VM, so it doesn't matter which virtualization/hypervisor the VM itself runs on. Maybe we can try replacing/upgrading them at runtime using https://github.com/tonistiigi/binfmt or other similar images and see if this helps.
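
A minimal sketch of what that replacement could look like, using the `--uninstall` and `--install` options documented in the tonistiigi/binfmt README. The function name `refresh_amd64_emulator` and the `DOCKER` override variable are hypothetical, added only so the commands can be previewed before running them. Whether this changes the segfault behavior, or survives a restart of the Rancher Desktop VM, has not been verified in this thread.

```shell
#!/bin/sh
# DOCKER can be overridden (e.g. DOCKER=echo) to print the commands
# instead of running them. This variable is an assumption of this sketch.
DOCKER=${DOCKER:-docker}

refresh_amd64_emulator() {
  # Remove the currently registered x86_64 binfmt_misc handler, if any ...
  "$DOCKER" run --privileged --rm tonistiigi/binfmt --uninstall qemu-x86_64
  # ... then register the qemu-user build bundled with the binfmt image.
  "$DOCKER" run --privileged --rm tonistiigi/binfmt --install amd64
}

# Example invocation (commented out; it modifies emulator registration inside the VM):
# refresh_amd64_emulator
```

Running the image with no arguments prints the emulators currently registered, which gives a quick before/after comparison.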

billimek commented 10 months ago

Are you having this issue when running Rancher Desktop with the VM type set to the Apple Virtualization framework instead of QEMU?


TheFriendlyCoder commented 10 months ago

Yes.

v-vostretsov commented 10 months ago

having the same issue (two screenshots attached)

wills-feng commented 10 months ago

having the same issue

mattfarina commented 9 months ago

I just tried setting the VM type to VZ and enabled using Rosetta. That appeared to allow me to cross-compile to amd64. I'm on macOS 14.1.1.

Using skopeo against the docker socket to pull details for the image I see...

    "Architecture": "amd64",
    "Os": "linux",
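
For anyone repeating this check, here is a small sketch that pulls the architecture out of inspect-style JSON such as the fragment above. The helper name `image_arch` and the image name `myimage:latest` are hypothetical, and the exact skopeo invocation used in this comment is not shown, so the `docker-daemon:` example below is an assumption.

```shell
#!/bin/sh
# Read inspect-style JSON on stdin and print the value of the first
# "Architecture" field found.
image_arch() {
  sed -n 's/.*"Architecture": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Example invocations (commented out; they need a running daemon and an existing image):
# skopeo inspect docker-daemon:myimage:latest | image_arch
# docker image inspect myimage:latest | image_arch
```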

TheFriendlyCoder commented 9 months ago

VZ / Rosetta had no impact on this behavior for me... and in fact it caused more problems than running without