Status: Open. jaraco opened this issue 1 month ago.
When I downgrade to 4.28, the problem goes away. The issue exists on 4.29 also.
@jaraco The above worked for me without modification on Docker Desktop for Mac v4.30.0.
> @jaraco The above worked for me without modification on Docker Desktop for Mac v4.30.0.
Are you using the same macOS version and architecture?
Interestingly, after upgrading from 4.28.0 to 4.30.0, the problem seemed to be gone... or not. Okay, let me capture some of the steps. From earlier, I ran
docker buildx build --platform linux/amd64 -t jaraco/multipy-tox .
and the command failed after info: installing component 'rust-docs' (same Segmentation fault). I ran the command again and it failed. Now I'm beginning to wonder if it's an intermittent issue, so I ran the command again, but this time it failed with a different error (a panic).
I ran it 5 more times. The first three times I got the Segmentation fault. The fourth time, the process hung after 'downloading component rustc'. The fifth time, the command succeeded. For good measure, I ran it one more time and got another Segmentation fault.
So it does appear as if whatever is happening is intermittent, which means it may also be sensitive to the host hardware and OS. I'm on a 2023 Macbook Pro 14" (M3 Pro, 36GB RAM).
I'd worry the issue was unique to my environment, but others in the rust bug were able to replicate it, so we know it's not just me.
> @jaraco The above worked for me without modification on Docker Desktop for Mac v4.30.0.
> Are you using the same macOS version and architecture?
The host OS is macOS 14.5 (Apple Silicon) and the guest OS is Ubuntu (Noble) x86_64. Next, I ran it 20 times in a loop, and saw no errors here.
Next, my default builder, desktop-linux, after Docker Desktop for Mac installation, looks like the following:
➜ docker buildx ls
NAME/NODE DRIVER/ENDPOINT STATUS BUILDKIT PLATFORMS
default docker
\_ default \_ default running v0.13.2 linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/mips64le, linux/mips64
desktop-linux* docker
\_ desktop-linux \_ desktop-linux running v0.13.2 linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/mips64le, linux/mips64
I discovered that disabling Rosetta suppresses the error. I disabled that setting, restarted the engine, then ran the repro 5 times without failure. I then re-enabled Rosetta, restarted the engine, and the repro elicited the error twice in a row. After running the command, here's what I see for buildx ls:
NAME/NODE DRIVER/ENDPOINT STATUS BUILDKIT PLATFORMS
multi* docker-container
\_ multi0 \_ desktop-linux running v0.13.2 linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
default docker
\_ default \_ default running v0.13.2 linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
desktop-linux docker
\_ desktop-linux \_ desktop-linux running v0.13.2 linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
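(Aside: for anyone who wants to check or script that Rosetta toggle without clicking through the UI, Docker Desktop for Mac keeps its settings in a JSON file. The path and the useVirtualizationFrameworkRosetta key in the sketch below are assumptions based on current releases and may vary by version:)
#!/usr/bin/env python
# Hedged sketch: report Docker Desktop's Rosetta setting from its
# settings file. The path and key name are assumptions that may vary
# by Docker Desktop version.
import json
import pathlib

settings_path = pathlib.Path.home() / (
    'Library/Group Containers/group.com.docker/settings.json'
)
settings = json.loads(settings_path.read_text())
print('Rosetta enabled:', settings.get('useVirtualizationFrameworkRosetta'))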
Next, I installed Docker on my 2020 Mac mini (with M1 chip, running macOS 14.4.1) and ran the command and it doesn't reproduce there (succeeded twice), so it does appear as if the issue is sensitive to the speed or number of cores or actual silicon layout of the M3 macbook. Maybe Rosetta is exercising some unique features of these newer chips. Or maybe Rosetta actually has a bug on a newer chip.
Oh, wow. I updated my mac mini from macOS 14.4.1 to 14.5, and now it is also replicating the failure. So that excludes the concerns about chip speed and chip generation.
@jaraco I have Rosetta 2 enabled as you can see from the image:
Next, here's a gist of a complete run of your command:
Yesterday, I also fully uninstalled and reinstalled Docker Desktop for Mac.
Can you confirm you have Rosetta installed? pgrep oahd should return a process ID if Rosetta is installed.
> pgrep oahd
The above produces the following:
➜ pgrep oahd
849
Yes, Rosetta 2 is running: Activity Monitor shows a couple of Apple processes running as Intel.
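For the record, both checks can be scripted; the sketch below assumes /usr/bin/true ships as a universal binary (true on stock macOS):
#!/usr/bin/env python
# Hedged sketch: confirm Rosetta 2 is installed and functional on the host.
import subprocess

# oahd is Rosetta's launch daemon; pgrep exits 0 when it's running.
installed = subprocess.run(
    ['pgrep', 'oahd'], capture_output=True).returncode == 0

# arch -x86_64 forces the x86_64 slice of a universal binary, which can
# only execute under Rosetta on Apple Silicon.
functional = subprocess.run(
    ['arch', '-x86_64', '/usr/bin/true']).returncode == 0

print(f'Rosetta installed: {installed}, functional: {functional}')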
Okay. I'll see if I can replicate the issue in a UTM VM. I'm worried that it won't fail, but if it does, then at least that's something that can be shipped.
I created a UTM VM of macOS, with the hope of replicating the issue in a clean environment. Unfortunately, Docker will not run in that VM because the guest doesn't have access to the hypervisor (no nested virtualization).
It occurred to me I might possibly be able to replicate the issue by doing something similar to what Docker does: using Rosetta to emulate x86_64 to run Linux, and seeing if the issue reproduces there. It's a bit of a long shot. I couldn't find the documented setting to enable Rosetta, and installing Linux using emulation is taking forever. I see now that UTM's Rosetta support exists to execute amd64 binaries inside an arm64 Linux guest. I don't think I'll be able to mimic closely enough what Docker does with virtualization to replicate the issue. We'll need to rely on a real macOS machine.
I'm struggling to think how to make more progress on this issue. I'm tempted to just leave it open for now and disable Rosetta as a workaround, but ideally I'd like to get the issue to a state where it's at least theoretically solvable by Docker.
Since you're unable to replicate the issue but other people are, perhaps we could find some other people willing to run the test?
If I gave you a login on my mac mini, could you possibly use that to replicate the issue and bisect the differences between that machine and your own? If the issue doesn't reproduce in your profile on my machine, I could take over the login and bisect the differences between that profile and the one where the failure occurs.
Alternatively, is there something more I can run that will help diagnose the issue when it occurs?
> It occurred to me I might possibly be able to replicate the issue by doing something similar to what Docker does,
For the record, I did create a virtualized AMD64 Linux machine using UTM (without Rosetta). It was dog slow, taking a couple of hours to install and log in, but it finally completed, and I was able to confirm that the issue doesn't occur there. No big surprise, though, since it's not using Rosetta and has very limited performance (it was slow enough that the progress bars appeared during the 'installing' steps, which aren't observed on faster machines).
> Since you're unable to replicate the issue but other people are, perhaps we could find some other people willing to run the test?
Yes, I'll ask my friend if he can run the test and report the findings here.
> If I gave you a login on my mac mini, could you possibly use that to replicate the issue and bisect the differences between that machine and your own? If the issue doesn't reproduce in your profile on my machine, I could take over the login and bisect the differences between that profile and the one where the failure occurs.
Yes, I can do that and report back any findings that I see.
> Alternatively, is there something more I can run that will help diagnose the issue when it occurs?
I would recommend posting a reference to this issue within the Docker Slack because the Docker core team has intimate knowledge about the Docker Engine and Docker Desktop for Mac tooling. Have you tried uninstalling and reinstalling Docker Desktop For Mac? If not, I would consider giving that a shot.
@jaraco I think I have roughly the same laptop as you: M3 Max, 36GB, Sonoma 14.5.
First I ran the repro with the default Docker Desktop settings and it went well. Then I noticed in your diagnostic that the OOM killer was triggered. It seems your Docker Desktop VM is running with 3.8GB.
I changed my settings to allocate only 3.8GB to the VM (Settings > Resources > Advanced > Memory Limit).
Rerunning your command line provoked a segfault.
Can you try to have a look at the Memory Limit setting and raise it? I tested with 8GB and it worked.
@jpbriend Good catch! I have an M1 Max, 64 GB, and Sonoma 14.5. My memory is set to 16 GB because I tend to run several containers for a given project.
I'm afraid that's not the issue for me. Although I have the default settings on my M1 mac mini, on my main system I'd previously bumped the memory limit to 7.9 GB:
I bumped that limit several weeks ago (maybe months) due to another project where I was hitting the memory limit, so most of the test reports I've done here were with that setting. Only on the mac mini were the settings at the default (4GB). Is it possible the OOM in the diagnostic was from a couple of months ago?
I went ahead and bumped that to 8.8 GB, clicked Apply and Restart, then re-ran the command. The first attempt succeeded. The subsequent attempt once again triggered the segfault. I ran it twice more, both times with success. The next two attempts failed. For good measure, I bumped the memory limit to 16GB and it failed on the first attempt.
Moreover, I'd be surprised if installing rust-docs were an operation requiring multiple gigabytes to complete.
When I monitor the container memory usage during the run, it doesn't exceed 200MB.
That makes me think the memory limit is a red herring.
Still, that's great that you've managed to elicit the failure.
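For anyone who wants to reproduce that measurement, here's a rough sketch: it polls docker stats once per second for a named container while the repro runs in another terminal. Pass whatever name you gave docker run --name; nothing here is specific to this issue.
#!/usr/bin/env python
# Hedged sketch: poll a running container's memory usage once per second.
# Usage: start the repro with --name in one terminal, then:
#   python poll-mem.py <container-name>
import subprocess
import sys
import time

name = sys.argv[1]
while True:
    # --no-stream emits a single sample; MemUsage reads like "180MiB / 7.9GiB".
    sample = subprocess.run(
        ['docker', 'stats', '--no-stream', '--format',
         '{{.MemUsage}}', name],
        capture_output=True, text=True,
    )
    if sample.returncode:
        break  # container exited (or never existed)
    print(sample.stdout.strip())
    time.sleep(1)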
> If I gave you a login on my mac mini, could you possibly use that to replicate the issue and bisect the differences between that machine and your own? If the issue doesn't reproduce in your profile on my machine, I could take over the login and bisect the differences between that profile and the one where the failure occurs.
> Yes, I can do that and report back any findings that I see.
Can you share your SSH public key (or point me to where I can find it), and I'll set up your account. Do you have IPv6, or do I need to expose an IPv4 port?
> Have you tried uninstalling and reinstalling Docker Desktop For Mac? If not, I would consider giving that a shot.
I have not, but I have installed clean on the mac mini, others have reproduced the issue, and I have downgraded and upgraded Docker, so I'm confident the issue isn't unique to the installation on this machine.
I've noticed that the issue is less severe for me today than yesterday, maybe only failing 50-60% of the time. The main difference is that today I'm running on battery power instead of AC. I also tried turning down the number of CPU cores to 1, and I couldn't get it to fail after several attempts. Turning the CPU cores to 2 did trigger the failure, but less frequently, suggesting that concurrency is a factor. @conradwt could you try with cores set to 4 or 8 to see if a smaller number of cores might help replicate the issue on your system?
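(A possible shortcut for these experiments: docker run accepts a --cpus flag that caps a single container's CPU allotment, which avoids restarting the engine between trials. I haven't verified it provokes the fault the same way the global Docker Desktop CPU setting does, so treat the sweep below as a sketch:)
#!/usr/bin/env python
# Hedged sketch: sweep per-container CPU limits via docker run --cpus,
# as a possible stand-in for changing Docker Desktop's global CPU setting.
import subprocess

repro = (
    "apt update && apt install -y wget && "
    "wget https://sh.rustup.rs -O - | sh -s -- -y"
)

for cpus in (1, 2, 4, 8):
    proc = subprocess.run([
        'docker', 'run', '--rm', '--platform', 'linux/amd64',
        '--cpus', str(cpus), 'ubuntu:noble', 'bash', '-c', repro,
    ])
    print(f'cpus={cpus} exit={proc.returncode}')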
@jaraco Here are my results from running it using 4 and 8 CPUs:
4 CPUs (10 iterations): Pass: 7 Fail: 3
8 CPUs (10 iterations): Pass: 2 Fail: 8
BTW, I used the following script:
#!/bin/bash
# Loop 10 times
for i in {1..10}
do
echo "Iteration $i"
# Run the Docker command
docker run --platform linux/amd64 ubuntu:noble bash -c "apt update; apt install -y wget; wget https://sh.rustup.rs -O - | sh -s -- -y"
# Sleep for 10 seconds
sleep 10
done
Usage:
./github-7295 >& output.txt
cat output.txt | grep "Segmentation fault" | wc -l
The crash occurs in installing the rust-docs component, which is infamous for causing filesystem slowdowns during installation because it creates a lot of small files in parallel, as fast as possible. If you want a better way to hit the crash, run rustup component remove rust-docs; rustup component add rust-docs a few times once rustup is installed.
I suspect that any kind of multithreaded stress-testing tool for filesystems can hit this crash. I was initially concerned that this was a bug in rustup, but installing the docs under valgrind or strace makes the install run much slower and the crash go away, while it still crashes under gdb/lldb; this looks like a data race reached by stressing the filesystem. So, generally, anything that really stresses the filesystem is what I would try.
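To make that concrete, here's a minimal sketch of such a stress test: many threads creating small files in parallel, loosely mimicking a rust-docs install. The file counts and sizes are arbitrary, and I haven't confirmed this particular script trips the fault; it would need to run inside the linux/amd64 container.
#!/usr/bin/env python
# Hedged sketch: stress the filesystem by creating many small files
# from multiple threads, loosely mimicking a rust-docs install.
import concurrent.futures
import pathlib

root = pathlib.Path('stress')
root.mkdir(exist_ok=True)

def write_files(worker, count=2000):
    # Each worker writes its own batch of small files as fast as possible.
    for n in range(count):
        (root / f'{worker}-{n}.html').write_text('x' * 512)

with concurrent.futures.ThreadPoolExecutor(max_workers=16) as pool:
    list(pool.map(write_files, range(16)))
print('done')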
> Here are my results from running it using 4 and 8 CPUs:
That's great! Am I right in thinking this is the first time you've been able to replicate the failure? How many CPUs did you have Docker configured before when the issue wouldn't occur? I'm guessing it was 14 or 16 depending on your chip. If it was 1, that might explain why you couldn't replicate the issue (as concurrency was effectively disabled). If it was 14 or 16, that's surprising, because my machine fails easily with 12 configured (and yours seems to be more prone to failure with 8 vs 4).
> If you want a better way to hit the crash
This was really helpful. I used this idea to create jaraco/for-mac-issue7295 from this Dockerfile:
FROM ubuntu:noble
RUN apt update
RUN apt install -y curl
# fetch the rustup installer
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs > rustup-init
# install only the minimal profile; rust-docs gets added at run time
RUN sh rustup-init -y --profile minimal
ENV PATH=$PATH:/root/.cargo/bin
CMD rustup component add rust-docs
I did try installing with the default profile and then running rustup component remove rust-docs, but that approach fails when rustup tries to remove content added in an earlier layer.
Built and uploaded it with docker build --platform linux/amd64 --tag jaraco/for-mac-issue-7295 --push .
Now I (and others) can run a test quickly and easily with docker run --platform linux/amd64 -it jaraco/for-mac-issue-7295.
Inspired by Conrad's script, I created this Python script to run the command multiple times and summarize the results:
#!/usr/bin/env python
import subprocess
import sys

# docker run reports death-by-signal as 128 + the signal number.
codes = dict(
    SIGSEGV=139,  # 128 + 11
    SIGABRT=134,  # 128 + 6
    OK=0,
)
code_names = {v: k for k, v in codes.items()}

def run_test():
    cmd = ['docker', 'run', '--platform', 'linux/amd64', 'jaraco/for-mac-issue-7295']
    proc = subprocess.run(cmd, capture_output=True)
    return proc.returncode

def run(n_runs=10):
    print(f"Running the command {n_runs} times")
    returncodes = [run_test() for n in range(n_runs)]
    # Map known codes to names; report unrecognized codes numerically.
    results = [code_names.get(code, code) for code in returncodes]
    failures = list(filter(None, returncodes))
    successes = n_runs - len(failures)
    pct = successes / n_runs
    print(f"Success {pct:.0%} {results}")

__name__ == '__main__' and run(*map(eval, sys.argv[1:]))
And here's an example run:
@ py -m check-issue7295
Running the command 10 times
Success 30% ['OK', 'OK', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'OK', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV']
With n-cpus set to just 2, the success rate is higher:
@ py -m check-issue7295 20
Running the command 20 times
Success 85% ['SIGSEGV', 'OK', 'OK', 'SIGSEGV', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'SIGSEGV', 'OK', 'OK']
Repeating the test with and without Rosetta confirms the high failure rate with Rosetta and the low one without, but it also reveals a dramatic difference in performance. Without Rosetta:
draft @ time py -m check-issue7295 20
Running the command 20 times
Success 100% ['OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK', 'OK']
198.36 real 0.71 user 0.34 sys
With Rosetta:
draft @ time py -m check-issue7295 20
Running the command 20 times
Success 20% ['OK', 'SIGSEGV', 'OK', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'OK', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'OK', 'SIGSEGV']
33.90 real 0.72 user 0.32 sys
I don't think that indicates a performance difference. Crashing midway through is often faster than doing the entire task.
> I don't think that indicates a performance difference. Crashing midway through is often faster than doing the entire task.
I'd considered that, but it also felt much slower. Since performance is potentially a factor, I ran the test comparing two successful runs, and it reported 3.1x the latency when not using Rosetta. With Rosetta:
draft @ time py -m check-issue7295 1
Running the command 1 times
Success 100% ['OK']
3.12 real 0.05 user 0.02 sys
Without Rosetta:
draft @ time py -m check-issue7295 1
Running the command 1 times
Success 100% ['OK']
9.80 real 0.04 user 0.02 sys
So it seems about half of the 5.8x extra latency was due to disabling Rosetta and half was due to the early termination caused by the fault.
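Spelling out that arithmetic with the figures from the timings above:
# Restating the wall-clock timings above (seconds):
with_rosetta_ok, without_rosetta_ok = 3.12, 9.80      # single successful runs
with_rosetta_20, without_rosetta_20 = 33.90, 198.36   # 20-run batches

print(without_rosetta_ok / with_rosetta_ok)   # ~3.1x: slowdown from disabling Rosetta alone
print(without_rosetta_20 / with_rosetta_20)   # ~5.9x: overall slowdown across the batches
# The leftover factor (~5.9 / 3.1 = ~1.9x) matches the with-Rosetta runs
# mostly crashing partway through rather than completing.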
> Here are my results from running it using 4 and 8 CPUs:
> That's great! Am I right in thinking this is the first time you've been able to replicate the failure? How many CPUs did you have Docker configured before when the issue wouldn't occur? I'm guessing it was 14 or 16 depending on your chip. If it was 1, that might explain why you couldn't replicate the issue (as concurrency was effectively disabled). If it was 14 or 16, that's surprising, because my machine fails easily with 12 configured (and yours seems to be more prone to failure with 8 vs 4).
Yes, this is the first time that I've replicated the issue, because I wasn't previously running the command back-to-back within a loop. Also, my default Docker Desktop CPU setting is 4.
Same issue here:
3.320 info: syncing channel updates for 'stable-x86_64-unknown-linux-gnu'
3.764 info: latest update on 2024-05-02, rust version 1.78.0 (9b00956e5 2024-04-29)
3.766 info: downloading component 'cargo'
5.038 info: downloading component 'clippy'
5.303 info: downloading component 'rust-docs'
7.312 info: downloading component 'rust-std'
10.74 info: downloading component 'rustc'
28.38 info: downloading component 'rustfmt'
28.95 info: installing component 'cargo'
29.59 info: installing component 'clippy'
29.81 info: installing component 'rust-docs'
31.18 Segmentation fault
It only happens with a docker amd64 build.
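One way to double-check that it's emulation-specific is to run the same rustup install natively (arm64) and emulated (amd64) back to back; a sketch, assuming ubuntu:noble on Docker Hub is multi-arch (it is):
#!/usr/bin/env python
# Hedged sketch: run the same rustup install natively (arm64) and under
# emulation (amd64) to confirm only the emulated run faults.
import subprocess

repro = (
    "apt update && apt install -y wget && "
    "wget https://sh.rustup.rs -O - | sh -s -- -y"
)
for platform in ('linux/arm64', 'linux/amd64'):
    proc = subprocess.run([
        'docker', 'run', '--rm', '--platform', platform,
        'ubuntu:noble', 'bash', '-c', repro,
    ])
    print(f'{platform}: exit code {proc.returncode}')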
I've made the Python script available as part of the jaraco.docker package, which means it can be readily installed into and run from a Python environment, or run with pip-run. Here are my latest results on Docker 4.31:
@ pip-run jaraco.docker -- -m jaraco.docker.check-issue7295
Running the command 10 times
Success 30% ['OK', 'SIGSEGV', 'SIGSEGV', 'OK', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'SIGSEGV', 'OK']
Reporting the same issue with stable or nightly:
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain nightly-x86_64-unknown-linux-gnu -y --verbose:
0.511 info: downloading installer
2.149 info: profile set to 'default'
2.149 info: default host triple is x86_64-unknown-linux-gnu
2.150 verbose: creating update-hash directory: '/root/.rustup/update-hashes'
2.151 verbose: installing toolchain 'nightly-x86_64-unknown-linux-gnu'
2.151 verbose: toolchain directory: '/root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu'
2.152 info: syncing channel updates for 'nightly-x86_64-unknown-linux-gnu'
2.156 verbose: creating temp root: /root/.rustup/tmp
2.157 verbose: creating temp file: /root/.rustup/tmp/g0qjkpvwmr9wzdsv_file
2.157 verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-nightly.toml.sha256'
2.157 verbose: downloading with reqwest
2.426 verbose: deleted temp file: /root/.rustup/tmp/g0qjkpvwmr9wzdsv_file
2.426 verbose: no update hash at: '/root/.rustup/update-hashes/nightly-x86_64-unknown-linux-gnu'
2.426 verbose: creating temp file: /root/.rustup/tmp/jd2b20zi8o39a8ct_file.toml
2.426 verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-nightly.toml'
2.426 verbose: downloading with reqwest
2.515 verbose: checksum passed
2.541 verbose: deleted temp file: /root/.rustup/tmp/jd2b20zi8o39a8ct_file.toml
2.541 info: latest update on 2024-06-18, rust version 1.81.0-nightly (59e2c01c2 2024-06-17)
2.543 info: downloading component 'cargo'
2.543 verbose: creating Download Directory directory: '/root/.rustup/downloads'
2.544 verbose: downloading file from: 'https://static.rust-lang.org/dist/2024-06-18/cargo-nightly-x86_64-unknown-linux-gnu.tar.xz'
2.544 verbose: downloading with reqwest
3.146 verbose: checksum passed
3.147 info: downloading component 'clippy'
3.147 verbose: downloading file from: 'https://static.rust-lang.org/dist/2024-06-18/clippy-nightly-x86_64-unknown-linux-gnu.tar.xz'
3.147 verbose: downloading with reqwest
3.324 verbose: checksum passed
3.324 info: downloading component 'rust-docs'
3.324 verbose: downloading file from: 'https://static.rust-lang.org/dist/2024-06-18/rust-docs-nightly-x86_64-unknown-linux-gnu.tar.xz'
3.324 verbose: downloading with reqwest
4.640 verbose: checksum passed
4.640 info: downloading component 'rust-std'
4.640 verbose: downloading file from: 'https://static.rust-lang.org/dist/2024-06-18/rust-std-nightly-x86_64-unknown-linux-gnu.tar.xz'
4.640 verbose: downloading with reqwest
6.835 verbose: checksum passed
6.835 info: downloading component 'rustc'
6.836 verbose: downloading file from: 'https://static.rust-lang.org/dist/2024-06-18/rustc-nightly-x86_64-unknown-linux-gnu.tar.xz'
6.836 verbose: downloading with reqwest
12.54 verbose: checksum passed
12.54 info: downloading component 'rustfmt'
12.54 verbose: downloading file from: 'https://static.rust-lang.org/dist/2024-06-18/rustfmt-nightly-x86_64-unknown-linux-gnu.tar.xz'
12.54 verbose: downloading with reqwest
12.72 verbose: checksum passed
12.72 info: installing component 'cargo'
12.72 verbose: creating temp directory: /root/.rustup/tmp/nzx2ftixnzeo5spb_dir
13.30 verbose: deleted temp directory: /root/.rustup/tmp/nzx2ftixnzeo5spb_dir
13.30 info: installing component 'clippy'
13.30 verbose: creating temp directory: /root/.rustup/tmp/jrp16eqls6od27vy_dir
13.54 verbose: creating temp file: /root/.rustup/tmp/r34bncoi5nkzoy6k_file
13.54 verbose: creating temp file: /root/.rustup/tmp/fc5jojgspc8rjxm4_file
13.54 verbose: deleted temp directory: /root/.rustup/tmp/jrp16eqls6od27vy_dir
13.54 info: installing component 'rust-docs'
13.54 verbose: creating temp directory: /root/.rustup/tmp/mg2ub6ot5otvqkf__dir
13.91 sh: line 570: 46 Segmentation fault "$@"
Description
In https://github.com/rust-lang/rust/issues/125430, I reported an issue where Rust fails to install in a Linux amd64 container under Docker on macOS 14.5 on ARM. During the docs install, a Segmentation fault occurs. The issue only occurs with --platform linux/amd64. Analysis in that other issue suggests a root cause in the kernel or Docker.
Reproduce
Expected behavior
The build should complete successfully as it does in other environments.
docker version
docker info
Diagnostics ID
31BF3D41-B6F8-43A0-8FBC-2021581F5862/20240526183503
Additional Info
No response