commercialhaskell / stack

The Haskell Tool Stack
http://haskellstack.org
BSD 3-Clause "New" or "Revised" License

Alternatives to a self-hosted runner to build statically-linked Stack for Linux/AArch64? #6531

Open mpilgrem opened 6 months ago

mpilgrem commented 6 months ago

Stack currently uses a GitHub workflow and a self-hosted runner to build a statically-linked Stack for Linux/AArch64.

Historically, the runner has been provided by FP Complete. It provides Ubuntu 20.04.2 LTS on AArch64. The statically-linked Stack is then built in a Docker container providing Alpine Linux, with:

/usr/local/bin/stack etc/scripts/release.hs build --alpine --build-args --docker-stack-exe=image

The Docker image is (currently) specified by:

docker:
  enable: false
  repo: quay.io/benz0li/ghc-musl:9.6.4

@benz0li has managed to compile versions of GHC that work on Alpine Linux/AArch64.

The Haskell Foundation tried to provide a replacement runner but its machine provides NixOS on AArch64 and, as @chreekat has explained elsewhere, exposure to that environment is not good for building binaries.

I am thinking about possible alternatives. One thing I am going to explore - it may fail - is that GitHub runner macos-14 provides an AArch64 machine architecture and Docker is available for macOS/AArch64.

hasufell commented 6 months ago

I have runners for aarch64 linux that are not based on NixOS. They are currently hooked up to:

I'm already building stack binaries there and it works.

The problem is you can't share runners across repos/orgs. So you have to start a separate agent on the same machine and they don't know about each other, potentially leading to resource exhaustion if they happen to run at the same time.

mpilgrem commented 6 months ago

@hasufell, thanks - your observation tallies with your issue https://github.com/commercialhaskell/stack/issues/6252, which discussion was primarily about macOS/AArch64 but is now pertinent for Linux/AArch64.

mpilgrem commented 6 months ago

My experiment is not succeeding, so far, with:

Raw command: /opt/homebrew/bin/docker inspect quay.io/benz0li/ghc-musl:9.6.4
Standard output:
[]
Standard error:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

To investigate: (1) what is the source of that (2) can it be overcome?
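
One way to narrow down (1) is to check whether the docker client can reach any daemon at all before the release script runs. The following is only a minimal sketch, assuming a POSIX shell; the function name docker_daemon_reachable is hypothetical, and it relies on 'docker info' exiting with a non-zero status when no daemon answers.

```shell
# Hypothetical helper: succeeds only if the given Docker client
# (default: docker) can talk to a daemon.
docker_daemon_reachable() {
  docker_cmd=${1:-docker}
  # 'docker info' queries the daemon; it fails if the socket is absent
  # or nothing is listening on it.
  "$docker_cmd" info >/dev/null 2>&1
}

# Example use (commented out so the snippet has no side effects):
# if ! docker_daemon_reachable /opt/homebrew/bin/docker; then
#   echo "Docker client found, but no daemon is reachable" >&2
# fi
```

On macOS the Homebrew docker formula installs only the client, so the daemon has to be supplied separately, typically from inside a Linux VM.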

EDIT: Perhaps it is that I need to start the Docker daemon manually: https://docker-docs.uclv.cu/config/daemon/#start-the-daemon-manually

EDIT2: That part of the Docker documentation appears to be misleading: https://github.com/moby/moby/issues/27102

EDIT3: It may simply not be possible to connect to the Docker daemon on macOS from the command line:

EDIT4: These blog posts may be of assistance:

mpilgrem commented 6 months ago

Using https://github.com/abiosoft/colima seemed promising, but colima fails to start, with (edited):

level=info msg="[hostagent] hostagent socket created at /Users/runner/.colima/_lima/colima/ha.sock"
level=info msg="[hostagent] Using system firmware (\"/opt/homebrew/share/qemu/edk2-aarch64-code.fd\")"
level=info msg="[hostagent] Starting QEMU (hint: to watch the boot progress, see \"/Users/runner/.colima/_lima/colima/serial*.log\")"
level=info msg="SSH Local Port: 49212"
level=info msg="[hostagent] Waiting for the essential requirement 1 of 4: \"ssh\""
level=info msg="[hostagent] Driver stopped due to error: \"signal: abort trap\""
level=info msg="[hostagent] Shutting down the host agent"
level=warning msg="[hostagent] failed to exit SSH master"
level=info msg="[hostagent] Shutting down QEMU with ACPI"
level=warning msg="[hostagent] Failed to remove SSH binding for port 49212"
level=warning msg="[hostagent] failed to open the QMP socket \"/Users/runner/.colima/_lima/colima/qmp.sock\", forcibly killing QEMU"
level=info msg="[hostagent] QEMU has already exited"
level=fatal msg="exiting, status={Running:false Degraded:false Exiting:true Errors:[] SSHLocalPort:0} (hint: see \"/Users/runner/.colima/_lima/colima/ha.stderr.log\")"
level=fatal msg="error starting vm: error at 'creating and starting': exit status 1"
##[error]Process completed with exit code 1.

This is a known problem: the macos-14 runner uses M1 machine architecture, M1 does not support nested virtualisation, and colima needs the latter: https://github.com/abiosoft/colima/issues/970.

mpilgrem commented 6 months ago

However, colima may offer a local solution because the virtualisation is not nested. On my Mac mini/M1:

brew install docker
brew install colima
stack upgrade --force-download # Ensure Stack version same as in Docker image
colima start
stack etc/scripts/release.hs build --alpine --build-args --docker-stack-exe=image

built executable files but, understandably, failed when the Haskell script came to test them in macOS:

_release/bin/stack-2.15.4.1-osx-aarch64/stack --version
_release/bin/stack-2.15.4.1-osx-aarch64/stack: _release/bin/stack-2.15.4.1-osx-aarch64/stack: cannot execute binary file
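
The failure is consistent with the build producing a Linux executable inside the Alpine container, which macOS cannot run directly. A minimal sketch for confirming that from the file's leading magic bytes follows, assuming a POSIX shell with od and tr available; the function name binary_format is hypothetical.

```shell
# Hypothetical helper: classify an executable by its first four bytes.
binary_format() {
  # Dump the first four bytes as hex and strip all whitespace.
  magic=$(od -An -tx1 -N4 "$1" | tr -d '[:space:]')
  case "$magic" in
    7f454c46) echo elf ;;                                  # Linux and other ELF systems
    cffaedfe|cefaedfe|feedfacf|feedface) echo mach-o ;;    # macOS, single architecture
    cafebabe) echo mach-o-universal ;;                     # macOS, multiple architectures
    *) echo unknown ;;
  esac
}

# Example use (commented out; the path is the one from the output above):
# binary_format _release/bin/stack-2.15.4.1-osx-aarch64/stack
```

If the helper reports elf for that file, the executable has to be tested on Linux/AArch64 rather than on the macOS host.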

EDIT: If I create an Alpine Linux VM on my Mac mini with UTM and share the relevant directory with the VM, I can, however, test the executable:

/media/share/GitHub/commercialhaskell/stack/_release/bin/stack-2.15.4.1-osx-aarch64/stack --version
Version 2.15.4.1, Git revision c34001760ea06ca44537e3a0ead349b98006f324 (10548 commits) RELEASE-CANDIDATE aarch64 hpack-0.36.0

benz0li commented 6 months ago

https://github.blog/changelog/2023-10-30-accelerate-your-ci-cd-with-arm-based-hosted-runners-in-github-actions/

hasufell commented 6 months ago

The private beta is enabled in the haskell org. But those runners are not free.

mpilgrem commented 6 months ago

@benz0li, may I ask - the (statically-linked?) GHC for Alpine Linux/AArch64 that you build - are the binary distributions separately available for use on the OS, or are they only published as part of a Docker image?

mpilgrem commented 6 months ago

I implemented my 'local' macOS/AArch64 solution as a separate experimental Haskell script etc/scripts/release-linux-aarch64.hs.

benz0li commented 6 months ago

> @benz0li, may I ask - the (statically-linked?) GHC for Alpine Linux/AArch64 that you build

They are dynamically-linked

> are the binary distributions separately available for use on the OS, or are they only published as part of a Docker image?

and only available as part of the docker images.

ℹ️ I have no plans to upload the bindists to a permanent public storage.


That is the reason for /etc/stack.yaml

# Use only the GHC available on the PATH
system-ghc: true
# Do not automatically install GHC when necessary
install-ghc: false

in the Dev Containers. Use --no-install-ghc --system-ghc with the docker images.
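
As an illustration of passing those flags, here is a minimal sketch of a wrapper that runs Stack inside one of the images. It assumes a POSIX shell, that the image has stack on its PATH, and that the image defines no entrypoint that alters the command. The function name stack_in_image, the /work mount point, and the DRY_RUN variable are all hypothetical.

```shell
# Hypothetical wrapper: run Stack inside an image using the image's own GHC.
stack_in_image() {
  image=$1
  shift
  # Mount the current directory at /work (an arbitrary choice) and
  # forward any remaining arguments to Stack.
  set -- docker run --rm -v "$PWD:/work" -w /work "$image" \
    stack --system-ghc --no-install-ghc "$@"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"   # only print the command that would be run
  else
    "$@"
  fi
}

# Example use (commented out so the snippet has no side effects):
# DRY_RUN=1 stack_in_image quay.io/benz0li/ghc-musl:9.6.4 build
```

Setting DRY_RUN=1 prints the docker command instead of running it, which makes it easy to inspect before use.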

Cross Reference: https://github.com/commercialhaskell/stack/issues/6141#issuecomment-1575595293 ff

mpilgrem commented 6 months ago

@benz0li, noted. My reason for asking is that I tried to use the GHC project's own GHC 9.8.2 for Alpine Linux 3.18/AArch64 in an Alpine Linux 3.19.1 VM on macOS/AArch64 to build a statically-linked Stack, but it fell over. (It worked fine when building a dynamically-linked Stack.) However, I can now build a statically-linked Stack for Linux/AArch64 if I use your Docker images on a Mac mini/M1 locally.

benz0li commented 6 months ago

> My reason for asking is that I tried to use the GHC project's own GHC 9.8.2 for Alpine Linux 3.18/AArch64 in an Alpine Linux 3.19.1 VM on macOS/AArch64 to build a statically-linked Stack, but it fell over.

I was hoping that the official Alpine Linux[/AArch64] releases would work correctly^1 – and make my project obsolete.

As long as this is not the case I will continue to maintain https://github.com/benz0li/ghc-musl.

benz0li commented 1 week ago

An alternative might be building the image under emulation with QEMU.

Cross references:

This allows building linux/arm64 images on x86_64 by passing --platform linux/arm64 to the docker build command.


> Emulation with QEMU can be much slower than native builds, especially for compute-heavy tasks like compilation and compression or decompression.

Multi-platform | Docker Docs > Strategies > QEMU
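
To make the emulation route concrete, here is a minimal sketch of deciding whether QEMU is needed for a given host and target. It assumes a POSIX shell; the function names docker_platform and needs_emulation are hypothetical, and only the machine names listed are handled. The docker commands in the trailing comments follow the Docker multi-platform documentation and have not been tested against this repository.

```shell
# Hypothetical helper: map a 'uname -m' style machine name to a Docker platform.
docker_platform() {
  case "$1" in
    x86_64|amd64)  echo linux/amd64 ;;
    aarch64|arm64) echo linux/arm64 ;;
    riscv64)       echo linux/riscv64 ;;
    *) echo "unsupported machine: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: succeeds if building for target ($2) on host ($1)
# would require emulation. Returns 2 if either name is not recognised.
needs_emulation() {
  host=$(docker_platform "$1") || return 2
  target=$(docker_platform "$2") || return 2
  [ "$host" != "$target" ]
}

# Example use (commented out so the snippet has no side effects):
# if needs_emulation "$(uname -m)" aarch64; then
#   # Register QEMU handlers for arm64 binaries on the host.
#   docker run --privileged --rm tonistiigi/binfmt --install arm64
# fi
# docker build --platform "$(docker_platform aarch64)" .
```

On a native AArch64 host the check fails, so the QEMU registration step is skipped and the build runs at native speed.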

benz0li commented 1 week ago

@mpilgrem This is how I build Stack v3.1.1[^1] on AArch64 for the glcr.b-data.ch/ghc/ghc-musl:9.10.1-linux-riscv64 image.

[^1]: using glcr.b-data.ch/ghc/ghc-musl:9.8.2-linux-riscv64

Cross reference:

benz0li commented 1 week ago

ℹ️ I may start building and releasing unofficial and untested RISC-V (64-bit) release assets at https://gitlab.b-data.ch/commercialhaskell/stack/-/releases until official RISC-V release assets are available.

benz0li commented 1 day ago

> My reason for asking is that I tried to use the GHC project's own GHC 9.8.2 for Alpine Linux 3.18/AArch64 in an Alpine Linux 3.19.1 VM on macOS/AArch64 to build a statically-linked Stack, but it fell over.

@mpilgrem Due to https://gitlab.haskell.org/ghc/ghc/-/issues/25093?