starkware-libs / stone-prover

Apache License 2.0

Build on macOS and aarch64 issue #15

Closed hel-kame closed 3 months ago

hel-kame commented 11 months ago

Here's a list of issues related to the native build on macOS and aarch64 that can cause compilation problems and dependency issues:

    #define htobe16(x) OSSwapHostToBigInt16(x)
    #define htole16(x) OSSwapHostToLittleInt16(x)
    #define be16toh(x) OSSwapBigToHostInt16(x)
    #define le16toh(x) OSSwapLittleToHostInt16(x)
    #define htobe32(x) OSSwapHostToBigInt32(x)
    #define htole32(x) OSSwapHostToLittleInt32(x)
    #define be32toh(x) OSSwapBigToHostInt32(x)
    #define le32toh(x) OSSwapLittleToHostInt32(x)
    #define htobe64(x) OSSwapHostToBigInt64(x)
    #define htole64(x) OSSwapHostToLittleInt64(x)
    #define be64toh(x) OSSwapBigToHostInt64(x)
    #define le64toh(x) OSSwapLittleToHostInt64(x)
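
The byte-order macros listed above could be wrapped in a small platform guard so the same source builds on both macOS and Linux. This is only a sketch and not code from the stone-prover tree: the helpers `store_be32` and `load_be32` are hypothetical names used for illustration, and only the 32-bit macros are repeated for brevity. The `OSSwap*` macros come from `<libkern/OSByteOrder.h>` on macOS.

```c
/* Ask glibc to expose htobe32()/be32toh() even under strict -std=c11. */
#if !defined(__APPLE__) && !defined(_DEFAULT_SOURCE)
#define _DEFAULT_SOURCE
#endif

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__APPLE__)
/* macOS: map the Linux-style names onto the OSSwap* macros. */
#include <libkern/OSByteOrder.h>
#ifndef htobe32
#define htobe32(x) OSSwapHostToBigInt32(x)
#endif
#ifndef be32toh
#define be32toh(x) OSSwapBigToHostInt32(x)
#endif
#else
/* Linux (glibc and similar): the names are provided directly. */
#include <endian.h>
#endif

/* Hypothetical helper: write v to out in big-endian byte order. */
void store_be32(uint8_t out[4], uint32_t v) {
    assert(out != NULL);
    uint32_t be = htobe32(v);
    memcpy(out, &be, sizeof be);
}

/* Hypothetical helper: read a big-endian 32-bit value from in. */
uint32_t load_be32(const uint8_t in[4]) {
    assert(in != NULL);
    uint32_t be;
    memcpy(&be, in, sizeof be);
    return be32toh(be);
}
```

With a guard like this, callers use `htobe32` and friends without caring which platform supplied them.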



- **`<mmintrin.h>`** in [third_party/blake2/blake2s.c](https://github.com/starkware-libs/stone-prover/blob/main/src/third_party/blake2/blake2s.c) : <br />
The **`<mmintrin.h>`** header cannot be used on **`aarch64`**. It is specific to **`x86`** and **`x86_64`** processors and relies on their **`SSE`** and **`MMX`** instructions.

- Parallel compilation using **`make -j8`** led to compilation errors in my case. I had to reduce it to **`-j6`**, probably because of the limited system resources on my computer.
- Sometimes the **`ctest -V`** command freezes, seemingly at random, on the `"Global test environment tear-down"` step of a test. I'm still investigating this behavior so I can describe the problem more precisely.
- Downgrading from clang to clang-12 on macOS is much more complex than installing it through a package manager. Dependency installation is also completely different from BSD/Linux distributions, so the script has to be adapted to the package manager and distribution in use (refer to [install_deps.sh](https://github.com/starkware-libs/stone-prover/blob/main/install_deps.sh)).
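
For the **`<mmintrin.h>`** item above, one common approach is to include the x86 intrinsics headers only when the compiler targets x86, and fall back to plain C elsewhere. The sketch below is not taken from `blake2s.c`. It uses SSE2 (`<emmintrin.h>`) rather than MMX as the guarded path, and `xor_block16` is a hypothetical helper that only illustrates the guard pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
/* Only pulled in when the compiler targets x86 / x86_64 with SSE2. */
#include <emmintrin.h>
#endif

/* Hypothetical helper: XOR the 16 bytes of src into dst. */
void xor_block16(uint8_t dst[16], const uint8_t src[16]) {
    assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
    /* x86 path: unaligned loads, one 128-bit XOR, unaligned store. */
    __m128i a = _mm_loadu_si128((const __m128i *)dst);
    __m128i b = _mm_loadu_si128((const __m128i *)src);
    _mm_storeu_si128((__m128i *)dst, _mm_xor_si128(a, b));
#else
    /* Portable path, used on aarch64 and any other non-x86 target. */
    for (size_t i = 0; i < 16; ++i) {
        dst[i] ^= src[i];
    }
#endif
}
```

Both paths produce the same result, so an aarch64 build simply compiles the scalar loop and never sees the x86 header.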

The **`Dockerfile`** works for me on an M1 Mac (**`aarch64`**), and the same goes for other similar machines. I think the problem almost certainly lies in the **`qemu`** emulation used by Docker: the overhead it adds slows the Docker build drastically, to 20-30 minutes on average. Even so, it is still the quickest option to get working as things stand. I think it's best to start by working out what causes the emulation errors, and why the build succeeds on some machines but fails on others with a similar architecture and configuration.

hel-kame commented 11 months ago

I've just tested an alternative on several arm64 machines running macOS, and it reliably fixes the illegal-instruction problem in the Docker build.

Since version 4.25, Docker Desktop can use Rosetta instead of QEMU to run x86/x86_64/amd64 binaries. QEMU drastically slows down performance and consumes a lot of resources on macOS.

The only requirement for using Rosetta with Docker Desktop is Docker Desktop ≥ 4.25. Check that Rosetta is installed on your machine, or install or update it by running the following command from the command line:

    softwareupdate --install-rosetta

To enable Rosetta emulation, follow these steps:

  1. Navigate to Docker Desktop Settings.
  2. General > Select "Use Rosetta for x86/amd64 on Apple Silicon".
  3. Don't forget to Apply and Restart.

More than just a quick fix, this considerably increases Docker build speed. I'm still trying to figure out what the problem with QEMU is and what could be causing the illegal instruction under that emulator. Feel free to leave your feedback on this solution!

hel-kame commented 11 months ago

I'd just like to add a clarification from the official Docker documentation. (https://docs.docker.com/desktop/troubleshoot/known-issues/)

[Screenshot, 2023-12-11 12:14:22]

And the same documentation recommends using Rosetta for Macs with Apple Silicon. I think it's best to either go for a solution that solves the problems of the native build, or switch to Rosetta to use the Dockerfile.