r-hub / rhub

R-hub API client
https://r-hub.github.io/rhub/

Make images work in Rosetta 2 on macOS ARM #617

Open krlmlr opened 5 months ago

krlmlr commented 5 months ago

I'm getting

        29 Illegal instruction     (core dumped) | R_DEFAULT_PACKAGES= LC_COLLATE=C "${R_HOME}/bin/R" $myArgs --no-echo --args ${args}

when trying to run R CMD INSTALL . on macOS with Colima under Rosetta emulation. You mentioned earlier that this may be expected, but do you actually see a way to fix this? Or perhaps a way to work around it? Happy to share my setup instructions -- if it works there, it's likely to work everywhere (in the words of Frank Sinatra). Thanks!

gaborcsardi commented 5 months ago

IDK of any workarounds, apart from turning off Rosetta: https://github.com/docker/roadmap/issues/384

Emulating Fedora on arm64 is also very slow in general, on Linux as well, and much slower than emulating Ubuntu. I suspect that this is because Fedora is compiled for more modern CPUs, which are harder to emulate. E.g. an R build that takes about 10 minutes natively does not finish in 6 hours on emulated x86_64 Fedora. So if you turn off Rosetta and qemu happens to work (it might not, since qemu has a different set of similar issues), be prepared for it to be very, very slow.

We could also try to build arm64 images. The problem with that is that they sometimes do not reproduce the problems seen on CRAN.

krlmlr commented 5 months ago

Thanks. Colima is https://github.com/abiosoft/colima, an OSS alternative to Docker Desktop. I haven't tried with Docker Desktop yet.

Should we try with arm64 images for one platform, and evaluate? I see how this might give different results for valgrind and asan, but I can't even get past the R CMD INSTALL . stage.

gaborcsardi commented 5 months ago

> Thanks. Colima is abiosoft/colima, an OSS alternative to Docker Desktop. I haven't tried with Docker Desktop yet.

That should not matter, they both use qemu or Rosetta. This issue is with Rosetta.

> Should we try with arm64 images for one platform, and evaluate?

We can't build arm64 Fedora images on GHA, but you can try to build the R builds and images locally. Some will build OOTB. Other Dockerfiles or R builds have amd64/x86_64 hardwired, so you'd need to update those.
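
As an illustration of what removing a hardwired architecture could look like, here is a minimal shell sketch that derives the Docker platform from `uname -m`. The helper name `docker_arch` is hypothetical and is not taken from the rhub Dockerfiles or build scripts; it only assumes the usual naming, where the kernel reports `x86_64` or `aarch64` and Docker platform strings use `amd64` or `arm64`.

```shell
#!/bin/sh
# Hypothetical helper (not part of rhub): map a machine name as printed
# by `uname -m` to the architecture name used in Docker platform strings.
docker_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *)             echo "unsupported: $1" >&2; return 1 ;;
  esac
}

# Derive the platform for the current machine instead of hardwiring it.
if arch="$(docker_arch "$(uname -m)")"; then
  echo "linux/${arch}"
fi
```

The printed value could then be passed to `docker build --platform` or `docker run --platform`, so the same script would work on both x86_64 and arm64 hosts.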

krlmlr commented 5 months ago

Can't we build arm64 on the macOS runners? Agree that test-driving locally is better. I can take a stab if the itch starts hurting badly.

gaborcsardi commented 5 months ago

> Can't we build arm64 on the macOS runners?

AFAICT they cannot run Docker.

krlmlr commented 5 months ago

Frustrating.

What if we built additional Ubuntu variants for all images to support this use case? What differences do you expect compared to Fedora? Just want to see if that makes any sense.

gaborcsardi commented 5 months ago

That's obviously extra work. Newer versions of gcc are easily available on Fedora but not on Ubuntu, and the custom BLAS/LAPACK issues typically do not reproduce on Ubuntu (or on arm64, for that matter). These are the reasons I started using Fedora, in addition to Ubuntu.

krlmlr commented 5 months ago

Desperate call: brew install orbstack. Is this likely to work?

> OrbStack includes many fixes for common Rosetta bugs that affect other solutions, so it's unlikely that you'll run into any issues with this enabled.

https://docs.orbstack.dev/settings#use-rosetta-to-run-intel-code

krlmlr commented 5 months ago

Works for me: running R CMD INSTALL . on clang19 now.

krlmlr commented 5 months ago

Trying gcc14 now.

krlmlr commented 5 months ago

Not even an error message 😭 ...

krlmlr commented 5 months ago

https://github.com/orbstack/orbstack/issues/1252 . Let's see.

Thank you for your patient explanations!

gaborcsardi commented 5 months ago

Yes, orbstack probably uses qemu instead of Rosetta. You can probably turn off Rosetta for colima as well, if you prefer that: https://github.com/abiosoft/colima/pull/555

gaborcsardi commented 5 months ago

GH is planning to roll out free arm64 Linux runners for open source repos, if I understand correctly. Then we'll be able to build arm64 images much more easily.

krlmlr commented 5 months ago

Okay. I think I understand better now.

I tried Arch Linux today. AFAIR, it is one of the distros that aims at supporting bleeding-edge components. The archlinux:latest image comes with gcc 14 preinstalled.

# gcc --version
gcc (GCC) 14.1.1 20240522
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I managed to install and start R, all on x86_64 with Rosetta 2:

# uname -a
Linux b2f4b1cff060 6.7.12-orbstack-00202-g57474688ffbd #1 SMP Mon May 27 09:21:34 UTC 2024 x86_64 GNU/Linux

Could we achieve this goal in ArchLinux (or in another distro)?

Full command line (I have DOCKER_DEFAULT_PLATFORM=linux/amd64 on my system):

docker run --rm -ti archlinux sh -c 'pacman -Sy --noconfirm r icu gcc && uname -a && gcc --version && R -q -e TRUE'

gaborcsardi commented 5 months ago

That is not so simple, for a couple of reasons.

First, we don't have system dependency support for Arch Linux.

Ideally we would also create binary packages at https://github.com/r-hub/repos, otherwise it takes forever to compile all dependencies from source.

Arch Linux has a rolling release model, which means that the binary R packages would need to be rebuilt frequently.

Also, I suspect that some of the CRAN errors only happen on AVX (etc.) builds of BLAS/LAPACK. This is why we cannot reproduce them on Ubuntu or on arm64, only on x86_64 Fedora.
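
To see which AVX-family instruction sets a container's (possibly emulated) CPU advertises, something like the following sketch could be run inside an x86_64 Linux container. The function name `avx_flags` is made up for this example. It assumes a Linux `/proc/cpuinfo` with a `flags` line, which arm64 kernels do not provide, and it says nothing about how the BLAS/LAPACK in a given image was actually built.

```shell
#!/bin/sh
# Hypothetical helper: read cpuinfo-style text on stdin and print the
# distinct flags starting with "avx", space-separated, in sorted order.
avx_flags() {
  tr -s ' \t' '\n\n' | grep -E '^avx' | LC_ALL=C sort -u | paste -sd ' ' -
}

# On x86_64 Linux, inspect the first "flags" line of /proc/cpuinfo.
if [ -r /proc/cpuinfo ]; then
  grep -m1 '^flags' /proc/cpuinfo | avx_flags
fi
```

An empty result means that no AVX flags are reported, for example on arm64 or under an emulator that does not expose them.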

krlmlr commented 5 months ago

Let's wait for the arm64 builders, then. Are you planning to add binary packages for arm64?

gaborcsardi commented 5 months ago

IDK, maybe. But again, arm64 containers will not reproduce some of the CRAN issues, so I am not sure if they'd be useful.

krlmlr commented 5 months ago

I hear that. There are two kinds of gcc14/clang19 issues:

I'll leave the issue open for now.

gaborcsardi commented 5 months ago

FWIW qemu is not a joyride, either:

> pak::pak("mockery")
! Failed to update system requirement mappings, will use cached mappings.

→ Will install 1 package.
→ The package (20.75 kB) is cached.
+ mockery   0.4.4 [bld]
✔ All system requirements are already installed.

ℹ No downloads are needed, 1 pkg (20.75 kB) is cached
ℹ Building mockery 0.4.4
✖ Failed to build mockery 0.4.4 (6.7s)
Error:
! error in pak subprocess
Caused by error in `stop_task_build(state, worker)`:
! Failed to build source package mockery.
Full installation output:
* installing *source* package ‘mockery’ ...
** package ‘mockery’ successfully unpacked and MD5 sums checked
staged installation is only possible with locking
** using non-staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
ERROR: lazy loading failed for package ‘mockery’
* removing ‘/tmp/Rtmp39pfVQ/pkg-lib6a7fcf407e/mockery’
Type .Last.error to see the more details.