NixOS / nixpkgs

Nix Packages collection & NixOS

Cross compiling to x86_64-darwin from aarch64-darwin seems broken #180771

Closed lf- closed 1 month ago

lf- commented 2 years ago

Describe the bug

It currently seems impossible to cross compile for x86_64-darwin from aarch64-darwin, since the darwin SDK for x86_64 is old. #176661 provides a way to hack the stdenv into using the correct toolchain, but I have no idea how to integrate it with Rust.

I am testing against master, which should have that patch.

Specifically, I'm trying to build a Rust hello world for x86_64, which is trivial with cargo build --target x86_64-apple-darwin but I can't figure out how to get nixpkgs to generate that invocation with the cargoBuildHook. I've tried pkgsCross.myPackage, but that's maybe the wrong approach?
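
For reference, a minimal sketch of what the pkgsCross route would look like for a Rust package once Darwin cross works. The crate layout (`./.` with a `Cargo.lock`) and the `hello` name are hypothetical. As far as I understand it, the cargo build hook derives its `--target` argument from the host platform of the package set, so nothing target-specific should need to be set in the derivation itself.

```nix
# default.nix: hypothetical hello-world crate living next to this file.
{ pkgs ? import <nixpkgs> { } }:

let
  # Package set whose host platform is x86_64-darwin.
  crossPkgs = pkgs.pkgsCross.x86_64-darwin;
in
crossPkgs.rustPlatform.buildRustPackage {
  pname = "hello";        # hypothetical crate name
  version = "0.1.0";
  src = ./.;
  # Reuse the crate's own lock file instead of a vendored hash.
  cargoLock.lockFile = ./Cargo.lock;
}
```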

Steps To Reproduce

Steps to reproduce the behavior:

nix-build --show-trace -A pkgsCross.x86_64-darwin.cargo
error: don't yet have a `targetPackages.darwin.LibsystemCross for x86_64-apple-darwin`

Expected behavior

Can build rust programs for x86_64-darwin

Notify maintainers

@reckenrode @thefloweringash

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

dev/nixpkgs - [master●] » nix-shell -p nix-info --run "nix-info -m"

 - system: `"aarch64-darwin"`
 - host os: `Darwin 21.5.0, macOS 12.4`
 - multi-user?: `yes`
 - sandbox: `no`
 - version: `nix-env (Nix) 2.9.1`
 - channels(root): `"nixpkgs"`
 - channels(jade): `""`
 - nixpkgs: `/nix/store/laymyq0d4724hm86igwdbsv0vl1hxfgz-nixpkgs-src`
zaldnoay commented 2 years ago

Confirm that other packages have this problem as well.

nix-repl> pkgs.pkgsCross.x86_64-darwin.wget      
error: don't yet have a `targetPackages.darwin.LibsystemCross for x86_64-apple-darwin`
«derivation 
nix-repl> pkgs.pkgsCross.x86_64-darwin.curl
error: don't yet have a `targetPackages.darwin.LibsystemCross for x86_64-apple-darwin`
«derivation 
nix-repl> pkgs.pkgsCross.x86_64-darwin.python
error: don't yet have a `targetPackages.darwin.LibsystemCross for x86_64-apple-darwin`
«derivation

It looks like LibsystemCross is missing, and LibsystemCross is only found in apple-sdk-11.0. Is there any way to switch apple-sdk to apple-sdk-11.0?

Edit 1: The aarch64-darwin SDK has LibsystemCross, which means it uses the 11.0 SDK, but the x86_64-darwin one doesn't. This line of code shows that aarch64 uses version 11.0 of the SDK and x86_64 uses an older version. So overriding the SDK version should fix this problem?

lf- commented 2 years ago

That would be a full stdenv rebuild and also break Nix on old macOS versions. I would support it regardless though. Testing whether this would work would be a hefty compilation session, I expect. I'll have a light try at it today and see if I can get it any further.

I don't think I have time to follow through with a full patch set as we ended up solving the problem this caused at work by not using Nix for the relevant project.

reckenrode commented 2 years ago

I have a bootstrapping 11.0 stdenv on x86_64-darwin that is waiting on #181550 before I can open a PR (since it needs an updated bootstrap-tools). It is currently targeted at selectively using the 11.0 SDK when the 10.12 SDK won’t work. As far as I am aware, the plan for nixpkgs in general is still to work towards bumping the source-based SDK rather than switch to the Apple SDK.

zaldnoay commented 2 years ago

@lf- Is it possible to override the SDK version in the cross-compiling environment of an aarch64-darwin host shell? Could this be a workaround for the problem?

Strum355 commented 2 years ago

anyone any workarounds until https://github.com/NixOS/nixpkgs/pull/180931 lands and/or has a rough timeline when https://github.com/NixOS/nixpkgs/pull/180931 will land? This is blocking me on a C++ codebase as well

reckenrode commented 2 years ago

Unfortunately, I haven’t had the bandwidth to get back to #180931. I can take a look at splitting some of the packages fixes into separate PRs, but the bootstrap-tools/stdenv stuff definitely won’t happen for 22.11.

anyone any workarounds until #180931 lands and/or has a rough timeline when #180931 will land? This is blocking me on a C++ codebase as well

Do you need actually cross-built tools or just access to the x86_64-darwin versions? If the latter, you can use pkgsx86_64Darwin to access x86_64-darwin packages from aarch64-darwin (e.g., pkgsx86_64Darwin.cargo).

Strum355 commented 2 years ago

Do you need actually cross-built tools or just access to the x86_64-darwin versions? If the latter, you can use pkgsx86_64Darwin to access x86_64-darwin packages from aarch64-darwin (e.g., pkgsx86_64Darwin.cargo).

I'm not entirely clued into the distinction here, so I'll give a quick tl;dr of what I'm trying to do. I'm trying to cross-compile https://github.com/salesforce/p4-fusion for x86_64 from aarch64, as one of the dependencies (helix-core-api) only provides .h and .a files for x86_64 targets, and then run the resulting x86_64 binary transparently via Rosetta. It has some other system dependencies (zlib, libiconv, etc.) that also need to be provided as inputs.

Current unrefined (and definitely not functional yet) flake:

flake.nix

```nix
{
  description = "sample text";

  outputs = { self, nixpkgs }: {
    packages.aarch64-darwin.default =
      let
        pkgs = nixpkgs.legacyPackages.aarch64-darwin.pkgsCross.x86_64-darwin;
      in
      pkgs.clangStdenv.mkDerivation rec {
        name = "p4-fusion";
        version = "v1.12";
        srcs = [
          (pkgs.fetchFromGitHub {
            inherit name;
            owner = "salesforce";
            repo = "p4-fusion";
            rev = "3ee482466464c18e6a635ff4f09cd75a2e1bfe0f";
            sha256 = "sha256-rUXuBoXuOUanWxutd7dNgjn2vLFvHQ0IgCIn9vG5dgs=";
          })
          (pkgs.fetchzip {
            name = "helix-core-api";
            url = "https://cdist2.perforce.com/perforce/r21.1/bin.macosx1015x86_64/p4api.tgz";
          })
        ];
        sourceRoot = name;
        preBuild = ''
          mkdir -p ./vendor/helix-core-api/mac
          cp -R ../helix-core-api ./vendor/helix-core-api/mac
        '';
        nativeBuildInputs = with pkgs; [
          openssl cmake libiconv pkg-config pcre libssh2 zlib gss openssh_gssapi p4
        ];
        buildInputs = with pkgs; [
          openssl libiconv pcre libssh2 zlib gss openssh_gssapi p4
        ];
      };
  };
}
```

reckenrode commented 2 years ago

I'm not entirely clued into the distinction here, so I'll give a quick tl;dr of what I'm trying to do. I'm trying to cross-compile https://github.com/salesforce/p4-fusion for x86_64 from aarch64, as one of the dependencies (helix-core-api) only provides .h and .a files for x86_64 targets, and then run the resulting x86_64 binary transparently via Rosetta. It has some other system dependencies (zlib, libiconv, etc.) that also need to be provided as inputs.

Thanks for the explanation. The difference is pkgsCross uses an aarch64-darwin to x86_64-darwin cross-compiler while pkgsx86_64Darwin uses x86_64-darwin packages running under Rosetta 2.
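
To make the distinction concrete, here is a small sketch that reports the build and host platform of each package set. The file name `platforms.nix` and the `describe` helper are only for illustration. On an aarch64-darwin machine I would expect `cross` to report an aarch64-darwin build platform with an x86_64-darwin host platform, and `rosetta` to report x86_64-darwin for both.

```nix
# platforms.nix (hypothetical file name); evaluate with
#   nix-instantiate --eval --strict platforms.nix
let
  pkgs = import <nixpkgs> { system = "aarch64-darwin"; };

  # Summarise where a package set's compiler runs (build) and which
  # platform the resulting binaries run on (host).
  describe = p: {
    build = p.stdenv.buildPlatform.system;
    host = p.stdenv.hostPlatform.system;
  };
in
{
  cross = describe pkgs.pkgsCross.x86_64-darwin;
  rosetta = describe pkgs.pkgsx86_64Darwin;
}
```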

Current unrefined (and definitely not functional yet) flake:

flake.nix

There are a couple of ways you can achieve what you are trying to do without using pkgsCross. One would be to use pkgs.pkgsx86_64Darwin where you use pkgs currently (e.g., pkgs.pkgsx86_64Darwin.clangStdenv.mkDerivation, etc.). The other is to write your derivation for x86_64-darwin and pass --system x86_64-darwin to nix build.
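
A minimal sketch of the first option, assuming a hypothetical CMake project in `./.` that links zlib. The package name, version, and inputs are placeholders rather than anything taken from p4-fusion.

```nix
{
  description = "sketch: x86_64-darwin build exposed on aarch64-darwin";

  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";

  outputs = { self, nixpkgs }:
    let
      pkgs = nixpkgs.legacyPackages.aarch64-darwin;
      # Native x86_64-darwin package set; its builds run under Rosetta 2.
      x86Pkgs = pkgs.pkgsx86_64Darwin;
    in
    {
      packages.aarch64-darwin.default = x86Pkgs.clangStdenv.mkDerivation {
        pname = "example";   # hypothetical package
        version = "0.1.0";
        src = ./.;
        nativeBuildInputs = [ x86Pkgs.cmake ];
        buildInputs = [ x86Pkgs.zlib ];
      };
    };
}
```

With the second option the derivation would instead live under `packages.x86_64-darwin` and be built with `nix build --system x86_64-darwin`.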

You can also assign the x86_64-darwin package to the aarch64-darwin output, which is what I did in the modified flake below. Note that I had to make a few changes to get it to build.

The LibreSSL thing is a terrible hack, but I wanted to avoid linking against OpenSSL 1.0.2 if I could because it’s marked as insecure in nixpkgs and unsupported upstream.

flake.nix

```nix
{
  description = "sample text";

  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";

  outputs = { self, nixpkgs }: {
    packages.aarch64-darwin.default = self.packages.x86_64-darwin.default;
    packages.x86_64-darwin.default =
      let
        pkgs = import nixpkgs { system = "x86_64-darwin"; };
      in
      pkgs.clangStdenv.mkDerivation rec {
        name = "p4-fusion";
        version = "v1.12";
        srcs = [
          (pkgs.fetchFromGitHub {
            inherit name;
            owner = "salesforce";
            repo = "p4-fusion";
            rev = "3ee482466464c18e6a635ff4f09cd75a2e1bfe0f";
            hash = "sha256-rUXuBoXuOUanWxutd7dNgjn2vLFvHQ0IgCIn9vG5dgs=";
          })
          (pkgs.fetchzip {
            name = "helix-core-api";
            url = "https://cdist2.perforce.com/perforce/r21.1/bin.macosx1015x86_64/p4api.tgz";
            hash = "sha256-KctrQcglwEHav+9m7ipw0fX4dds079/TVFlKONYlQeQ=";
          })
        ];
        sourceRoot = name;
        postUnpack = ''
          echo 'extern "C" void SSL_COMP_free_compression_methods(void) { }' > $sourceRoot/p4-fusion/libressl.cc
        '';
        preBuild = ''
          mkdir -p $NIX_BUILD_TOP/$sourceRoot/vendor/helix-core-api/mac
          cp -R $NIX_BUILD_TOP/helix-core-api/* $NIX_BUILD_TOP/$sourceRoot/vendor/helix-core-api/mac
        '';
        buildInputs = with pkgs; with darwin.apple_sdk.frameworks; [
          cmake
          pkg-config
          libressl
          libiconv
          pcre
          (libssh2.override { openssl = libressl; })
          zlib
          gss
          openssh_gssapi
          p4
          CFNetwork
          Cocoa
        ];
        postInstall = ''
          mkdir -p "$out/bin"
          cp p4-fusion/p4-fusion "$out/bin/p4-fusion"
          install_name_tool "$out/bin/p4-fusion" \
            -change '@rpath/libgit2.1.4.dylib' "$out/lib/libgit2.1.4.dylib"
        '';
      };
  };
}
```

ShamrockLee commented 1 year ago

anyone any workarounds until #180931 lands and/or has a rough timeline when #180931 will land? This is blocking me on a C++ codebase as well

There's an SDK packaged along Darling, a Darwin/macOS emulation layer for Linux. The author of the PR that packages darling demonstrates its cross-build ability in one PR comment https://github.com/NixOS/nixpkgs/pull/227765#issuecomment-1530593464.

Maybe someone could add a pkgsCross target called x86_64-darling using that SDK.

Cc: @zhaofengli

reckenrode commented 1 year ago

#180931 is pretty much dead. I need to pull the fix commits out into separate PRs that are still applicable, but I don’t think I’ll be going forward with that approach. My current focus is on bumping LLVM in #234710 and making that process more sustainable. After that, I have other things I need to look at (e.g., updating DXVK, landing some Wine changes, looking at Mesa on Darwin) first, then I might be able to pull out those PRs, but that won’t help with this issue.

My recommendation on aarch64-darwin is to use pkgsx86_64Darwin to run things under Rosetta if you need x86_64-darwin binaries. It’s not ideal, but there’s still more work needing to be done on Darwin cross, which I’m not sure anyone is actively doing at the moment.

anka-213 commented 1 year ago

I'm running into this issue when trying to use static builds on x86_64-darwin using pkgsStatic. It's technically not cross-compilation as far as I could tell, so I wonder if it would be possible to just use Libsystem directly instead of LibsystemCross, but when I try that I get infinite recursion in nix which I have no clue how to debug.

reckenrode commented 1 year ago

Static builds are implemented as a cross build to a system with isStatic set to true, so they technically are cross builds.

The problem with x86_64-darwin static builds is the same as regular cross builds. The source-based SDK needs some work done to make cross builds work. Once that’s done, I expect static builds for x86_64-darwin should also just work.
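
A quick way to see the flag in question is to compare the host platform of the regular package set with that of pkgsStatic. This is only a sketch; I would expect `native` to be false and `static` to be true, but have not checked it against every nixpkgs revision.

```nix
# Compare the isStatic flag of the default and the static package sets.
let
  pkgs = import <nixpkgs> { };
in
{
  native = pkgs.stdenv.hostPlatform.isStatic;
  static = pkgs.pkgsStatic.stdenv.hostPlatform.isStatic;
}
```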

anka-213 commented 1 year ago

So do you know why making LibsystemCross an alias for Libsystem works fine on aarch64-darwin (which uses apple-sdk-11.0) but causes an infinite recursion on x86_64-darwin (which gets Libsystem from apple-source-releases)? Also, is there a better way to debug infinite recursion than --show-trace, which doesn't seem to want to show the actual cause of the recursion?

anka-213 commented 1 year ago

I think I figured out the source of the loop. The apple-source-releases version of Libsystem depends on stdenv.cc https://github.com/NixOS/nixpkgs/blob/ac08ee94ac593bda4dcc276f1f99c59bdba49362/pkgs/os-specific/darwin/apple-source-releases/Libsystem/default.nix#L171 which (I believe?) in turn depends on Libsystem, while the apple-sdk-11.0 version of Libsystem uses stdenvNoCC: https://github.com/NixOS/nixpkgs/blob/ac08ee94ac593bda4dcc276f1f99c59bdba49362/pkgs/os-specific/darwin/apple-sdk-11.0/libSystem.nix#L3 which prevents this recursion. I haven't figured out why cc depends strictly on it though, instead of just being lazily recursive, only in attributes, which would be fine, since we only need some attributes, not the full derivation.
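
The difference between a cycle that is only "lazily recursive in attributes" and one that is forced strictly can be shown with a toy example. The attribute sets below are stand-ins, not the real Libsystem or cc-wrapper definitions.

```nix
# Toy attribute sets standing in for Libsystem and the compiler wrapper.
# None of this is the real nixpkgs code.
let
  lazyCycle = rec {
    libsystem = { name = "Libsystem"; builtWith = cc.name; };
    cc = { name = "cc-wrapper"; libc = libsystem; };
  };

  strictCycle = rec {
    # String interpolation forces the other value completely.
    libsystem = "Libsystem-built-with-${cc}";
    cc = "cc-wrapper-using-${libsystem}";
  };
in
{
  # Fine: only cc.name is forced, never the whole cycle.
  ok = lazyCycle.libsystem.builtWith;
  # Forcing this attribute aborts with "infinite recursion encountered".
  broken = strictCycle.libsystem;
}
```

Evaluating only the `ok` attribute succeeds, while evaluating `broken` reproduces the kind of error seen here. A derivation that needs its compiler's store path behaves like the second case.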


Edit: Hmm, commenting out those dependencies on cc in Libsystem/default.nix gives me a bunch of errors elsewhere that stdenv.cc == null, so clearly something is very wrong. Maybe I should just leave this to someone who knows what they're doing.

reckenrode commented 1 year ago

So do you know why making LibsystemCross an alias for Libsystem works fine on aarch64-darwin (which uses apple-sdk-11.0) but causes an infinite recursion on x86_64-darwin (which gets Libsystem from apple-source-releases)?

The SDK structure is different between the 10.12 and 11.0 SDKs. The 11.0 SDK mostly just vendors stuff from the SDK while the 10.12 SDK actually builds stuff from the source releases. The stdenv has to jump through a few hoops to build stuff on the source-based SDK that’s a non-issue with the 11.0 SDK.

Also, is there a better way to debug infinite recursion than --show-trace, which doesn't seem to want to show the actual cause of the recursion?

Not really. Debugging infinite recursions sucks. I put off fixing the pkgsStatic issue with the clang 16 stdenv for about a month because I didn’t want to deal with it (working on fixing other packages instead).

Edit: If you’re using the new CLI, you can pass --debugger to nix build, which lets you inspect things, but you have to be careful not to touch the value that caused the infinite recursion (because it will cause another one).

reckenrode commented 1 year ago

I think I figured out the source of the loop. The apple-source-releases version of Libsystem depends on stdenv.cc

The reasoning in the comment seems suspect to me. Do we really need to match glibc, especially since aarch64-darwin doesn’t even use libresolv from nixpkgs (though it could)? I wonder if it would be good enough to include libresolv as a propagatedBuildInput. 🤔

reckenrode commented 1 year ago

Edit: Hmm, commenting out those dependencies on cc in Libsystem/default.nix gives me a bunch of errors elsewhere that stdenv.cc == null, so clearly something is very wrong. Maybe I should just leave this to someone who knows what they're doing.

That’s because the Darwin stdenv uses assertions to make sure packages have the correct provenance (from the bootstrap tools, built by them, built by the final compiler, etc.). On x86_64-darwin, it assumes that Libsystem is built with a regular stdenv. Changing it to build with stdenvNoCC means those assertions need updating.

After I made that change, it hits an infinite recursion in ncurses. That’s because Libsystem exports the headers from ncurses. If I drop ncurses from Libsystem, it starts building stuff until it fails in xnu. I’m looking at that now. If I can get past xnu, it seems like the rest should be relatively smooth sailing.
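
As a rough illustration of the provenance assertions mentioned above, here is a toy stage check. The names `checkStage`, `bootstrapCC`, `withCC`, and `noCC` are made up for this sketch, and the real checks in the Darwin stdenv are more involved.

```nix
let
  # Stand-in for the compiler from the bootstrap tools.
  bootstrapCC = { name = "bootstrap-clang"; };

  # Stage check in the spirit of the Darwin stdenv assertions: refuse to
  # continue unless Libsystem was built by the expected compiler.
  checkStage = prevStage:
    assert prevStage.libsystem.stdenv.cc == bootstrapCC;
    prevStage;

  withCC = { libsystem = { name = "Libsystem"; stdenv.cc = bootstrapCC; }; };
  noCC = { libsystem = { name = "Libsystem"; stdenv.cc = null; }; };
in
{
  passes = (checkStage withCC).libsystem.name;
  # Forcing this fails the assertion, much like switching Libsystem to
  # stdenvNoCC without updating the checks.
  fails = (checkStage noCC).libsystem.name;
}
```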

reckenrode commented 1 year ago

Replacing the target prefix with the following (copied from cctools but used elsewhere) fixes the xnu build failure and infinite recursion. I also cleaned up the derivation a bit. I’m going to take a break then let this build overnight.

targetPrefix = lib.optionalString (targetPlatform != hostPlatform) "${targetPlatform.config}-";
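
To show what that expression evaluates to, here is a small sketch using toy platform records that only carry a `config` field. The helper name `targetPrefixFor` is made up for the example.

```nix
let
  lib = import <nixpkgs/lib>;

  # Toy platforms; real platform attribute sets carry many more fields.
  arm = { config = "aarch64-apple-darwin"; };
  x86 = { config = "x86_64-apple-darwin"; };

  # Same expression as above, with the platforms passed in explicitly.
  targetPrefixFor = hostPlatform: targetPlatform:
    lib.optionalString (targetPlatform != hostPlatform) "${targetPlatform.config}-";
in
{
  native = targetPrefixFor arm arm; # ""
  cross = targetPrefixFor arm x86;  # "x86_64-apple-darwin-"
}
```
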
reckenrode commented 1 year ago

Status update: Almost there! Dropping libresolv from the cross-Libsystem is a non-option. Just trying to build pkgsStatic.hello fails at c-ares and pkg-config when you do that. I’ve hacked around the failures using buildPackages.stdenv, but I’m going to need a real solution for a PR.

From what I can tell, there are four packages with issues and possible solutions:

reckenrode commented 1 year ago

I should add that I’m also building this stuff with clang 16 because I want to make sure that the clang 16 update is not blocked by any more cross-related changes. I assume clang 11 will work and will test it once I have a PR ready to submit.

reckenrode commented 1 year ago

I was wrong. I had to use pkgsBuildBuild.darwin.bootstrap_cmds not buildPackages.darwin.bootstrap_cmds. Either way, the poc seems to have succeeded. CF doesn’t (yet) support static builds, but that can be fixed. Unfortunately, dynamic Darwin cross doesn’t work yet. There’s another infinite recursion I’ll look at once I have the bootstrap stdenv stuff in place.

$ nix build .#pkgsStatic.curl --system x86_64-darwin
$ ./result-bin/bin/curl --version
curl 8.2.1 (x86_64-apple-darwin) libcurl/8.2.1 OpenSSL/3.0.10 zlib/1.2.13 zstd/1.5.5 libidn2/2.3.4 libssh2/1.11.0 nghttp2/1.54.0
Release-Date: 2023-07-26
Protocols: dict file ftp ftps gopher gophers http https imap imaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS HSTS HTTP2 HTTPS-proxy IDN Largefile libz NTLM NTLM_WB SSL threadsafe TLS-SRP UnixSockets zstd
$ otool -L ./result-bin/bin/curl
./result-bin/bin/curl:
    @rpath/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1454.90.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.60.2)
$ file ./result-bin/bin/curl
./result-bin/bin/curl: Mach-O 64-bit executable x86_64
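
For anyone unsure about the buildPackages versus pkgsBuildBuild distinction mentioned above, this sketch prints the build, host, and target platform of each set as seen from a cross package set. The `describe` helper is made up for the example. For an aarch64-darwin to x86_64-darwin cross, I would expect buildPackages to target x86_64 while pkgsBuildBuild stays aarch64 throughout.

```nix
let
  pkgs = import <nixpkgs> { system = "aarch64-darwin"; };
  cross = pkgs.pkgsCross.x86_64-darwin;

  # Report the three platforms a package set is parameterised over.
  describe = p: {
    build = p.stdenv.buildPlatform.config;
    host = p.stdenv.hostPlatform.config;
    target = p.stdenv.targetPlatform.config;
  };
in
{
  buildPackages = describe cross.buildPackages;
  pkgsBuildBuild = describe cross.pkgsBuildBuild;
}
```
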
reckenrode commented 1 year ago

As posted on Matrix just now:

$ nix build .#pkgsCross.x86_64-darwin.pkgsStatic.curl
$ ./result-bin/bin/curl --version
curl 8.2.1 (x86_64-apple-darwin) libcurl/8.2.1 OpenSSL/3.0.10 zlib/1.2.13 zstd/1.5.5 libidn2/2.3.4 libssh2/1.11.0 nghttp2/1.54.0
Release-Date: 2023-07-26
Protocols: dict file ftp ftps gopher gophers http https imap imaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS HSTS HTTP2 HTTPS-proxy IDN IPv6 Largefile libz NTLM NTLM_WB SSL threadsafe TLS-SRP UnixSockets zstd
$ file ./result-bin/bin/curl
./result-bin/bin/curl: Mach-O 64-bit executable x86_64
$ otool -L ./result-bin/bin/curl
./result-bin/bin/curl:
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.60.2)
$ nix-store -qR ./result-bin
/nix/store/1p7cpw14w6p57mhnp03rpjy65llmkg76-openssl-static-x86_64-apple-darwin-3.0.10-etc
/nix/store/9dy1s8k2zgdj788gzydn0pk02safnc01-curl-static-x86_64-apple-darwin-8.2.1-bin

That’s cross-compiled from aarch64-darwin to x86_64-darwin (i.e., native compilers not Rosetta 2). The curl is statically linked against everything except for libSystem.B.dylib including the open-source CF. I have some cleanup I need to do, but I’ve made good progress. Building xnu’s headers was not easy or pretty. I’m glad every clang is a cross-compiler.

Once that’s done (fixing some of the hacks to use a reduced Libsystem instead), I want to look at Linux to Darwin if it’s not too bad. The last time I tried it (to aarch64-darwin), it was failing to link cctools.

reckenrode commented 1 year ago

I opened #256590 as draft but plan to set it to ready tomorrow once the build with clang 11 is complete.

link2xt commented 8 months ago

Just ran into this problem while trying to port Rust program builds for macOS to Nix: https://github.com/deltachat/deltachat-core-rust/pull/5326 Cross-compiling to Linux and even Windows from Linux works, but macOS builds are blocked by this issue.

n8henrie commented 6 months ago

Cross compiling from aarch64-darwin to other targets like the esp32 also seems to be broken due to this (same infinite recursion error).

EDIT: For others looking for a quick and dirty workaround in the meantime, I get successful builds pinning nixpkgs to 23.05.
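
A sketch of what such a pin could look like in a flake. The choice of the `nixpkgs-23.05-darwin` branch and the `hello` package are assumptions for illustration; substitute whatever revision and cross package you actually need.

```nix
{
  description = "sketch: pin nixpkgs to the 23.05 release as a stopgap";

  # Assumed branch name for the 23.05 Darwin channel.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-23.05-darwin";

  outputs = { self, nixpkgs }: {
    # Placeholder package; replace with the cross build that was failing.
    packages.aarch64-darwin.default =
      nixpkgs.legacyPackages.aarch64-darwin.hello;
  };
}
```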

reckenrode commented 1 month ago

https://github.com/NixOS/nixpkgs/pull/346043 contains the new Darwin SDK work that includes cross-compilation support from Darwin to Darwin (either architecture).

aviallon commented 1 month ago

Closed by https://github.com/NixOS/nixpkgs/pull/346043

link2xt commented 2 weeks ago

The build does not fail with infinite recursion anymore, but now I get an error "building for macOS-arm64 but attempting to link with file built for unknown-x86_64" from /nix/store/hv72hg695hkhypnym5adl8yk4ll4cqj5-x86_64-darwin-clang-wrapper-16.0.6/bin/x86_64-darwin-cc: https://github.com/deltachat/deltachat-core-rust/issues/6095#issuecomment-2480499104

lf- commented 2 weeks ago

I've seen that bug but I don't remember the cause of it. I think it was related to using the wrong linker or an old LTO bug? Maybe it made it into the lix bug tracker, maybe not.

link2xt commented 2 weeks ago

I will try to comment out all the changes to release build, like enabling LTO, and see if it helps.

EDIT: did not help.

reckenrode commented 2 weeks ago

It’s a Rust issue. Cross-compilation doesn’t seem to work right currently. That should be opened as a new issue.

link2xt commented 2 weeks ago

It’s a Rust issue. Cross-compilation doesn’t seem to work right currently. That should be opened as a new issue.

Does it also affect nixpkgs? If you know more, could you open an issue?

I am not sure if it is affecting all rust or just my usage of naersk and fenix.

reckenrode commented 2 weeks ago

Sorry, it’s a Rust in nixpkgs implementation issue. For some reason, it’s trying to link object files from different architectures when it shouldn’t be doing that.

kivikakk commented 6 days ago

I get the same issue (or very similar) trying a simple(?) crossbuild of comrak (which is about as simple a buildRustPackage as you can get; package.nix); e.g. in a nixpkgs checkout:

nix-build --option sandbox true -I nixpkgs=. --arg crossSystem '{ config = "x86_64-darwin"; }' -A comrak

The errors are something like:

= note: ld: warning: directory not found for option '-L/nix/store/26fh6nxq1x59440b0m86pf2hq6gmbhsz-clang-16.0.6-lib/x86_64-darwin/lib'
        ld: warning: ignoring file /private/tmp/nix-build-x86_64-darwin-rustc-1.82.0.drv-0/rustc-1.82.0-src/build/aarch64-apple-darwin/stage1-std/x86_64-apple-darwin/release/deps/std-2363eaa7ec35779b.std.d0532a8d8dde1e3f-cgu.05.rcgu.o, building for macOS-arm64 but attempting to link with file built for unknown-x86_64

followed by many variants of the second line, and then a long list of Undefined symbols for architecture arm64.

I wasn't able to find an existing issue, but I easily could've missed it; if no-one knows of one, I'll open one.
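
For anyone who prefers to reproduce this from a file, the nix-build invocation above should correspond roughly to the following expression, evaluated from the root of the nixpkgs checkout. The file name `repro.nix` is hypothetical.

```nix
# repro.nix (hypothetical file name), placed in the nixpkgs checkout.
let
  pkgs = import ./. {
    # Same cross target as passed via --arg crossSystem above.
    crossSystem = { config = "x86_64-darwin"; };
  };
in
pkgs.comrak
```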