NixOS / nixpkgs

Overlays have incorrect `stdenv.hostPlatform` starting in NixOS 24.05 #325318

Open scottbot95 opened 3 months ago

scottbot95 commented 3 months ago

Describe the bug

Starting with NixOS 24.05, the value of `prev.stdenv.hostPlatform.system` inside an overlay falls back to `i686-linux` during parts of the evaluation, even though the actual host platform is set to `x86_64-linux`, when building a NixOS configuration.

Steps To Reproduce

Steps to reproduce the behavior:

  1. Create a simple flake with a nixpkgs input using at least NixOS 24.05
  2. Within a nix repl:
    1. Run: `nixosSystem = module: inputs.nixpkgs.lib.nixosSystem { system = "x86_64-linux"; modules = [ module ]; }`
    2. Run: `nixosConfig = nixosSystem { nixpkgs.overlays = [(_final: prev: builtins.trace prev.stdenv.hostPlatform.system {})]; }`
    3. Run: `nixosConfig.config.system.build.toplevel`
    4. Observe that `trace: x86_64-linux` is printed several times, but at some point the output switches to `trace: i686-linux`, which is then printed many more times
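
The steps above could be captured in a flake roughly like the following. This is a minimal sketch: the `repro` configuration name and the placeholder `boot.loader` and `fileSystems` values are assumptions added only so the configuration passes the usual NixOS assertions; they are not part of the original report.

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";

  outputs = { self, nixpkgs }: {
    # "repro" is a hypothetical name chosen for this sketch.
    nixosConfigurations.repro = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        {
          # Overlay that adds no packages and only traces the platform it sees.
          nixpkgs.overlays = [
            (_final: prev: builtins.trace prev.stdenv.hostPlatform.system { })
          ];

          # Placeholder values (assumptions), present only to satisfy assertions.
          boot.loader.grub.device = "nodev";
          fileSystems."/" = {
            device = "/dev/disk/by-label/nixos";
            fsType = "ext4";
          };
        }
      ];
    };
  };
}
```

Evaluating `nix eval .#nixosConfigurations.repro.config.system.build.toplevel.drvPath` should then print the trace lines without building anything.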

Expected behavior

All invocations of the overlay should have stdenv.hostPlatform set correctly to the system provided in the call to nixosSystem.

Additional context

From my poking around with this I have found the following:

Notify maintainers

The issue appears to be at a fairly low level within NixOS itself (or perhaps the module/overlay system?). Not sure who to put here.

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

$ nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 5.15.133.1-microsoft-standard-WSL2, NixOS, 24.05 (Uakari), 24.05.20240704.c0d0be0`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.18.4`
 - channels(root): `"nixos-23.11, nixos-wsl"`
 - nixpkgs: `/home/scott/.nix-defexpr/channels/nixpkgs`

(note I have tested this on a number of different nix versions and host OSs/architecture. The only commonality seems to be what version of NixOS I am using in the flake input)



eclairevoyant commented 3 months ago

I'm curious about your usecase: why would you pull system from prev instead of final?

scottbot95 commented 3 months ago

> I'm curious about your usecase

This is in relation to the ethereum.nix project, specifically this issue. Basically, we want to create an overlay that contains all the packages from the ethereum.nix flake that match the system of the hostPlatform. Here is the current implementation (which works before NixOS 24.05). If you have ideas for a better or more reliable approach, I'd love to hear them.

https://github.com/nix-community/ethereum.nix/blob/db24350ee84b877c6d07c252bea4562af95a828c/pkgs/default.nix#L8-L11

> why would you pull system from prev instead of final?

Can you? When I tried to use final instead of prev in my repro steps, I always ran into infinite recursion.
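
One likely explanation for the recursion is that the repro overlay consults the platform before it returns its attribute set, so the set of attribute names depends on `final`. The toy model below illustrates the difference. It is a sketch only: `fix` and `extends` are simplified stand-ins for the nixpkgs `lib` functions, and `base`, `system`, and both overlays are hypothetical names that do not come from ethereum.nix.

```nix
let
  # Simplified stand-ins for lib.fix and lib.extends.
  fix = f: let x = f x; in x;
  extends = overlay: base: final:
    let prev = base final; in prev // overlay final prev;

  # Toy "package set"; `system` plays the role of stdenv.hostPlatform.system.
  base = _final: { system = "x86_64-linux"; };

  # Fine: `final` is referenced only inside an attribute value, which stays lazy.
  lazyOverlay = final: _prev: {
    ethereum-nix = "packages for ${final.system}";
  };

  # Infinite recursion: the overlay must read `final` before it can return
  # its attribute set, but `final` is built from that very attribute set.
  strictOverlay = final: _prev:
    builtins.trace final.system { };
in
{
  ok = (fix (extends lazyOverlay base)).ethereum-nix;
  # Evaluating the following line instead would fail with infinite recursion:
  # bad = (fix (extends strictOverlay base)).system;
}
```

Under this model, an overlay can use `final` as long as the per-system packages are nested under a fixed attribute name rather than merged into the top level.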

scottbot95 commented 3 months ago

That said, while there may be a workaround in this particular use case, I am much more concerned about the bigger picture here. It would seem (unless I'm misunderstanding something) that NixOS is for some reason changing the hostPlatform in the middle of evaluation, and that this is an undocumented change introduced in NixOS 24.05 (I can't find any reference to it in the release notes).

dzmitry-lahoda commented 2 months ago

I had exactly this error when running a NixOS test on "nothing" (the test is just an "echo 42" service). I ran nix flake update and the error was gone.

siddharth-narayan commented 3 weeks ago

I seem to be having a very similar problem with my system (x86_64). I'm trying to use a custom openssl, so a lot of packages have to be rebuilt. I ran into this issue because the package ibm-sw-tpm2 is being built as i686-linux and is throwing the error "Ossl library is using a different radix". It seems like many packages are being built as i686, because the dependencies of ibm-sw-tpm2 are also being built that way. Currently I can't rebuild my system without removing the custom openssl.

error: builder for '/nix/store/j3q651dsnqkkirar8926fz256qa3940l-ibm-sw-tpm2-1682.drv' failed with exit code 2;
       last 10 log lines:
       > TpmToOsslMath.h:107:9: note: '#pragma message: The value of SIXTY_FOUR_BIT: SIXTY_FOUR_BIT'
       >   107 | #pragma message "The value of SIXTY_FOUR_BIT: " XSTR(SIXTY_FOUR_BIT)
       >       |         ^~~~~~~
       > TpmToOsslMath.h:108:9: note: '#pragma message: The value of SIXTY_FOUR_BIT_LONG: '
       >   108 | #pragma message "The value of SIXTY_FOUR_BIT_LONG: " XSTR(SIXTY_FOUR_BIT_LONG)
       >       |         ^~~~~~~
       > TpmToOsslMath.h:113:5: error: #error Ossl library is using different radix
       >   113 | #   error Ossl library is using different radix
       >       |     ^~~~~
       > make: *** [makefile:89: ACTCommands.o] Error 1
       For full logs, run 'nix log /nix/store/j3q651dsnqkkirar8926fz256qa3940l-ibm-sw-tpm2-1682.drv'.
error: 1 dependencies of derivation '/nix/store/1gdcnh6cj8w1k5y32ai14xgdzicn4z4f-tpm2-tss-4.1.3.drv' failed to build
error: 1 dependencies of derivation '/nix/store/b9j619q08bxgsda0hj0z82z1klbk6fni-systemd-256.4.drv' failed to build
error: 1 dependencies of derivation '/nix/store/rbhlz27s25x124yx8ljgbgpiml15a9m2-at-spi2-core-2.52.0.drv' failed to build
error: 1 dependencies of derivation '/nix/store/35wls0ms0rim93vm09qf713apbilxpcs-cups-2.4.10.drv' failed to build
error: 1 dependencies of derivation '/nix/store/vwmlwzh1jrlfcsjqw9d16hyrsagr7xvy-gamemode-1.8.2.drv' failed to build
error: 1 dependencies of derivation '/nix/store/fmmkqsw5jxjhxaj6m1jlxzjbnvyxsl54-pipewire-1.2.3.drv' failed to build
error: 1 dependencies of derivation '/nix/store/5rm1wk5z10d86qh82h0s9ai66kwwxwyp-etc-alsa-conf.d-49-pipewire-modules.conf.drv' failed to build
error (ignored): error: cannot unlink '/tmp/nix-build-git-minimal-2.46.0.drv-2/build/git-2.46.0': Directory not empty
error (ignored): error: cannot unlink '/tmp/nix-build-edk2-202408-unvendored-src.drv-3/build/source/SecurityPkg/DeviceSecurity/SpdmLib/libspdm/os_stub/openssllib/openssl/fuzz/corpora/asn1': Directory not empty
error: 1 dependencies of derivation '/nix/store/b8iihwswa5afhq01hhx0v8bijjdncv18-gamemode-1.8.2.drv' failed to build
error: 1 dependencies of derivation '/nix/store/g27gs8gcpwr6781khcig8jybra777la7-gtk+3-3.24.43.drv' failed to build
error: 1 dependencies of derivation '/nix/store/4x34kv4snya4g50i7s1klvazwkssg8jp-steam-run-usr-multi.drv' failed to build
error: 1 dependencies of derivation '/nix/store/zs3227b13yffp9fn4r4m0plc8ln0cv4m-steam-usr-multi.drv' failed to build
error: 1 dependencies of derivation '/nix/store/drnmm13klh9is8qbfs12h6dypll92119-etc.drv' failed to build
error (ignored): error: cannot unlink '/tmp/nix-build-cdrtools-3.02a09.drv-4/build/cdrtools-3.02/RULES': Directory not empty
error (ignored): error: cannot unlink '/tmp/nix-build-coreutils-full-9.5.drv-6/build/coreutils-9.5': Directory not empty
error (ignored): error: cannot unlink '/tmp/nix-build-ghc-9.6.6.drv-6/build/ghc-9.6.6-source/libraries': Directory not empty
error (ignored): error: cannot unlink '/tmp/nix-build-glslang-14.3.0.drv-3/build': Directory not empty
error (ignored): error: cannot unlink '/tmp/nix-build-boost-1.81.0.drv-6/build/boost_1_81_0/boost/fusion/container': Directory not empty
error (ignored): error: cannot unlink '/tmp/nix-build-dbus-cplusplus-0.9.0.drv-1/build/libdbus-c++-0.9.0': Directory not empty
error: 1 dependencies of derivation '/nix/store/xmdshb98l0gyml0aiarbjiliqmnvi48w-libdecor-0.2.2.drv' failed to build
error: 1 dependencies of derivation '/nix/store/ljr7xvsjvqj2dsrf0x5ml34d3dxs9k7h-steam-fhs.drv' failed to build
error: 1 dependencies of derivation '/nix/store/h3h5a4zzkjc6zf74nbf6fpxg4xr1pjdd-steam-run-fhs.drv' failed to build
error: 1 dependencies of derivation '/nix/store/7i43hwg9q8f1k8xfkgjd7m9p3n6lds1a-nixos-system-jupiter-24.11.20240919.c04d565.drv' failed to build

For some reason, /nix/store/fmmkqsw5jxjhxaj6m1jlxzjbnvyxsl54-pipewire-1.2.3.drv is the last to be i686-linux, and then /nix/store/b8iihwswa5afhq01hhx0v8bijjdncv18-gamemode-1.8.2.drv is x86_64-linux as usual.

siddharth-narayan commented 2 weeks ago

Git bisect leads me to believe that the first "bad" commit causing this issue is https://github.com/NixOS/nixpkgs/commit/0863f6d2da0be5e8d7e3fb316ef58cdfe145220b. Checking over it quickly, I don't see anything in there that would cause the issue, although I'm not an expert at Nix. I can reproduce the issue on that commit, but on the previous commit, https://github.com/NixOS/nixpkgs/commit/bf6f0d3cf4e1ed8c146b4e028b80be26383f5036, there doesn't seem to be any problem.

siddharth-narayan commented 2 weeks ago

@tejing1 if you're not busy, would you be able to take a look? After looking over the commit some more, I'm still not sure what causes the issue.

tejing1 commented 2 weeks ago

stub-ld on an x86_64 system builds a stub loader for i686 as well. That means it depends on the i686 versions of stdenv, as well as unixtools.xxd. If you use an overlay that alters those in such a way that they don't work on i686, then failure is expected.

Rather than packages being built for the "wrong" system, they're being built for multiple systems, and the i686 case is failing, I believe.

To verify, you can try setting environment.ldso32 = null;. That will lose you the 32-bit stub, but also avoid building any i686 versions of packages as a result of stub-ld. Take note, however, that stub-ld isn't the only thing in NixOS that will depend on i686 packages. Steam, other FHS packages, and wine, I think, do so as well.
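
As a NixOS module fragment, that check might look like the sketch below. Only the `environment.ldso32` setting comes from the suggestion above; the surrounding module boilerplate is assumed.

```nix
{ ... }:
{
  # Diagnostic only: drop the 32-bit stub loader so that stub-ld no longer
  # pulls in i686 builds of stdenv and its dependencies.
  # Other things (Steam, FHS environments, wine) may still need i686 packages.
  environment.ldso32 = null;
}
```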

siddharth-narayan commented 2 weeks ago

In my case the package that's being built as i686 is ibm-sw-tpm2 (alongside many other packages). To me it seems like that isn't a package that should be built because of stub-ld, unless it rebuilds almost every package? On my current system, ibm-sw-tpm2 is indeed using x86_64 (and doesn't have any i686 derivations), and I'm on unstable. The only meaningful thing that changed was adding

  nixpkgs.config.packageOverrides = pkgs: rec {
    # This openssl only has x86_64? It shouldn't matter, right?
    openssl = inputs.openssl-quantum.packages.x86_64-linux.default;
  };

I used git bisect with the reproduction steps above, instead of attempting to build my own system and every package in it, because that would take forever, so it's possible my issue is different from this one. Also, environment.ldso32 = null; is set by default

tejing1 commented 2 weeks ago

curl depends on openssl. fetchers depend on curl. So basically everything in nixpkgs has a build-time dependency on openssl.

I would assume ibm-sw-tpm2 is a dependency of openssl-quantum, thus explaining why your system is trying to build it.

Furthermore, that overlay applies to the x86_64 packages, but also the i686 packages. However you're injecting an x86_64 package into it in both cases. When you throw a 64-bit openssl into a 32-bit build process, obviously things are going to go wrong.
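
One way to avoid that mismatch would be to pick the package by the platform of the package set being overridden, instead of hardcoding x86_64-linux. The following is a sketch under assumptions: it presumes `inputs` is passed to the module (for example via `specialArgs`), and it is not known whether the openssl-quantum flake provides an i686-linux package, hence the fallback to the stock openssl.

```nix
{ inputs, ... }:
{
  nixpkgs.config.packageOverrides = pkgs:
    let
      # Platform of the package set currently being overridden:
      # "x86_64-linux" for the main set, "i686-linux" for the 32-bit one.
      system = pkgs.stdenv.hostPlatform.system;
    in
    {
      # Use the flake's package when it exists for this platform,
      # otherwise keep the stock openssl.
      openssl = inputs.openssl-quantum.packages.${system}.default or pkgs.openssl;
    };
}
```

With the fallback in place, the i686 package set keeps the stock 32-bit openssl, so 32-bit builds never receive a 64-bit library.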

siddharth-narayan commented 2 weeks ago

I understood what the error was and why it was happening; mixing i686 and x86_64 obviously doesn't go well. What I didn't understand was why ibm-sw-tpm2 was being built as i686. But when I went to double-check, there is indeed another derivation for it that is i686. So although I don't know why that's there, I think you're right that at least my problem doesn't have anything to do with stub-ld. Thanks for the help!

Now I'm wondering if this is even a real issue with nixos or maybe my problem is separate.