Open bgamari opened 11 months ago
My suspicion here is that this is either a Cabal or Haddock bug, although I'm not yet sure which.
Of note: something assumes the existence of dynamic files, while there are none. Not quite sure why GHC would try to load dynamic files.
Trying to read
/nix/store/s13v3xsi60z627ic821fm70mlw43a3za-x86_64-unknown-linux-musl-ghc-9.8.1/lib/x86_64-unknown-linux-musl-ghc-9.8.1/lib/../lib/x86_64-linux-ghc-9.8.1/array-0.5.6.0-inplace/Data/Array.dyn_hi
however
/nix/store/s13v3xsi60z627ic821fm70mlw43a3za-x86_64-unknown-linux-musl-ghc-9.8.1/lib/x86_64-unknown-linux-musl-ghc-9.8.1/lib/../lib/x86_64-linux-ghc-9.8.1/array-0.5.6.0-inplace/Data
total 27K
dr-xr-xr-x 3 root root 5 Jan 1 1970 .
dr-xr-xr-x 3 root root 7 Jan 1 1970 ..
dr-xr-xr-x 6 root root 22 Jan 1 1970 Array
-r--r--r-- 2 root root 2.9K Jan 1 1970 Array.hi
-r--r--r-- 2 root root 2.9K Jan 1 1970 Array.p_hi
Maybe someone has an idea where the dyn_hi load comes from.
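For anyone who wants to check other package directories for the same mismatch, here is a minimal sketch that counts interface files per flavour (.hi, .p_hi, .dyn_hi) in one directory. The function name count_interfaces is made up for illustration, and it assumes a find that supports -maxdepth (GNU and BSD find both do).

```shell
# Count interface files per flavour in a single module directory.
# Usage: count_interfaces DIR
# (count_interfaces is a hypothetical helper, not part of GHC or nixpkgs.)
count_interfaces() {
  dir=$1
  for suffix in hi p_hi dyn_hi; do
    # find prints one line per matching file; wc -l counts those lines
    n=$(find "$dir" -maxdepth 1 -type f -name "*.$suffix" | wc -l)
    # $((n)) strips the padding some wc implementations add
    printf '%s: %s\n' "$suffix" "$((n))"
  done
}
```

Run against the Data directory listed above, this should report zero dyn_hi files, which matches the failed load.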
ping: @NixOS/static
Good to know that the hadrian regression from #208959 has been fixed, so we can at least build GHC now.
My diagnosis is the following:
haddock executable, presumably because docs are disabled (so maybe building core lib docs and building the haddock executable is still the same flag?)
haskellPackages.mkDerivation, since I dropped the enableHaddockProgram flag when I ported the GHC expression to Hadrian, presumably either because cross was completely broken initially or I assumed it got fixed.
haddock in scope, the one from the build->build compiler which doesn't work of course.
I can fix that by just disabling haddock in the same way as we do for GHC < 9.6. I'll try doing that later.
@bgamari @angerman The question is of course, and you can answer that better than me, has anything changed w.r.t. haddock and cross with Hadrian?
Thanks @sternenseemann! Your hypothesis does sound plausible.
Recently we did rework Haddock to take documentation from Haskell interface (.hi) files. I can't help but wonder whether this logic may be culpable: https://gitlab.haskell.org/ghc/haddock/-/blob/b0b0e0366457c9aefebcc94df74e5de4d00e17b7/haddock-api/src/Haddock.hs#L170. This was apparently introduced due to https://github.com/haskell/haddock/issues/256.
Seems plausible. I'm personally not too fussed that this change means that haddock is not “retargetable”, i.e. you always need to use the precise haddock bundled with the GHC you are using to compile the documented code. In fact, we probably should explicitly tell Cabal which haddock to use, so this kind of issue doesn't happen or is easier to diagnose.
I'll need to investigate, though, under which circumstances we can build haddock with hadrian now.
The problem seems to be that the haddock package is only built using the stage1 compiler (so as part of stage2), which we necessarily never reach in the case of cross compilation. Presumably we can work around this in UserSettings somehow (although IME you are quite limited if your solution is to be maintainable), but I feel like this is a genuine gap and there ought to be a better way to build a cross-compiler with hadrian…
I've just skimmed the code, but why do we do this:
  -- Inject dynamic-too into ghc options if the ghc we are using was built with
  -- dynamic linking
  flags'' <- ghc flags $ do
    df <- getDynFlags
    case lookup "GHC Dynamic" (compilerInfo df) of
      Just "YES" -> return $ Flag_OptGhc "-dynamic-too" : flags
      _ -> return flags
What's the rationale for adding -dynamic-too here? I can somewhat extract the rationale from https://github.com/haskell/haddock/issues/256, but the comment above this is rather poor. Also, there is no option one can pass to haddock to prevent this automagic.
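To see which branch of that lookup a given compiler would take, one can inspect the same key from the command line. As far as I know, `ghc --info` prints the compilerInfo list as lines of the form ,("Key","Value"); the sketch below relies on that format, and the helper name ghc_dynamic is made up for illustration.

```shell
# Read `ghc --info` output on stdin and print the value of the
# "GHC Dynamic" entry (expected to be YES or NO).
# (ghc_dynamic is a hypothetical helper; it only parses text and
# does not call ghc itself.)
ghc_dynamic() {
  sed -n 's/.*("GHC Dynamic","\([^"]*\)").*/\1/p'
}

# Intended use, assuming the haddock in scope was built against this ghc:
#   ghc --info | ghc_dynamic
```

If this prints YES for the build->build compiler while the target package database only ships .hi and .p_hi files, that would be consistent with the dyn_hi load failing as described above.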
I guess the proper thing here is to just disable haddocks for cross, and rely on native compilers haddocks.
Do you mean the native compiler's haddock executable or re-using the documentation built natively? The former currently happens (unintentionally) and seems to be the source of the problem…
@sternenseemann
re-using the documentation built natively
This :D
It's still broken: https://github.com/domenkozar/nixpkgs-static-repo
Even easier reproducer:
nix-build -A pkgsStatic.haskell.packages.ghc98.th-orphans
error: builder for '/nix/store/dibiy3qjbg2l5ahlqf28axfqz5xw91xn-th-orphans-static-x86_64-unknown-linux-musl-0.13.14.drv' failed with exit code 1;
last 10 log lines:
> /nix/store/20rsi77ny2i4i1rbd63h4392a245j5dz-gnutar-1.35/bin/tar
> No uhc found
> Running phase: buildPhase
> Preprocessing library for th-orphans-0.13.14..
> Building library for th-orphans-0.13.14..
> [1 of 2] Compiling Language.Haskell.TH.Instances.Internal ( src/Language/Haskell/TH/Instances/Internal.hs, dist/build/Language/Haskell/TH/Instances/Internal.o )
> [2 of 2] Compiling Language.Haskell.TH.Instances ( src/Language/Haskell/TH/Instances.hs, dist/build/Language/Haskell/TH/Instances.o )
>
> <no location info>: error:
> Couldn't find a target code interpreter. Try with -fexternal-interpreter
For full logs, run 'nix log /nix/store/dibiy3qjbg2l5ahlqf28axfqz5xw91xn-th-orphans-static-x86_64-unknown-linux-musl-0.13.14.drv'.
That suggests that the GHC was not built as a stage2 compiler, or some of the new cross target logic prohibits native codegen now as well.
Yes, we are only building stage 1 here. As it turns out, for GHC < 9.6 we used to build the stage 2 compiler in this case, so seems like a detail I missed when porting the expression to hadrian.
Unfortunately, also building Stage 2 doesn't fix the problem according to my testing, maybe @domenkozar can confirm on #283773.
new cross target logic prohibits native codegen now as well.
Does "now" refer to hadrian, ghc, or nixpkgs?
I looked into this a little bit and the problem seems to be that hadrian-based builds don't build ghc-iserv anymore, which leads to "Couldn't find a target code interpreter. Try with -fexternal-interpreter". GHC 9.4 without hadrian was still building it and thus succeeds.
I think the logic in hadrian is kind of the same as before 853c1214855e07fdb44655868532b3b6245865d4 - the full platform is compared and not something like "can execute".
Just sidestep the whole braindead install logic from hadrian. It's so bad...
Just build and install the compiler with cp.
The haskell.nix builder for GHC works around this by sidestepping hadrian's build and install process and doing it a bit more explicitly.
I'm not even sure Hadrian can (or should) be fixed. The proper solution seems to be to just bin it outright and build GHC with cabal only.
I can confirm it works with haskell.nix: https://github.com/domenkozar/nixpkgs-static-repo/tree/haskell.nix
As pointed out by @sternenseemann in https://github.com/NixOS/nixpkgs/pull/287794#issuecomment-1937085851, the missing ghc-iserv is not exactly the reason for this error message, but as I mentioned in https://github.com/NixOS/nixpkgs/pull/287794#issuecomment-1937089606 probably closely related:
I think iserv can't be built, because it needs a GHCi built with -internal-interpreter, which is not built via hadrian
If we build with -finternal-interpreter, then maybe the "Couldn't find a target code interpreter." would be solved. I am referring to this part in the hadrian source:
https://gitlab.haskell.org/ghc/ghc/-/blob/master/hadrian/src/Settings/Packages.hs#L127-156
A workaround mentioned there could be to build the static/cross compiler with the same version of GHC as the bootstrap compiler:
-- The workaround we use is to check if the bootstrap compiler has
-- the same version as the one we are building. In this case we can
-- avoid the first step above and directly build with
-- `-finternal-interpreter`.
FTR, I tried that. To do so, I had to patch the hadrian source to allow a newer Cabal first. I then changed the bootPkgs to use ghc981 when building cross:
bootPkgs =
  if stdenv.hostPlatform != stdenv.targetPlatform then
    buildPackages.haskell.packages.ghc981
  else
    packages.ghc947;
The build then fails with a lot of this:
ghc/Main.hs:18:1: error: [GHC-53693]
Something is amiss; requested module ghc-9.8.1:GHC differs from name found in the interface file ghc:GHC (if these names look the same, try again with -dppr-debug)
I have no idea what that means and haven't gone further, yet. Just putting this here in case somebody has an idea.
You can explicitly build iserv using hadrian. It's not a default target for some reason. And then sidestep the broken install phase of hadrian. You don't need any of this anyway with nix as you a priori know all your install locations. So you can replace the convoluted install phase with a simple cp.
EDIT: just use the same logic we have in Haskell.nix, it should be translatable to the nixpkgs GHC builder: https://github.com/NixOS/nixpkgs/issues/275304#issuecomment-1915762455, both builders are still fairly similar.
Just confirmed this is still a problem with GHC 9.10.1.
The build then fails with a lot of this:
This is a hadrian bug, there's apparently a patch on GHC master (9.10?). That being said, self bootstrapping isn't exactly tested upstream and has become annoying with hadrian due to the strict bounds.
Just confirmed this is still a problem with GHC 9.10.1.
To be precise: I tested pkgsStatic.haskell.packages.ghc9101.th-orphans, which still fails with the above external interpreter error message.
I did not test, at least yet, the self-bootstrapping approach I tried earlier with GHC 9.8. That might still be worth a try.
Good to know that the hadrian regression from #208959 has been fixed, so we can at least build GHC now.
Where/when was it fixed? I'd like to patch the ghc I'm using so I'd need to know which commit fixed this.
EDIT: Mistake, ignore.
With #208959 fixed, I now get the same external-interpreter error when building th-orphans for both GHC 9.6 and 9.8. This makes sense, because the error stems from the hadrian build, which both use. So this issue really applies to both now.
Edit: And as mentioned above for GHC 9.10 as well. So basically for GHC 9.6 up.
I was able to successfully bootstrap GHC 9.10.1 for pkgsStatic from GHC 9.10.1 itself this time.
It still doesn't solve the problem at hand, though:
<no location info>: error:
Couldn't find a target code interpreter. Try with -fexternal-interpreter
All still the same.
The original case in this issue https://github.com/NixOS/nixpkgs/issues/275304#issue-2047720447
nix build nixpkgs#legacyPackages.x86_64-linux.pkgsStatic.haskell.packages.ghc98.Diff
succeeds for me, now.
Agree with https://github.com/NixOS/nixpkgs/issues/275304#issuecomment-2120806285;
nix build nixpkgs#legacyPackages.x86_64-linux.pkgsStatic.haskell.packages.ghc9101.th-orphans
produces the error.
Additional example:
nix build nixpkgs#legacyPackages.x86_64-linux.pkgsCross.aarch64-multiplatform.haskell.packages.ghc9101.th-orphans
produces the error
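Since the same failure is being checked across several package sets and GHC versions, a small helper that prints the reproducer commands may save some typing. This is only a sketch: the function name repro_commands is made up, the system is hard-coded to x86_64-linux as in the commands above, and it only prints the commands rather than running them.

```shell
# Print one `nix build` invocation per GHC attribute for a given package set.
# Usage: repro_commands PKGSET GHC_ATTR...
# (repro_commands is a hypothetical helper; th-orphans is the package
# used as the reproducer in this thread.)
repro_commands() {
  pkgset=$1
  shift
  for ghc in "$@"; do
    printf 'nix build nixpkgs#legacyPackages.x86_64-linux.%s.haskell.packages.%s.th-orphans\n' \
      "$pkgset" "$ghc"
  done
}
```

For example, `repro_commands pkgsStatic ghc98 ghc9101` prints the two pkgsStatic commands, and the output can be piped to sh to actually run them.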
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/build-aarch64-docker-image-on-amd64-machine/55052/5
These also produce the error:
9.6:
nix build nixpkgs/haskell-updates#legacyPackages.x86_64-linux.pkgsStatic.haskell.packages.ghc96.th-orphans_0_13_15
9.8:
nix build nixpkgs/haskell-updates#legacyPackages.x86_64-linux.pkgsStatic.haskell.packages.ghc98.th-orphans_0_13_15
@bgamari, can we change the title to "GHC 9.6, 9.8, 9.10 TemplateHaskell doesn't work in pkgsStatic, pkgsCross"?
Describe the bug
Haskell packages in nixpkgs.pkgsStatic.haskell.packages.ghc98 are unable to be built.
Steps To Reproduce
Steps to reproduce the behavior:
nix build nixpkgs#legacyPackages.x86_64-linux.pkgsStatic.haskell.packages.ghc98.Diff
Expected behavior
Diff is built, linking against musl.
Observed behavior
Additional context
The problem here appears to manifest during building of Haddock documentation. For instance,
Notify maintainers
@nh2