NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License
17.99k stars 14k forks source link

haskell packages with separate bin outputs fail due to a reference cycle on aarch64-darwin #140774

Closed walkah closed 1 year ago

walkah commented 3 years ago

Describe the bug

niv fails to build on aarch64-darwin with the following errors:

Linking dist/build/niv/niv ...
running tests
Running 1 test suites...
Test suite unit: RUNNING...
Test suite unit: PASS
Test suite logged to: dist/test/niv-0.2.19-unit.log
1 of 1 test suites (1 of 1 test cases) passed.
haddockPhase
installing
Installing library in /nix/store/v5p7lliij8f10c920sa8lwnqv8di6007-niv-0.2.19/lib/ghc-8.10.7/aarch64-osx-ghc-8.10.7/niv-0.2.19-HYXnAgZhF2cLmqAJFPpkgw
Installing executable niv in /nix/store/abawbski981061gbsnq0v5374gvsrbq7-niv-0.2.19-bin/bin
Warning: The directory
/nix/store/abawbski981061gbsnq0v5374gvsrbq7-niv-0.2.19-bin/bin is not in the
system search path.
/nix/store/mp5xsfnvdbdyqjmcipyks4nph534jc3q-cctools-binutils-darwin-949.0.1/bin/strip: changes being made to the file will invalidate the code signature in: /nix/store/abawbski981061gbsnq0v5374gvsrbq7-niv-0.2.19-bin/bin/niv
Registering library for niv-0.2.19..
post-installation fixup
strip is /nix/store/vf55lp3k5xgx1vlkysb7pbw76f1wh21l-clang-wrapper-11.1.0/bin/strip
stripping (with command strip and flags -S) in /nix/store/v5p7lliij8f10c920sa8lwnqv8di6007-niv-0.2.19/lib
patching script interpreter paths in /nix/store/v5p7lliij8f10c920sa8lwnqv8di6007-niv-0.2.19
strip is /nix/store/vf55lp3k5xgx1vlkysb7pbw76f1wh21l-clang-wrapper-11.1.0/bin/strip
patching script interpreter paths in /nix/store/hkapz2a124mmzc3x1mw2dqvds3bhkfg4-niv-0.2.19-data
strip is /nix/store/vf55lp3k5xgx1vlkysb7pbw76f1wh21l-clang-wrapper-11.1.0/bin/strip
stripping (with command strip and flags -S) in /nix/store/abawbski981061gbsnq0v5374gvsrbq7-niv-0.2.19-bin/bin
patching script interpreter paths in /nix/store/abawbski981061gbsnq0v5374gvsrbq7-niv-0.2.19-bin
cycle detected in the references of '/nix/store/abawbski981061gbsnq0v5374gvsrbq7-niv-0.2.19-bin' from '/nix/store/v5p7lliij8f10c920sa8lwnqv8di6007-niv-0.2.19'
error: build of '/nix/store/k1jbc0389f58cwwy9xvi9r2xi5fmqdc2-niv-0.2.19.drv' failed

Recent hydra builds also fail: https://hydra.nixos.org/job/nixpkgs/nixpkgs-unstable-aarch64-darwin/niv.aarch64-darwin

Steps To Reproduce

Steps to reproduce the behavior:

  1. nix-shell -p niv

Expected behavior

niv should install and be available.

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

[user@system:~]$ nix-shell -p nix-info --run "nix-info -m"
 - system: `"aarch64-darwin"`
 - host os: `Darwin 20.6.0, macOS 11.6`
 - multi-user?: `yes`
 - sandbox: `no`
 - version: `nix-env (Nix) 2.3.15`
 - channels(walkah): `"darwin, home-manager"`
 - channels(root): `"nixpkgs-21.11pre320922.ee084c02040"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixpkgs`

Maintainer information:

# a list of nixpkgs attributes affected by the problem
attribute:
# a list of nixos modules affected by the problem
module:
angerman commented 2 years ago

To reiterate my point: relying on magic/heuristics in the tail end of the pipeline to paper over sloppiness somewhere in the front and middle part of the pipeline is just bad engineering.

sternenseemann commented 2 years ago

We package GHC and last time I checked it is far from our responsibility to “Fix the broken code from the get go would be”. We merely rely on behavior GHC plus whatever linker exhibit for all decently supported targets.

hexagonal-sun commented 2 years ago

I've now got a patch that removes the hard-coded paths and just relies on environment variables for finding the symbols exposed by Paths_. As a compromise, we could use that without any modificaiton to the generic builder and just accept the fact that Haskell packages that use Data-Files in their cabal files will be broken (which AFAICS isn't that many), but at least we can get packages built. Note: this approach does allow ghcid to build and function on my machine.

sternenseemann commented 2 years ago

Honestly then it's still a better solution to disable separate outputs completely for that target.

hexagonal-sun commented 2 years ago

Honestly then it's still a better solution to disable separate outputs completely for that target.

Agreed.

Profpatsch commented 2 years ago

I'm fairly certain we should add a flag to cabal to not generate Path's modules if not explicitly requested. It's an absolute misfeature, and one of the few places that completely break relocatability.

I’ve been wanting to patch cabal so that it only generates the version symbol in Paths by default, and then have overrides in nixpkgs that we can selectively enable for packages that also need to access the symbols which expose paths like libdir. The we won’t run into these accidentally in the future.

Maybe somebody else wants to do that?

hexagonal-sun commented 2 years ago

I was working on that approach, similar to what @angerman suggested. I'm removing bits of the Path_ module that hard-code paths and get these values via env-vars. Then use wrappers around any binaries that are outputted to set the appropriate environment.

hexagonal-sun commented 2 years ago

@Profpatsch Although, one major downside is that we'd have to patch every new version of GHC to remove the bits of the Path_ module generation that we don't want.

Profpatsch commented 2 years ago

iirc pandoc is one of the packages depending heavily on data-dirs, and we certainly wouldn’t want to break that.

So one way or another, we are going to have to patch cabal to make Paths_* generation configurable (defaulting to off in nixpkgs and to on in cabal itself).

Profpatsch commented 2 years ago

@Profpatsch Although, one major downside is that we'd have to patch every new version of GHC to remove the bits of the Path_ module generation that we don't want.

Pretty sure that module hasn’t changed in a few years.

angerman commented 2 years ago

We can always try to upstream stuff. Getting stuff in GHC should be fairly easy, getting stuff into Cabal might require a bit more cajoling.

I still think Version should be set via cabal_macros, e.g. made available as some VERSION value. We already have VERSION values in the cabal macros. It might already be in there. So just use CPP + the version value from the cabalmacros, instead of reaching into the Paths module. Again Paths_ module is wrong on so many levels.

angerman commented 2 years ago

Here is an example application that does exactly that without any of the Paths_ junk.

https://github.com/zw3rk/cabal-version-example

angerman commented 2 years ago

Just for completeness sake: this is what's in the cabal_macros.h

/* DO NOT EDIT: This file is automatically generated by Cabal */

/* package cabal-version-example-0.1.0.0 */
#ifndef VERSION_cabal_version_example
#define VERSION_cabal_version_example "0.1.0.0"
#endif /* VERSION_cabal_version_example */
#ifndef MIN_VERSION_cabal_version_example
#define MIN_VERSION_cabal_version_example(major1,major2,minor) (\
  (major1) <  0 || \
  (major1) == 0 && (major2) <  1 || \
  (major1) == 0 && (major2) == 1 && (minor) <= 0)
#endif /* MIN_VERSION_cabal_version_example */
/* package base-4.14.3.0 */
#ifndef VERSION_base
#define VERSION_base "4.14.3.0"
#endif /* VERSION_base */
#ifndef MIN_VERSION_base
#define MIN_VERSION_base(major1,major2,minor) (\
  (major1) <  4 || \
  (major1) == 4 && (major2) <  14 || \
  (major1) == 4 && (major2) == 14 && (minor) <= 3)
#endif /* MIN_VERSION_base */

/* tool gcc-4.2.1 */
#ifndef TOOL_VERSION_gcc
#define TOOL_VERSION_gcc "4.2.1"
#endif /* TOOL_VERSION_gcc */
#ifndef MIN_TOOL_VERSION_gcc
#define MIN_TOOL_VERSION_gcc(major1,major2,minor) (\
  (major1) <  4 || \
  (major1) == 4 && (major2) <  2 || \
  (major1) == 4 && (major2) == 2 && (minor) <= 1)
#endif /* MIN_TOOL_VERSION_gcc */
/* tool ghc-8.10.7 */
#ifndef TOOL_VERSION_ghc
#define TOOL_VERSION_ghc "8.10.7"
#endif /* TOOL_VERSION_ghc */
#ifndef MIN_TOOL_VERSION_ghc
#define MIN_TOOL_VERSION_ghc(major1,major2,minor) (\
  (major1) <  8 || \
  (major1) == 8 && (major2) <  10 || \
  (major1) == 8 && (major2) == 10 && (minor) <= 7)
#endif /* MIN_TOOL_VERSION_ghc */
/* tool ghc-pkg-8.10.7 */
#ifndef TOOL_VERSION_ghc_pkg
#define TOOL_VERSION_ghc_pkg "8.10.7"
#endif /* TOOL_VERSION_ghc_pkg */
#ifndef MIN_TOOL_VERSION_ghc_pkg
#define MIN_TOOL_VERSION_ghc_pkg(major1,major2,minor) (\
  (major1) <  8 || \
  (major1) == 8 && (major2) <  10 || \
  (major1) == 8 && (major2) == 10 && (minor) <= 7)
#endif /* MIN_TOOL_VERSION_ghc_pkg */
/* tool haddock-2.24.2 */
#ifndef TOOL_VERSION_haddock
#define TOOL_VERSION_haddock "2.24.2"
#endif /* TOOL_VERSION_haddock */
#ifndef MIN_TOOL_VERSION_haddock
#define MIN_TOOL_VERSION_haddock(major1,major2,minor) (\
  (major1) <  2 || \
  (major1) == 2 && (major2) <  24 || \
  (major1) == 2 && (major2) == 24 && (minor) <= 2)
#endif /* MIN_TOOL_VERSION_haddock */
/* tool hpc-0.68 */
#ifndef TOOL_VERSION_hpc
#define TOOL_VERSION_hpc "0.68"
#endif /* TOOL_VERSION_hpc */
#ifndef MIN_TOOL_VERSION_hpc
#define MIN_TOOL_VERSION_hpc(major1,major2,minor) (\
  (major1) <  0 || \
  (major1) == 0 && (major2) <  68 || \
  (major1) == 0 && (major2) == 68 && (minor) <= 0)
#endif /* MIN_TOOL_VERSION_hpc */
/* tool hsc2hs-0.68.7 */
#ifndef TOOL_VERSION_hsc2hs
#define TOOL_VERSION_hsc2hs "0.68.7"
#endif /* TOOL_VERSION_hsc2hs */
#ifndef MIN_TOOL_VERSION_hsc2hs
#define MIN_TOOL_VERSION_hsc2hs(major1,major2,minor) (\
  (major1) <  0 || \
  (major1) == 0 && (major2) <  68 || \
  (major1) == 0 && (major2) == 68 && (minor) <= 7)
#endif /* MIN_TOOL_VERSION_hsc2hs */
/* tool runghc-8.10.7 */
#ifndef TOOL_VERSION_runghc
#define TOOL_VERSION_runghc "8.10.7"
#endif /* TOOL_VERSION_runghc */
#ifndef MIN_TOOL_VERSION_runghc
#define MIN_TOOL_VERSION_runghc(major1,major2,minor) (\
  (major1) <  8 || \
  (major1) == 8 && (major2) <  10 || \
  (major1) == 8 && (major2) == 10 && (minor) <= 7)
#endif /* MIN_TOOL_VERSION_runghc */

#ifndef CURRENT_COMPONENT_ID
#define CURRENT_COMPONENT_ID "cabal-version-example-0.1.0.0-inplace-cabal-version-example"
#endif /* CURRENT_COMPONENT_ID */
#ifndef CURRENT_PACKAGE_VERSION
#define CURRENT_PACKAGE_VERSION "0.1.0.0"
#endif /* CURRENT_PACKAGE_VERSION */
hexagonal-sun commented 2 years ago

Agreed that approach seems the most sensible going forward, however that would be a breaking change for most current packages, right?

What's the best approach to solve this particular issue? I think we've got two options:

sternenseemann commented 2 years ago

Look at eliminating the paths symbols and set them via environment variables (I have a patch that's nearly there for this).

That's an oversimplification. It is irrelevant whether you use env vars or not, you rather need to somehow disable emitting references (to at least out) only if they are not needed. Using the env var approach would actually make it impossible to catch a omitted reference needed by a package at build time.

The funny thing is that, with how things currently are (on non-aarch64-darwin), we get this for free. With the proposed change we need to work really hard to figure everything out. A robust solution would in theory be to convert getDataDir etc. to macros, so unused paths would never leak into an object file, but that's not really possible to do in a non breaking fashion.

hexagonal-sun commented 2 years ago

you rather need to somehow disable emitting references (to at least out) only if they are not needed

Right, I see. Since the env-var approach makes no distinction whether the reference is used or not, it's always set. Whereas we get this 'for free' at the moment with DCE?

sternenseemann commented 2 years ago

Right, I see. Since the env-var approach makes no distinction whether the reference is used or not, it's always set. Whereas we get this 'for free' at the moment with DCE?

Exactly. I think we could make a hack in the Paths_* module generation, using that the only "problematic" reference is out and the prefix is always set to out: We could not emit the get* things for paths that are equal to or below prefix if bindir is not below prefix. In theory, it should not emit any problematic references anymore and fail if there would be a reference cycle with or without DCE.

But this is a hack, really, and specific to nixpkgs.

hexagonal-sun commented 2 years ago

Sounds good. I'll try and implement something in the next few days.

angerman commented 2 years ago

Summary:

What we really want:

Anyway, I'll leave it at that, but maybe that's helpful for someone reading just the end(?) of this thread.

hexagonal-sun commented 2 years ago

I've updated #154046 with @sternenseemann's suggestion. This fixes the build for all the problematic programs listed here.

Profpatsch commented 2 years ago
  • there should be a library that tries to locate that library relative to the executable on platforms that support this (e.g. in the build tree structure that cabal produces)

I don’t think this is an actually viable solution to be honest. Bazel does something similar with their runfiles libraries, which means they have to implement the same library for every single language (just search the bazel repo for runfiles.

And they don’t have the problem of having different file setups (which would break Haskell’s neck since every distro can make arbitrary choices and heuristics suck!

angerman commented 2 years ago

@Profpatsch ghc has relative lookup code in its codebase. You can use ghc without the retarded wrapper files on most architectures as long as you keep bin and lib sibling directories. If you don't, you need the wrapper.

I did not intent a general library for relative lookup. I meant a haskell library, to prevent people from implementing the same functionality in broken ways over and over again and to enforce some coherence.

Gau-thier commented 2 years ago

Regarding the pull requests https://github.com/NixOS/nixpkgs/pull/147964 and https://github.com/NixOS/nixpkgs/pull/154046 this will be fixed soon.

But I am curious to understand how to implement this workaround:

Is there any workaround?

https://github.com/srid/haskell-template/blob/6ecc41c90bc063a4649694533090982bfa5ca47b/flake.nix#L19-L30

I have a basic shell.nix provided on my project, and I don't have any clue on how to do this fix... Any help/explanation would be nice.

Gau-thier commented 2 years ago

Why is this issue closed? I am still having the trouble (on niv for example):

error: cycle detected in build of '/nix/store/gsm34npdhjpzn1kl7drd1k118mmwvqg5-niv-0.2.19.drv' in the references of output 'bin' from output 'out'

sternenseemann commented 2 years ago

Update nixpkgs, according to Hydra this is no longer an issue on master.

sternenseemann commented 2 years ago

As mentioned in the PR with the fix, the patch for Cabal will need to be ported to GHC 9.0.2 (which also uses the LLVM-based aarch64-darwin backend), as that has become the default Haskell compiler on master now.

Anyone interested in implementing and testing the change?

sternenseemann commented 2 years ago

@hexagonal-sun The default version is definitely 9.0.2, you seem to be using the (stale) nixpkgs flake instead of the local checkout you updated.

hexagonal-sun commented 2 years ago

Oops, my bad. Just noticed that! Indeed it is the right version!

hexagonal-sun commented 2 years ago

Good news, I've just had a look and the patch to PathsModule.hs appliles cleanly on 9.0.2 so it looks like we'll be able to reuse the same patch. I'm just building now to confirm.

hexagonal-sun commented 2 years ago

GHC 9.0.2 and ormolu compiled with no problems. PR is in #167895.

srid commented 2 years ago

FTR, this issue reoccurs in pkgs.haskell.packages.ghc923's ghcid, etc.

sternenseemann commented 2 years ago

@srid Yes, the patch by @hexagonal-sun hasn't been applied to 9.2.3. Feel free to try if it applies, test the resulting GHC derivation and create a PR.

sloane-shark commented 2 years ago

I played around with the patch on 9.2.3, and it seems like it will need some reworking. Specifically, this commit changed the implementation in Cabal to use some jinja-like templating library, so at the very least the logic needs to be translated to the appropriate lines of this file. I might take a crack at this over the weekend.

e: wonder if it would be worthwhile reopening this issue or opening a new one to track the ghc923 failure, since it won't be totally trivial

sloane-shark commented 2 years ago

I made an initial attempt to rework the patch here. Unfortunately, I think I messed something up, as building .#haskell.packages.ghc923.ghcid worked, but no bin output was included. Sharing this here in case anyone can clean this up and get it working quicker than I can.

e: hold on, i think i made a mistake, i should have tried .#haskell.packages.ghc923.ghcid.bin, which works! .#haskell.packages.ghc923.ormolu also works. I'll see if I can get a PR up today.

jiegec commented 1 year ago

The issue reoccurs: https://hydra.nixos.org/build/203059066

sternenseemann commented 1 year ago

Yes, we'll have to finish and merge #184041 which requires a completely new patch to Cabal.

srid commented 1 year ago

For those wondering, the workaround meanwhile is to override your haskell package set as follows:

            let
              # Workaround for https://github.com/NixOS/nixpkgs/issues/140774
              fixCyclicReference = drv:
                pkgs.haskell.lib.overrideCabal drv (_: {
                  enableSeparateBinOutput = false;
                });
            in
            self: super: {
              ghcid = fixCyclicReference super.ghcid;
              haskell-language-server = super.haskell-language-server.overrideScope (lself: lsuper: {
                ormolu = fixCyclicReference super.ormolu;
              });
            }

See https://github.com/srid/haskell-template/pull/79

domenkozar commented 1 year ago

On linux this works because dead code elimination takes care of pruning unused constants in Paths_

On macOS it's most likely because of not enough DCE, I've only found https://gitlab.haskell.org/ghc/ghc/-/issues/11040 in ghc about this.

cc @angerman @bgamari for help

balacij commented 1 year ago

Building Haskell Language Server with GHC 9.2.5 runs into the same issue. Does your reference cycle patch still work @srid? Maybe I'm doing something wrong :sweat_smile:

  my-hls = pkgs.haskell-language-server.override {
    supportedGhcVersions = ["925"];
  };

The overlay (imported by my configuration.nix), trying to use @srid's code:

{pkgs, ...}: let
  # Workaround for https://github.com/NixOS/nixpkgs/issues/140774
  fixCyclicReference = drv:
    pkgs.haskell.lib.overrideCabal drv (_: {
      enableSeparateBinOutput = false;
    });
in {
  nixpkgs.overlays = [
    (self: super: {
      self.haskell.packages.ghc925.ghcid = fixCyclicReference super.haskell.packages.ghc925.ghcid;
      self.haskell.packages.ghc925.haskell-language-server = super.haskell.packages.ghc925.haskell-language-server.overrideScope (lself: lsuper: {
        ormolu = fixCyclicReference super.haskell.packages.ghc925.ormolu;
        cabal-fmt = fixCyclicReference super.haskell.packages.ghc925.cabal-fmt;
      });
      self.haskell.packages.ghc925.ormolu = fixCyclicReference super.haskell.packages.ghc925.ormolu;
      self.haskell.packages.ghc925.cabal-fmt = fixCyclicReference super.haskell.packages.ghc925.cabal-fmt;
    })
  ];
}
/nix/store/3yfs41f4b60jya2gk6xikx4s97zsxjr0-stdenv-linux/setup: line 1578:   487 Killed                  ./Setup haddock --html --hoogle --quickjump --hyperlink-source
error: builder for '/nix/store/awvc6kdfxlv9l904gjl6kbv56slw6z49-hls-cabal-fmt-plugin-0.1.0.0.drv' failed with exit code 137
error: 1 dependencies of derivation '/nix/store/jc87f5ihxp3l4kx731ypnykdnxznq3m6-haskell-language-server-1.9.0.0.drv' failed to build
error: 1 dependencies of derivation '/nix/store/ncb3yayx43mwsfvb7wg7q37l5nxhwb3g-haskell-language-server-1.9.0.0.drv' failed to build
error: 1 dependencies of derivation '/nix/store/9dk674w9pvb9j6fwzm5f25z79jard8kd-system-path.drv' failed to build
error: 1 dependencies of derivation '/nix/store/2fv01z15fns1bvh9gc4is4s0rxqfvnxj-nixos-system-nixps-23.05pre452935.724bfc08923.drv' failed to build
 - system: `"x86_64-linux"`
 - host os: `Linux 6.1.10, NixOS, 23.05 (Stoat), 23.05pre452326.fab09085df1`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.13.2`
 - channels(root): `"home-manager, nixos"`
 - channels(jasonbalaci): `"home-manager"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`
srid commented 1 year ago

@balacij "Killed" in the error message suggests that the kernel killed the build process? Out of memory?


By the way, for anyone that uses haskell-flake, I have just created a module making it easier to incorporate the workaround for this bug in their Haskell projects: https://github.com/srid/nixpkgs-140774-workaround

🚀

balacij commented 1 year ago

Hi @srid, thank you. Yes, it seemingly continuously rebuilds the plugins for haskell-language-server until my computer runs out of memory and kills the build.

sloane-shark commented 1 year ago

I revived/reworked the PR to workaround this for 9.2.x releases: #216857

shinzui commented 1 year ago

@matthewess Does this fix also work on 9.4.x?

sloane-shark commented 1 year ago

@shinzui I haven't tested it myself, you could check out my PR and try applying the patch to the appropriate configuration-ghc-9.4.x.nix file or whatnot

adrian-gierakowski commented 1 year ago

🎉