NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License

haskellPackages: remove dependency of $out on $doc #87267

Open Profpatsch opened 4 years ago

Profpatsch commented 4 years ago

Problem statement: If the final goal is just to produce an executable (e.g. with justStaticExecutables), the doc outputs are completely uninteresting. However, rules_haskell still fetches the doc output of all dependencies because of a runtime dependency (see details). This is quite an overhead in practice: it makes CI without a permanent nix store a lot slower, because the -doc outputs have a considerable size.

Details:

The haskellPackages generic builder can split out the doc output; this was added by @cleverca22 a while ago:

https://github.com/NixOS/nixpkgs/blob/7af4f066881fabddc31c41cc0194a87c3aaa02fe/pkgs/development/haskell-modules/generic-builder.nix#L80

However, the $doc output will contain the .haddock file, which is referenced by the package.conf file in $out as haddock-interfaces, and thus makes every $out depend on its $doc output at runtime.

The HTML documentation in $doc can grow to several megabytes in size, while the .haddock file is only a small percentage of that (it’s a simple binary file). The .haddock file contains the data used e.g. by ghci’s doc command (docstrings as plain Strings and symbol names).

An example of the result/lib/ghc-8.8.3/package.conf.d/yaml-0.11.3.0-2nr6XpIyxjr5tjkYeQuXnR.conf from the haskellPackages.yaml output:

…
haddock-interfaces:
    /nix/store/an4pjhnzrw3yxbg15gwpn1vwdbp0nfrd-yaml-0.11.3.0-doc/share/doc/yaml-0.11.3.0/html/yaml.haddock

haddock-html:
    /nix/store/an4pjhnzrw3yxbg15gwpn1vwdbp0nfrd-yaml-0.11.3.0-doc/share/doc/yaml-0.11.3.0/html

The haddock-html field is used by haddock proper; it is only a build-time dependency, however.
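
To illustrate why these two fields create a runtime dependency: Nix finds references by scanning an output for the hash parts of other store paths. The following is a minimal Python sketch of that idea, not Nix's actual scanner; the function name and the regular expression are assumptions made for illustration, and the conf fragment is the one quoted above.

```python
import re

# A store path looks like /nix/store/<32-char hash>-<name>.
# The character class is looser than Nix's real base32 alphabet (assumption,
# good enough for a sketch).
STORE_PATH = re.compile(r"/nix/store/([0-9a-z]{32})-([A-Za-z0-9+._?=-]+)")


def referenced_store_paths(text: str) -> set[str]:
    """Return the distinct store paths mentioned anywhere in `text`."""
    return {f"/nix/store/{h}-{name}" for h, name in STORE_PATH.findall(text)}


# Fragment of the yaml package.conf quoted in the issue.
CONF = """\
haddock-interfaces:
    /nix/store/an4pjhnzrw3yxbg15gwpn1vwdbp0nfrd-yaml-0.11.3.0-doc/share/doc/yaml-0.11.3.0/html/yaml.haddock

haddock-html:
    /nix/store/an4pjhnzrw3yxbg15gwpn1vwdbp0nfrd-yaml-0.11.3.0-doc/share/doc/yaml-0.11.3.0/html
"""

refs = referenced_store_paths(CONF)
for path in sorted(refs):
    print(path)
```

Both fields point into the same -doc store path, so a single reference is found, and that one reference is enough to pull the whole doc output into the runtime closure of $out.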

Possible Solutions:

So I’m fairly certain we can remove the haddock-html from $out and only patch it back in for the build.

Con: removing the haddock-html fields from $out would destroy some of the uses of ghcWithPackages, since haddock wouldn’t be able to find the html files it needs to link against.
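
As a rough sketch of what stripping the haddock fields from a package conf could look like, here is a small Python example. It assumes the conf uses top-level "key: value" fields with indented continuation lines, as in the fragment quoted earlier; `drop_fields` is a hypothetical helper, not part of the generic builder, and the `name`/`version` lines merely stand in for the elided part of the real file.

```python
def drop_fields(conf: str, fields: set[str]) -> str:
    """Remove the named top-level fields, including their indented
    continuation lines and any blank lines that trail them."""
    kept = []
    skipping = False
    for line in conf.splitlines():
        if line and not line[0].isspace():
            # A non-indented, non-empty line starts a new field.
            key = line.split(":", 1)[0].strip()
            skipping = key in fields
        if not skipping:
            kept.append(line)
    return "\n".join(kept) + "\n"


# Illustrative input: name/version are stand-ins for the elided fields.
CONF = """\
name: yaml
version: 0.11.3.0
haddock-interfaces:
    /nix/store/an4pjhnzrw3yxbg15gwpn1vwdbp0nfrd-yaml-0.11.3.0-doc/share/doc/yaml-0.11.3.0/html/yaml.haddock

haddock-html:
    /nix/store/an4pjhnzrw3yxbg15gwpn1vwdbp0nfrd-yaml-0.11.3.0-doc/share/doc/yaml-0.11.3.0/html
"""

stripped = drop_fields(CONF, {"haddock-interfaces", "haddock-html"})
print(stripped, end="")
```

With both fields gone, the conf no longer mentions the -doc store path. A real implementation would still have to put the fields back for builds that run haddock, which is the con described above.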


An alternative approach I can think of: generate the GHC .conf into two separate outputs, one with the haddock dependencies and one without. The generic-builder would then reference the right output depending on whether doHaddock is true or false.

Con: more outputs, potentially breaks user code, so $out might have to simulate the current $out with the full config database at the previous place.


Another alternative approach is splitting the haddock phase into its own derivation.

Con: goes against the spirit of cabal, which expects things to be statefully declared in the configure phase. Con-Con: Would probably lead us to split building the Setup binary into another separate derivation, which makes the generic builder more incremental.

peti commented 4 years ago

Our discussion is available at https://www.twitch.tv/videos/615121714.

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.

jtojnar commented 2 years ago

Would it be feasible to move just the conf file into a separate output as is, for now? It would still pull doc outputs when doing any compilation, but at least it would not pull them for programs that are not statically built and are substituted from the binary cache.

Not sure how common the use case is, but for me the majority of CI runs consist of executing a program that rarely changes and rarely needs to be rebuilt. Compiling it statically just to avoid pulling in docs is annoying.

Edit: Looks like enableSharedExecutables defaults to false but the package still pulls in Haskell dependencies, so that should probably be tackled separately:

/nix/store/rp5jm8vj61l2cbji5vsy2mcjl3kzn5xp-nix-shell
└───/: …eclare -x HOST_PATH="/nix/store/72cbjdw9qs0rx8c6g9203jhb0g6vvcsb-my-hakyll-site-0.0.1/bin:…
    → /nix/store/72cbjdw9qs0rx8c6g9203jhb0g6vvcsb-my-hakyll-site-0.0.1
    ├───bin/site: …pandoc.pandoc_bindir./nix/store/23c746nqaivrb8f8jxdyj5p9czwc99z4-pandoc-2.14.0.3/bin.pandoc_libd…
    │   → /nix/store/23c746nqaivrb8f8jxdyj5p9czwc99z4-pandoc-2.14.0.3
    │   └───lib/ghc-8.10.7/package.conf.d/pandoc-2.14.0.3-1FkneAYH54w6jMExsM7ypC.conf: …dock-interfaces:.    /nix/store/5d0rkmlxyx4dvjx4lvxnvv0xja8p4>
    │       → /nix/store/5d0rkmlxyx4dvjx4lvxnvv0xja8p4482-pandoc-2.14.0.3-doc

Edit 2: Turns out the respective dependencies’ Paths modules are responsible, so we need to resolve this by nuking the references, like pkgs.pandoc does.
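
For readers unfamiliar with "nuking" references: the usual trick is to overwrite the hash part of the unwanted store path inside the binary with a same-length dummy hash, so the reference scanner no longer finds it and file offsets stay unchanged. The Python sketch below only illustrates that idea. It is assumed to mirror what the remove-references-to tool in nixpkgs does (replacing the hash with a run of `e` characters); the `nuke_reference` helper and the sample bytes are hypothetical.

```python
STORE_PREFIX = "/nix/store/"
HASH_LEN = 32


def nuke_reference(data: bytes, store_path: str) -> bytes:
    """Replace the hash of `store_path` in `data` with 32 'e' characters."""
    if not store_path.startswith(STORE_PREFIX):
        raise ValueError("not a store path: " + store_path)
    hash_part = store_path[len(STORE_PREFIX):].split("-", 1)[0]
    if len(hash_part) != HASH_LEN:
        raise ValueError("unexpected hash length in: " + store_path)
    old = (STORE_PREFIX + hash_part + "-").encode()
    new = (STORE_PREFIX + "e" * HASH_LEN + "-").encode()
    # Same length in, same length out: the binary layout is preserved.
    return data.replace(old, new)


# Hypothetical bytes resembling a Paths module string embedded in bin/site.
PANDOC = "/nix/store/23c746nqaivrb8f8jxdyj5p9czwc99z4-pandoc-2.14.0.3"
blob = b"pandoc_bindir\x00" + PANDOC.encode() + b"/bin\x00"

patched = nuke_reference(blob, PANDOC)
print(patched)
```

After patching, the executable no longer contains the pandoc store hash, so Nix would not register that library output (and, transitively, its -doc output) as a runtime dependency.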