Open purefn opened 4 years ago
The excessive ghc-pkg
invocations would be another item to look into. Ideally we’d cache the unit-ids in nix and wouldn't have to call ghc-pkg
that often.
It would certainly be great if we could produce some flame graph from a nix build. And maybe aggregate the time spent of actual ghc invocations and the time in between. Better numbers would certainly help. Being three times as slow would indicate that we spend 2/3rds if the time not actually compiling anything.
I've been looking at the builder/make-config-files.nix
file and wondering about speeding it up a bit. If you can elaborate on some ideas I can do a bit more research into it and see if I can find an approach that works. One thing I was thinking was that rather than using ghc-pkg register
for dependencies we could just link to the libraries package configurations and then do a single ghc-pkg recache
, but I'm not sure if 1) it would help, or 2) it would even work properly.
I'd be happy to take a look into producing flame graphs. Any ideas on what tools to use to do that?
@purefn I'm afraid I don't know how about profiling tools :-/
We basically call ghc-pkg
a lot in this:
thus if we could stick the id
and name
somehow into the pkg's derivation, we wouldn't need to look it up each time.
On the other hand, if we patch cabal
to just accept a package database, and not run the solver without requireing to pass --exact-configuration
and all the other constraints, that would make it much easier as well.
Thus an optimization in nix, might be to stick the id
and name
somehow into the compuited derivations (e.g. passthru
?). But an even better solution (that would also allow us to throw out large chunks of nix code) would be to teach cabal
to not run the solver and accept a package db that contains exactly one package for each dependency.
Then again, I don't know how much time we spend in each phase. So maybe a good first step would be to have some trace marks in the shell output. I guess we could use time
on the setup build
invocations and tally the time outputs from the nix-build
log up, comparing them to the total time we spend on nix-build
. That should at least give us some idea of how long we spent on just compiling.
I had a similar thought about using time
. To start with I wrapped the runCommand
script that builds those config file with time and then timed a build, after having set my nix max jobs configuration value to 1. For my project, a complete build of exes and tests took 149m50.027s.
I tallied up the times that the make config files script ran. It took a total of 87m636s of that build time. Eliminating most of that time would bring the build time down to 2x what it takes with stack, so I definitely think this is a low hanging fruit that should be explored.
I'll see if I can get some time to look into caching more of those id
and name
values.
So we know that we spend about 90/150 or 3/5 in https://github.com/input-output-hk/haskell.nix/blob/5fa379ebc75c6f960d9f03aa857c0e0350e4dbe1/builder/make-config-files.nix#L46
That seems quite extreme! I believe it would be quite helpful to know how much time we spend in the first half: https://github.com/input-output-hk/haskell.nix/blob/5fa379ebc75c6f960d9f03aa857c0e0350e4dbe1/builder/make-config-files.nix#L47-L77
and in the second half: https://github.com/input-output-hk/haskell.nix/blob/5fa379ebc75c6f960d9f03aa857c0e0350e4dbe1/builder/make-config-files.nix#L78-L130
If the second half really dominates this, then making the exactDep
logic faster should help a lot. After talking to @phadej about this, he recommended using a haskell script instead.
We've essentially put all the libraries we want into $out/package.conf.d register
. So we could probably just load that package database and dump out all the information in the format we need it to for configure-flags
file.
The script @phadej proposed, and which just dumps the information, but doesn't do th enecessary formatting yet is:
module Main (main) where
import Distribution.InstalledPackageInfo (InstalledPackageInfo (..))
import Distribution.Pretty (prettyShow)
import Distribution.Simple.Compiler (PackageDB (..))
import Distribution.Simple.Program.HcPkg (dump, HcPkgInfo (..))
import Distribution.Simple.Program.Types (ConfiguredProgram, simpleConfiguredProgram, ProgramLocation (..))
import Distribution.Verbosity (silent)
import System.Environment (getArgs)
main :: IO ()
main = do
(ghcpkg : arg : _) <- getArgs
let cp :: ConfiguredProgram
cp = simpleConfiguredProgram "ghc-pkg" (FoundOnSystem ghcpkg)
let hcPkgInfo = HcPkgInfo cp
-- this boolean flags are irrelevant for dump
False False False False False False False False
-- ^ -- package-conf toggler for very old GHCs
ipis <- dump hcPkgInfo silent (SpecificPackageDB arg)
mapM_ (putStrLn . prettyShow . installedUnitId) ipis
to be invoked like
runghc GhcPkgScript.hs ghc-pkg /path/to/package.db
That being said, if we instead spend a massive amount of time in the first half, we may need to think about constructing the package's package-db more intelligently :-/
I have some local changes that seem promising. I first worked on the bit generating the cabal.config
, configure-flags
, and ghc-environment
. The change I made was to basically create those files for each package built as part of the install phase, then in packages that depend on those just cat ${p}/nix/configure-flags >> $out/configure-flags
, and so on.
I didn't see much improvement from doing that. There is a little, but it isn't hugely significant. So I moved onto improving how the package db was built. I ended up just doing the same thing the builtin nixpkgs Haskell infrastructure does and copying the ${p}/package.conf.d/*.conf
files of dependencies, then doing a single ghc-pkg recache
at the end. I also removed the ghc-pkg check
at the end.
The combination of these things seems to have improved the times quite a bit. I'm still in the middle of a build, but I can say that were I was seeing some packages that took more than a minute in the make-config-files part, they now only take a few seconds. Most packages spend around half a second in this part, which I think is more reasonable.
I'll post back later with numbers for a better comparison.
ghc-pkg recache
is how cabal-install
registers stuff into its store. I won't be surprised if performance is one of the motivations (another one, is that it allows registering packages with same pkg-id
but different unit-id
s there).
With the changes I mentioned above the total build time was reduced to 65m40.419s and the time that the make-config-files takes was reduced to 4m49.644s.
These numbers are much more reasonable! I want to clean a few things up and then I'll open up a PR.
@purefn so we went from ~150min to ~65min for single core compilation? Impressive! How much faster is stack
with single core compilation?
It looks like many of the changes in here have been merged?
Should this issue be closed then maybe?
If not, I'll add in a data point of stack taking 4 minutes and haskell.nix taking 12 minutes with a recent commit.
I'd be happy to debug, profile, and hack at this performance issue further but I'd need some direction on how to do that.
@codygman is this with a fresh stack build or with a warm cache? Taking 3x as much time would indicate to me that haskell.nix is building something that stack isn't? I somehow doubt it's some parallelism that stack manages to exploit?
@angerman I did a stack clean
before-hand, I can do a stack clean --full
when I'm back at the appropriate computer to be sure.
@codygman that would be helpful. I'm not that familiar with stack
; nix will however start from scratch each time to force reproducibility, except for cached libraries. If we build the same and haskell.nix manages to consume 3x more time something seems really off, and we should investigate this!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I am working with a rather large Haskell codebase. I've been working on getting it to build with nix for a few weeks. At first I tried with the Haskell infrastructure built into nixpkgs, but the build times were so bad that I had to abandon it. I decided to try out haskell.nix because it addressed some of the problems with nixpkgs, mainly the serialized library, exe, test problem. That did significantly reduce the amount of time it took to build our project, but it seems it's not enough.
With stack we're seeing approximately 20 minute build times from a clean state. With nix and haskell.nix we're seeing around 60 minute build times. That's down from what it was because I forked haskell.nix and switched over to using ld.gold for linking instead of ld.. I also made sure we're using a nixpkgs revision that includes the recent speed up improvements from avoiding subshells.
We have about 452 packages in total that get built on a clean build. Thanks to nix caching, only about half of them get built during many builds. But those builds still take around 30 minutes to complete. Is there any other low hanging fruit (other than refactoring our package structure so fewer things need to be built in an "average" build)? I've noticed that ghc-pkg runs frequently and seems to take more time than during a stack build (though I lack the numbers to back that up), is there any way of improving that?