aelve / haskell-issues

An unofficial issue tracker for all things Haskell-related

Speed up builds by incorporating package build times into the build plan #54

Open sjakobi opened 7 years ago

sjakobi commented 7 years ago

When I build packages that depend on haskell-src-exts it often seems that the build of haskell-src-exts only starts once most other dependencies have finished. Then CPU utilization drops while everything waits for 2 or 3 minutes until haskell-src-exts finishes and the rest of the build can proceed.

It seems to me that a better build plan would let haskell-src-exts start much earlier to account for the fact that haskell-src-exts takes much longer to compile than most other packages.
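To make the idea concrete, here is a minimal sketch of a greedy scheduler that picks the ready package with the longest remaining dependency chain (its "critical path") first. It is not how Stack or Cabal schedule builds. Every package name other than haskell-src-exts, the dependency graph, and all build times are made up for illustration.

```python
import heapq

def dependents_of(deps):
    """Invert the dependency map: package -> packages that depend on it."""
    rev = {p: [] for p in deps}
    for pkg, ds in deps.items():
        for d in ds:
            rev[d].append(pkg)
    return rev

def critical_path(deps, cost):
    """Own build time plus the costliest chain of packages waiting on this one."""
    rev = dependents_of(deps)
    memo = {}
    def longest(p):
        if p not in memo:
            memo[p] = cost[p] + max((longest(q) for q in rev[p]), default=0)
        return memo[p]
    return {p: longest(p) for p in deps}

def simulate(deps, cost, workers, priority):
    """Greedy list scheduling on `workers` cores; returns (makespan, start times)."""
    rev = dependents_of(deps)
    remaining = {p: set(ds) for p, ds in deps.items()}
    ready = [p for p in deps if not deps[p]]
    running, start, now = [], {}, 0   # running is a heap of (finish_time, package)
    while ready or running:
        ready.sort(key=priority, reverse=True)   # stable: ties keep their order
        while ready and len(running) < workers:
            p = ready.pop(0)
            start[p] = now
            heapq.heappush(running, (now + cost[p], p))
        now, done = heapq.heappop(running)
        for q in rev[done]:
            remaining[q].discard(done)
            if not remaining[q]:
                ready.append(q)
    return now, start

# Hypothetical graph and build times in seconds (assumptions, not measurements).
deps = {"a": [], "b": [], "c": [], "d": [], "haskell-src-exts": [],
        "app": ["a", "b", "c", "d", "haskell-src-exts"]}
cost = {"a": 10, "b": 10, "c": 10, "d": 10, "haskell-src-exts": 150, "app": 5}

naive_makespan, naive_start = simulate(deps, cost, 2, lambda p: 0)
cp = critical_path(deps, cost)
cp_makespan, cp_start = simulate(deps, cost, 2, cp.get)

print(naive_makespan, naive_start["haskell-src-exts"])  # 175 20
print(cp_makespan, cp_start["haskell-src-exts"])        # 155 0
```

With two workers, the order that ignores timings starts haskell-src-exts only after the small packages are done, while the critical-path order starts it immediately and builds the small packages alongside it.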

The question is where the scheduler should get the expected (relative) build times from. The two basic possibilities seem to be to either:

Has anything similar been tried before?

neongreen commented 7 years ago

That's a nice idea. Since Hackage already has a build bot, perhaps the build times could be published on Hackage as a separate archive?

sjakobi commented 7 years ago

Pinging @ndmitchell who might know whether there's any prior work or may be aware of some concerns that need to be considered before anything of this kind can be implemented.

One aspect is that the scheduler will want to know exactly which packages are actually going to be built rather than reused from previous builds. I believe that's something that Stack discovers only while executing the build plan, as evidenced by the "using precompiled package" lines in the build log.

ndmitchell commented 7 years ago

The prior work in this area is limited, but I'd also question whether it's worthwhile... Generally you only know the time of something after you've done it once. You can sometimes get away with heuristics like having a list of "super expensive" packages and moving them as early on as possible, but much more than that tends to start bumping into how much data is available and how robustly it reproduces.
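As a rough sketch of the "super expensive" list heuristic: with no timing data at all, a planner could simply move the listed packages, plus whatever they depend on, to the front of the build order. The function names, the graph, and the `tool-x` and `pkg-*` packages below are hypothetical.

```python
def needed_for(deps, targets):
    """The target packages plus everything they transitively depend on."""
    seen, stack = set(), list(targets)
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(deps[p])
    return seen

def plan_order(deps, expensive):
    """Dependency-respecting order that puts expensive packages as early as possible."""
    urgent = needed_for(deps, [p for p in expensive if p in deps])
    order, done = [], set()

    def visit(p):
        if p in done:
            return
        done.add(p)
        for d in deps[p]:      # dependencies always come before the package itself
            visit(d)
        order.append(p)

    # Stable sort: urgent packages first, everything else in its original order.
    for p in sorted(deps, key=lambda p: p not in urgent):
        visit(p)
    return order

# Hypothetical dependency graph (an assumption for illustration).
deps = {
    "pkg-a": [],
    "pkg-b": [],
    "tool-x": [],
    "haskell-src-exts": ["tool-x"],
    "app": ["pkg-a", "pkg-b", "haskell-src-exts"],
}
order = plan_order(deps, {"haskell-src-exts"})
print(order)
# ['tool-x', 'haskell-src-exts', 'pkg-a', 'pkg-b', 'app']
```

A parallel scheduler would then hand out ready packages in this order, so the expensive build starts as soon as its own dependencies allow.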

Assuming you are talking about Stack or similar I'm yet to be convinced the parallelism is optimal (even assuming no timing information), and better parallelism will generally help. The other solution might be to directly look at what in haskell-src-exts causes long build times and fix that in GHC.

The solution Shake opts for is just to randomly order the things it does. That basically avoids the worst case most of the time, and that gets you most of the way there with no knowledge of timings.
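The random-ordering idea can be sketched as picking uniformly among whatever is ready to build at each step. This is only an illustration of the principle, not Shake's actual code. The function name and the example graph are assumptions.

```python
import random

def random_build_order(deps, rng):
    """Valid build order that chooses uniformly at random among ready packages."""
    remaining = {p: set(ds) for p, ds in deps.items()}
    ready = [p for p, ds in remaining.items() if not ds]
    order = []
    while ready:
        p = ready.pop(rng.randrange(len(ready)))   # random pick, no timing data
        order.append(p)
        for q, ds in remaining.items():
            if p in ds:
                ds.discard(p)
                if not ds:                         # last dependency just finished
                    ready.append(q)
    return order

# Hypothetical dependency graph (an assumption for illustration).
deps = {"a": [], "b": [], "c": [], "haskell-src-exts": [],
        "app": ["a", "b", "c", "haskell-src-exts"]}
order = random_build_order(deps, random.Random(0))
print(order)
```

Because no fixed rule decides the order, the slow package is not pushed to the end on every run, which is the worst case the comment above describes.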