23Skidoo opened this issue 12 years ago
This will be a huge win if it can make effective use of all cores. I've had quite a few multi-minute builds of individual packages, where the newly added per-package parallelism only helps with dependencies during the very first build, but not at all during ongoing development.
@bos The main obstacle here is reloading of interface files, which slows down the parallel compilation considerably compared to `ghc --make`. See e.g. Neil Mitchell's Shake paper, where he found that "building the same project with ghc --make takes 7.69 seconds, compared to Shake with 11.83 seconds on one processor and 7.41 seconds on four processors." So far, the most promising approach seems to be implementing a "compile server" mode for GHC.
An e-mail from @dcoutts that describes the "compile server" idea in more detail:
So here's an idea I've been mulling over recently...
For IDEs and build tools, we want a ghc api interface where we have very
explicit control over the environment in which new modules are compiled.
We want to be in full control, not using --make, and not using any
search paths etc. We know exactly where each .hi and .o file for all
dependent modules are. We should be able to build up an environment of
module name to (interface, object code) by starting from empty, adding
packages and individual module (.hi, .o) files.
Now that'd give us an api a lot like the current command line interface
of ghc -c single shot mode, except that we would be able to specify .hi
files on the command line rather than having ghc find them by searching.
But once we have that api, it'll be useful for IDEs, and useful for a
ghc server. This should give us the performance advantages of ghc --make
but still give us the control and flexibility of single shot mode. I'll
come to parallel builds in a moment.
The way it'd work is you start the server with some initial environment
(e.g. the packages) and you tell it to compile a module, then you can
tell it to extend its environment e.g. with the module you just compiled
and use the extended environment to compile more modules. So clearly you
could do the same thing as ghc --make does but with the dependency
manager being external to ghc.
Now for parallelism. Suppose we have two cores. We launch two ghc server
processes with the same initial package environment. We start compiling
two independent modules. Now we load the .hi files into *both* ghc
server processes to compile more modules. (In practice we don't load
them into each server when they become available, rather we do it on
demand when we see the module we need to compile needs the module
imports in question based on our module dep graph).
So, a short analysis of the number of times that .hi files are loaded:
In the current ghc --make mode, each .hi file is loaded once. So let's
say M modules. In the current ghc -c mode, for M modules we're loading
at most M * M/2 modules (right?) because in a chain of M modules we have
to load all previous .hi files for each ghc -c invocation.
In the hypothetical ghc server mode, with N servers, the worst case is
something like M * N module loads. Also, the N is parallelised. So the
single threaded performance is the same as --make. If you use 8 cores,
the overhead is 8 times higher in total, but distributed across 8 cores
so the wall clock time is no worse.
Actually, it's probably more sensible to look not at the cost of loading
the .hi files for M modules, but for P packages which is likely the
dominant cost. Again, it's P cost for the --make mode, and M * P for the
ghc -c mode, but N * P for the server mode. So this means it might not
be necessary to do the whole-package .hi file optimisation since the
cost is dramatically reduced.
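To make the counts above concrete, here is a small worked example; the figures (M=400 modules, N=8 servers) are made up purely for illustration:

```shell
# Worked example of the interface-file load counts discussed above.
# M = number of modules, N = number of ghc server processes.
# All figures are illustrative.
M=400
N=8
echo "ghc --make:   $M loads"                 # each .hi loaded once
echo "ghc -c:       $(( M * M / 2 )) loads"   # worst case for a module chain
echo "ghc --server: $(( M * N )) loads"       # and these are spread over N cores
```

With these made-up numbers the server mode does 3,200 loads against 80,000 for plain `ghc -c`, and the 3,200 are distributed across the 8 servers.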
So overall then, there's two parts to the work in ghc: extend the ghc
api to give IDEs and build managers this precise control over the
environment, then extend the main ghc command line interface to use the
new ghc api feature by providing a --server mode. It'd accept inputs on
stdin or something. It only needs very minimal commands: extend the
environment with a .hi .o pair and compile a .hs file. You can assume
that packages and other initial environment things are specified on the
--server command line.
Finally if there's time, add support for this mode into cabal, but that
might be too much (since that needs a dependency based build manager).
I'll also admit an ulterior motive for this feature, in addition to use
in cabal, which is that I'm working on Visual Studio integration and so
I've been thinking about what IDEs need in terms of the ghc api and I
think very explicit control of the environment is the way to go.
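As a sketch, the proposed server mode might be driven like this, with the build manager feeding commands on stdin. Note that the `--server` flag and both commands are entirely hypothetical; no such mode exists in GHC:

```
# Hypothetical: neither --server nor these commands exist in GHC today.
$ ghc --server -package base -package containers
> compile A.hs A.hi A.o    # compile using the current environment
> extend A A.hi A.o        # add module A's (.hi, .o) pair to the environment
> compile B.hs B.hi B.o    # B imports A; A.hi now comes from the environment
```

The two commands match the minimal set described above: extend the environment with a .hi/.o pair, and compile a .hs file. An external dependency manager (cabal or an IDE) decides when to issue each.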
Even though using `ghc -c` leads to a slowdown on one core, having it as an option (for people with more cores) in the meantime seems worthwhile to me.
@tibbe, I thought the point was that `ghc -c` doesn't break even until 4 cores. Mind you, Neil was surely testing on Windows, where the OS and filesystem could be reasonably expected to hurt performance quite severely.
@bos I've heard the number 2 tossed around as well, but we should test and see. Doing parallelism at the module level should also expose many more opportunities for parallelism. The current parallel build system suffers quite a bit from the lack of it (since there are lots of linear chains of package dependencies).
What about profiling builds? Due to the structure of the compilations (exactly the same things are built as in a normal compilation), I'd guess they might easily be run in parallel, and we might save almost 2x time.
@nh2 Parallel `cabal build` will make this possible.
I am currently working on this. I got good results with ghc-parmake for compiling large libraries and am now making executables build in parallel.
@nh2 Cool! BTW, I proposed this as a GSoC project for this summer. Maybe we can work together if my project gets accepted?
@nh2

> I got good results with ghc-parmake for compiling large libraries

I'm interested in the details. How large was the speedup? On how many cores? In my testing, the difference was negligible.
> How large was the speedup? On how many cores?
The project I'm working on has a library with ~400 modules and 40 executables. I'm using an i7-2600K with 4 real (8 virtual) cores. For building the library only, I get:
* cabal build: 4:50 mins
* cabal build --with-ghc=ghc-parmake --ghc-options="-j 2": 4:20 mins
* cabal build --with-ghc=ghc-parmake --ghc-options="-j 4": 3:00 mins
* cabal build --with-ghc=ghc-parmake --ghc-options="-j 8": 2:45 mins
I had to make minimal changes to ghc-parmake to get this to work, and thus got a 2x speedup almost for free :)
As you can see, the speed-up is not as big as we can probably expect from `ghc --make` itself being parallel or your `--server` - due to the caching, those should be a good bit faster, and I hope your project gets accepted. I'd be glad to help a bit if I can - but while I'm ok with hacking around on cabal, I've never touched GHC.
Building the executables in parallel is independent from all this and will also probably be a small change.
> * cabal build: 4:50 mins
> * cabal build --with-ghc=ghc-parmake --ghc-options="-j 2": 4:20 mins
> * cabal build --with-ghc=ghc-parmake --ghc-options="-j 4": 3:00 mins
> * cabal build --with-ghc=ghc-parmake --ghc-options="-j 8": 2:45 mins
Nice to hear that it can give a noticeable speedup on large projects. I should try testing it some more.
> Building the executables in parallel is independent from all this and will also probably be a small change.
Maybe if you don't integrate `build -j` and `install -j`. Then you won't need to implement the IPC design sketched above.
@23Skidoo I made a prototype at https://github.com/nh2/cabal/compare/build-executables-in-parallel. It would be nice if you could take a look.
* The copying of `Semaphore` and `JobControl` from `cabal-install` is not so nice. Is that the way to go nevertheless, or should they be moved to some `Internal` package in `Cabal`? Update: We are discussing that here.
* It looks like I can't use macros (`MIN_VERSION_base`) in the `Cabal` package - is that correct? The way I work around it is very ugly (just using the deprecated old functions in `Exception`, creating warnings).
* We probably want to make parallel jobs a config setting as well, or use the same number as the existing `--jobs`.

Feedback appreciated.
I have updated my branch to fix some minor bugs in my code. I can now build my project with `cabal build --with-ghc=ghc-parmake --ghc-options="-j 8" -j8` to get both parallel library compilation and parallel executable building.
The questions above still remain.
@nh2 Thanks, I'll take a look.
@nh2
> The copying of Semaphore and JobControl from cabal-install is not so nice. Is that the way to go nevertheless or should they be moved to some Internal package in Cabal?

Can't you just export them from `Cabal` and remove the copies in `cabal-install`?
> It looks like I can't use macros (need MIN_VERSION_base) in the Cabal package - is that correct?
Yes, this doesn't work because of bootstrapping. You can do this, however:
```haskell
#if !defined(VERSION_base)
-- we're bootstrapping, do something that works everywhere
#else
#if MIN_VERSION_base(...)
...
#else
...
#endif
#endif
```
Or maybe we should add a `configure` script.
> Yes, this doesn't work because of bootstrapping. You can do this, however
Good idea, but when we do the "something that works everywhere", we will still get the warnings, this time only in one of the two phases.
> Or maybe we should add a configure script.
If that would be enough to find out the version of `base`, that sounds like the better solution. I don't know what the reliable way to find that out is, though.
I have another idea - since Cabal only supports building on GHC nowadays, you can use
```haskell
#if __GLASGOW_HASKELL__ < 700
-- Code that uses block
#else
-- Code that uses mask
#endif
```
@nh2
> We probably want to make parallel jobs a config setting as well, or use the same number as the existing --jobs.
We can make `cabal build` read the `jobs` config file setting, but it shouldn't be used when the package is built during the execution of an install plan (since there's no way to limit the number of parallel build jobs from `cabal install` ATM).
> `__GLASGOW_HASKELL__`

Nice, pushed that.
> I haven't rebased on the latest master yet

Just rebased that.
My GSoC 2013 project proposal has been accepted.
Awesome! Let's give this build system another integer factor speedup! :)
> We can make cabal build read the jobs config file setting, but it shouldn't be used when the package is built during the execution of an install plan (since there's no way to limit the number of parallel build jobs from cabal install ATM).
Do you mean with this: when we use `install -j` and `build -j`, we get more than `n` (e.g. `n*n`) jobs because the two are not coordinated?
> Do you mean with this: When we use install -j and build -j, we get more than n (e.g. n*n) jobs because the two are not coordinated?
Yes. The plan is to use an OS-level semaphore for this, as outlined above.
That's what I meant, sounds good. We should use this semaphore here. This way we get parallel profiling lib building for free with install -j.
@tibbe
> That's what I meant, sounds good. We should use this semaphore here. This way we get parallel profiling lib building for free with install -j.
Yes, that's the plan.
> @23Skidoo I made a prototype at https://github.com/nh2/cabal/compare/build-executables-in-parallel. It would be nice if you could take a look.
I made a pull request for this https://github.com/haskell/cabal/pull/1540, rebased on current master. It's much easier to not lose track of things when they are in pull request form.
@23Skidoo Please tell me if you made recent changes that I should make use of there.
@23Skidoo I believe this is done now, right? Or are you still waiting to submit your PR?
I need to rework #1572; @dcoutts doesn't want to merge it in the current state. I hope to get it into 1.20.
@23Skidoo what happened to this? Would this need to be restarted from scratch?
There's `ghc --make -j` now, which is used by `cabal build` automatically, though not by `cabal install -j`. Perhaps we should close this ticket and open a new one for making `cabal install -j` use `ghc --make -j`. #4174 is also relevant as a possible alternative to `--make -j`.
@23Skidoo is that also true for `new-build`?
I'm curious as `cabal new-build cabal-install --allow-newer` seems not to build in parallel.
I think it should. If you run it with `-v3`, it'll show you the ghc invocation string. You can also try `new-build -j`.
On second thought, since `new-build` also installs dependencies, it probably suffers from the same problem as `install -j` and doesn't use `ghc --make -j`.
I didn't manage to go through the massive amount of output `-v3` produces. However, looking at the processes, I see a single ghc process running, where I'd like to see 4 or preferably even 8. Therefore I believe `cabal new-build -j` does not parallelize :-(
Yep, see my second comment. We should open a ticket for making `install`/`new-build -j` use `ghc --make -j`. I have some initial code on this branch: https://github.com/23Skidoo/cabal/commits/num-linker-jobs
@23Skidoo I'm perfectly fine with the idea of opening a second ticket, which details this; and closing this ticket. Please do! I just wanted to note down how this behaved on my system.
When doing so, just don't forget that `ghc --make -j` is horribly inefficient and needs RTS flags (like described on http://trofi.github.io/posts/193-scaling-ghc-make.html) to be of any use (otherwise it will be slower than non-`-j` builds).
@nh2 so presumably we want `ghc --make -j +RTS -A256M -qb0 -RTS`? @trofi do you agree? Will this eventually be in GHC?
So, based on this discussion, there seem to be two distinct issues:

1. `ghc --make -j` scaling poorly, and
2. cabal not passing `-j` to `ghc` when building things in parallel.

(2) should not be hard to implement, although it will be harder to demonstrate that it actually speeds things up (due to the fact that `ghc -j` scales pretty horribly). Perhaps we should make a separate ticket for it, though I don't think we should close this one.
If someone outlines how to do this properly, I might take a look at doing it. My main quibble is that "GHC is slow" is a known theme, and if we aggravate this with cabal, we are potentially leaving build performance on the table.
@angerman Well, which strategy did you want to do?
@ezyang what's the optimal solution we can hope for here? Turn cabal into using shake (#4174) and then see how much of the additional build server we need?
@23Skidoo, if I understand correctly, might already have part of (2) done?
As I understand it, here is the optimal solution:
In the end you have a single Shake build system which knows about parallelism down to the individual Haskell source file level, and can do maximal parallelism.
...I don't expect this to be implemented any time soon.
I believe those three items can be built in separate steps with increasing productivity with each additional layer?
Could cabal produce a cmake (or make) file instead and feed that into a build system? I believe that's what ghc-cabal does to some extent?
> I believe those three items can be built in separate steps with increasing productivity with each additional layer?
Yes.
> Could cabal produce a cmake (or make) file instead and feed that into build system? I believe that's what ghc-cabal does to some extend?
Well, depends on what you mean by make. `ghc -M` knows how to create a Makefile for building Haskell. Cabal itself, in principle, knows about dependencies between components in a package, but this information isn't reified anywhere in the code today. Plus, a large amount of the processing is done by Haskell code in process, so your Makefile wouldn't have anything to call.
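For reference, this is roughly what `ghc -M` produces; the module names and paths below are made up for illustration:

```
$ ghc -M -dep-makefile .depend Main.hs
$ cat .depend
# DO NOT DELETE: Beginning of Haskell dependencies
Utils.o : Utils.hs
Main.o : Main.hs
Main.o : Utils.hi
# DO NOT DELETE: End of Haskell dependencies
```

Note that it emits only dependency edges, not build recipes, which is exactly the point above: the Haskell-code processing Cabal does during a build has no command-line equivalent a Makefile could call.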
@ezyang Great! So one could start with the GHC build server for example and get somewhere.
What I'm wondering is if cabal, which has the notion of packages, targets and corresponding Modules/Files, (and flags), could generate a Makefile (or CMakeLists.txt -- assuming there was some plumbing for haskell), that exposed those targets.
And at least in the case of CMake, could be used to generate ninja files, which then in turn could be compiled using ninja as a build system, or even shake (which as far as I understand, can read ninja files).
This brings another question up. I believe there is some `ghc --make` powered by shake floating around somewhere; would investing to get that into ghc proper help us with the `-j` scaling?
> What I'm wondering is if cabal, which has the notion of packages, targets and corresponding Modules/Files, (and flags), could generate a Makefile (or CMakeLists.txt -- assuming there was some plumbing for haskell), that exposed those targets.
The big problem is that many operations which need to be done while building can't be characterized as just "run this and that command." There's a lot of Haskell code that gets run during a build, that needs to get run, and is not exposed as a command in any way. Shake has a similar problem: it's more expressive than ninja, so you can't take a Shake build system and turn it into ninja.
> I believe there is some ghc --make powered by shake floating around somewhere, would investing to get that into ghc proper, help us with the -j scaling?
There are two. You have https://github.com/ndmitchell/ghc-make which is implemented by calling `ghc -c` (you get parallelism but it is slower than `--make` sequentially), and https://github.com/ezyang/ghc-shake which uses the GHC API and cannot be parallelized out of process.
Neither of these can be put into GHC because they depend on Shake and GHC does not want to take Shake on as a boot library at this time.
Updated summary by @ezyang. Previously, this ticket talked about all sorts of parallelism at many levels. Component-level parallelism was already addressed in #2623 (fixed by per-component builds), so all that remains is per-module parallelism. This is substantially more difficult, because right now we build by invoking `ghc --make`; achieving module parallelism would require teaching Cabal how to build using `ghc -c`. But this too has a hazard: if you don't have enough cores/have a serial dependency graph, `ghc -c` will be slower, because GHC spends more time reloading interface files. In https://github.com/haskell/cabal/issues/976#issuecomment-7016707 @dcoutts describes how to overcome this problem.

There are several phases to the problem:
1. First, building the GHC build server and parallelism infrastructure. This can be done completely independently of Cabal: imagine a program which has a command line identical to GHC's, but is internally implemented by spinning up multiple GHC processes and farming out the compilation process. You can tell if this was worthwhile when you get scaling better than GHC's built-in `-j` and a traditional `-c` setup.
2. Next, we need to teach Cabal/cabal-install how to take advantage of this functionality. If you implemented your driver program with exactly the same command line flags as GHC, then this is as simple as just passing `-w $your_parallel_ghc_impl`. However, there is a problem with doing it this way: cabal-install will attempt to spin up N parallel package/component builds, each of which will in turn try to spin up M GHC build servers; this is bad; you want the total number of GHC build servers to equal the number of cores. So then you will need to set up some sort of signalling mechanism to keep too many build servers from running at once, OR have cabal new-build orchestrate the entire build down to the module level so it can plan parallelism (but you would probably have to rearchitect according to #4174 before you can do this).

Now that the package-level parallel install has been implemented (see #440), the next logical step is to extend `cabal build` with support for building multiple modules, components and/or build variants (static/shared/profiling) in parallel. This functionality should also be integrated with `cabal install` in such a way that we don't over- or underutilise the available cores.

A prototype implementation of a parallel `cabal build` is already available as a standalone tool. It works by first extracting a module dependency graph with `ghc -M` and then running multiple `ghc -c` processes in parallel.

Since the parallel install code uses the external setup method exclusively, integrating parallel `cabal build` with parallel install will require using IPC. A single coordinating `cabal install -j N` process will spawn a number of `setup.exe build --semaphore=/path/to/semaphore` children, and each child will be building at most N modules simultaneously. An added benefit of this approach is that nothing special will have to be done to support custom setup scripts.

An important issue is that compiling with `ghc -c` is slow compared to `ghc --make` because the interface files are not cached. One way to fix this is to implement a "build server" mode for GHC. Instead of repeatedly running `ghc -c`, each build process will spawn at most N persistent ghcs and distribute the modules between them. Evan Laforge has done some work in this direction.

Other issues:
* `cabal repl` (… patches).
* `build-type: Simple`.
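The prototype's strategy described in the summary (extract the module graph with `ghc -M`, then run independent `ghc -c` jobs in parallel) can be approximated with nothing but make. A rough sketch, with made-up paths, and ignoring preprocessing, package flags, and Template Haskell complications:

```
# Sketch only: a real implementation needs per-package flags, etc.
ghc -M -dep-makefile .depend Main.hs   # write module-level dependencies
cat > Makefile <<'EOF'
include .depend
%.hi %.o : %.hs
	ghc -c $<                      # one single-shot compile per module
EOF
make -j4 Main.o                        # make runs independent ghc -c jobs in parallel
```

This is essentially what ghc-parmake and ghc-make automate, and it exhibits the interface-reloading overhead discussed throughout this thread, which is what the build-server mode is meant to eliminate.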