haskell / cabal

Official upstream development repository for Cabal and cabal-install
https://haskell.org/cabal

Sharing of object files between executable builds? #81

Closed by bos 9 years ago

bos commented 12 years ago

(Imported from Trac #89, reported by bjorn on 2006-07-17)

When building a package which contains multiple executables which share some modules, each module is recompiled for every executable. This can increase the compile time substantially. One problem with compiling each module only once is that different executables can have different compiler options, which can for example affect preprocessing. Maybe modules could be shared if all executables have the same compiler options?
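For illustration, a package of the kind described might look like the sketch below. The package, executable, and module names (`demo`, `tool-a`, `tool-b`, `Shared.Util`) are hypothetical and only show the shape of the problem: both executables list `Shared.Util`, so it is compiled once per executable, and the differing `ghc-options` show why blind sharing would not be safe.

```cabal
-- demo.cabal (hypothetical package used only for illustration)
name:          demo
version:       0.1
cabal-version: >= 1.2
build-type:    Simple

executable tool-a
  main-is:        ToolA.hs
  hs-source-dirs: src
  -- Shared.Util is compiled as part of tool-a ...
  other-modules:  Shared.Util
  build-depends:  base

executable tool-b
  main-is:        ToolB.hs
  hs-source-dirs: src
  -- ... and compiled again as part of tool-b.
  other-modules:  Shared.Util
  build-depends:  base
  -- Different options from tool-a, so the two object files may differ.
  ghc-options:    -O2
```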

bos commented 12 years ago

(Imported comment by @dcoutts on 2006-07-17)

Also requested in http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=293523 by John Goerzen (adding to cc list).

The technical problem here is that we have to register the library in place and then do the build against that library. We also have to arrange things so that when building the executables against the library, we do not end up pulling in the library's source files in preference to the compiled modules from the built library package. That's a bit tricky if the library and executable source files share the same source directories, as will often be the case. I'm not sure if it is possible with ghc --make to prefer the modules in a package to the locally found source files. If not, then it would depend on Cabal doing the build without --make, using ghc in single-shot mode. See #15.

bos commented 12 years ago

(Imported comment by guest on 2007-12-17)

I can't remember if this is already the case, but I think things are much simpler if Cabal only operates on files under dist/, e.g. dist/src/library and dist/src/executable3 for the library and the 3rd executable.

That way we don't have to worry about GHC making files like Foo_stub.c in the user's directory, that we'd then have to clean up later, nor do we have to worry about it finding the wrong files in the above case. We just have to remember which modules are in the library and not copy them into dist/src/executable*.

Ian Lynagh

bos commented 12 years ago

(Imported comment by guest on 2007-12-17)

I'd mention that this lack of sharing between sections has bitten me recently (as in an error and not an inefficiency sense) while I was updating Greencard.

So the Greencard package provides two main things: it provides a src/ directory containing all sorts of modules which get compiled down to the executable 'greencard-bin', and it provides a library, Foreign.Greencard, which naturally requires Greencard to be compiled.

The makefile accomplishes this by building greencard-bin in place in src/, and then running something like ../src/greencard-bin on the lib/ files. Everything comes out hunky-dory. But when put into Cabal format, it needs to be split between Executable and Library sections. So, suppose someone goes to install a Cabalized greencard. (This means no greencard is installed.) Cabal begins the configuration. It detects that no greencard is accessible - 'No greencard found'. It's otherwise happy, and it begins chasing down the files. Everything in 'Executable greencard'? Good. .lhs and .hs files, no sweat. Everything in 'Library'? Uh oh! This 'lib/Foreign/GreenCard.gc', it requires 'greencard'!

And the build barfs. Which is silly, since it could and would be able to build the greencard necessary for the library!

I think I'm going to have split greencard up, and have a greencard and greencard-lib, making greencard-lib depend on greencard. This was the solution I used the last time I encountered a problem like this...

gwern

bos commented 12 years ago

(Imported comment by @dcoutts on 2008-03-10)

Replying to guest:

I'd mention that this lack of sharing between sections has bitten me recently (as in an error and not an inefficiency sense) while I was updating Greencard.

I don't think this is your problem. Your problem is that there's no way to specify that a build-tool dependency is in fact provided by a particular package and further that there is no way to get that kind of dependency between libraries/executables in the same package.

The sharing is incidental, having it would not fix your problem.

It's otherwise happy, and it begins chasing down the files. Everything in 'Executable greencard'? Good. .lhs and .hs files, no sweat. Everything in 'Library'? Uh oh! This 'lib/Foreign/GreenCard.gc', it requires 'greencard'! And the build barfs. Which is silly, since it could and would be able to build the greencard necessary for the library!

It'd be able to build that executable but it would not know to use it as the greencard tool. See #227.

I think I'm going to have split greencard up, and have a greencard and greencard-lib, making greencard-lib depend on greencard. This was the solution I used the last time I encountered a problem like this...

Sounds sensible.
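A minimal sketch of what the library half of such a split might look like follows. The package name `greencard-lib` comes from the suggestion above; the version number and the exact module list are assumptions, and the sketch assumes Cabal recognises `greencard` as a build tool and preprocesses `.gc` files with it.

```cabal
-- greencard-lib.cabal (sketch; version and module list are assumptions)
name:          greencard-lib
version:       0.1
cabal-version: >= 1.2
build-type:    Simple

library
  hs-source-dirs:  lib
  -- Built from lib/Foreign/GreenCard.gc by the greencard preprocessor.
  exposed-modules: Foreign.GreenCard
  build-depends:   base
  -- Only checks that a greencard program can be found at configure time;
  -- it does not say which package provides it.
  build-tools:     greencard
```

As noted in the reply, nothing here records that the tool comes from the separate greencard package, so that package would still have to be installed first by hand.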

bos commented 12 years ago

(Imported comment by @dcoutts on 2008-03-10)

See also #276.

bos commented 12 years ago

(Imported comment by @kowey on 2008-08-17)

As requested by Duncan, I am noting here why this would be useful for the darcs team: we plan to introduce some sort of libdarcs at some point, but without sharing object files we would have to compile 130+ modules twice, once for libdarcs and once for darcs.

To make matters worse, we also have a make_authors and a preproc program which compiles our AUTHORS file and helps build our user manual respectively. We also used to have a make_changelog program. All of these programs re-use a good chunk of what will soon be libdarcs, so without proper sharing of object files, we would be compiling the same files at least 4 times.

Thanks!

bos commented 12 years ago

(Imported comment by @samb on 2008-11-12)

Not having this is also a problem for, e.g., compiling the same program with and without profiling in two different sections (for example, lhc and lhcp) when it uses TH: this would mean having to compile every module three times instead of twice to build it both ways.

bos commented 12 years ago

(Imported comment by guest on 2009-02-07)

Here's my patch to fix this issue, which Duncan is reviewing for me:

http://upcycle.it/~blackh/cabal/cabal-ticket-89-v5.darcs-send

bos commented 12 years ago

(Imported comment by @dcoutts on 2009-05-19)

Thanks to Stephen we've got this one done finally.

You can now have executables specify build-depends on the library in the same package. You must specify cabal-version: >= 1.8 to use this feature. Note that the executable must not specify the same hs-source-dirs as the library or it'll just pick up the source files rather than using the built library.
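A minimal sketch of a package using this feature is shown here. The package name `demo`, the module `Shared.Util`, and the directory names `lib` and `programs` are hypothetical; only the `cabal-version` bound and the separate source directories come from the description above.

```cabal
-- demo.cabal (hypothetical names throughout)
name:          demo
version:       0.1
cabal-version: >= 1.8
build-type:    Simple

library
  hs-source-dirs:  lib
  exposed-modules: Shared.Util
  build-depends:   base

executable tool-a
  -- A different directory from the library's, so GHC cannot find
  -- Shared/Util.hs as a local source file and uses the built library.
  hs-source-dirs: programs
  main-is:        ToolA.hs
  -- Depending on the package's own name refers to the library above.
  build-depends:  base, demo
```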

Remaining:

bos commented 12 years ago

(Imported comment by maltem on 2009-05-31)

I tried to test the new feature, but the current version in darcs is 1.7.2, so configuring a package fails:

Error: This package requires Cabal version: >=1.8

This is what I did, did I do something wrong? Or should I just “adjust” the version number for the test?

darcs get --partial http://darcs.haskell.org/cabal/
cd cabal
cabal install
cd ../my_package
ghc --make Setup -package Cabal-1.7.2
./Setup configure

bos commented 12 years ago

(Imported comment by blackh on 2009-07-03)

maltem, the work on ticket 89 is completed and checked except for some minor bits that are not quite finished.

However, the error message is telling lies. The minimum version for the ticket 89 feature to be activated is actually 1.7.1. (I assume the reason for this little porkie is that Duncan is intending 1.8 to be the version when it is released.) So, to test it, use:

cabal-version: >= 1.7.1

Another point is that GHC will always look in the build directory in preference to using a library, so if you want to make it treat your internally defined library just like an external one (and therefore build each source file only once), you have to use

hs-source-dirs: programs

or similar on either your library or executable definition or both with different directory names, to keep the source files separate.

There's an example of this in cabal/tests/PackageTests/BuildDeps/InternalLibrary1

bos commented 12 years ago

(Imported comment by maltem on 2009-07-05)

Splendid, thanks for the explanation. Also, looking forward to 1.8 :)

bos commented 12 years ago

(Imported comment by @dcoutts on 2009-07-05)

Feature included in Cabal-1.8 release.

Two bits remaining however:

bos commented 12 years ago

(Imported comment by AnttiJuhaniKaijanaho on 2009-12-19)

From what I understand, the patch above does not solve the original request: the useless rebuilding of Other-Modules shared by multiple Executables.

bos commented 12 years ago

(Imported comment by @dcoutts on 2010-01-27)

Replying to AnttiJuhaniKaijanaho:

From what I understand, the patch above does not solve the original request: the useless rebuilding of Other-Modules shared by multiple Executables.

Right the current system does not help for packages with several executables but no library. For that we need to support convenience / private libraries as in #276.

I do not plan to implement ad-hoc / implicit / opportunistic sharing of .o files. It is too hard to implement at the moment. It requires tracking lots of information to check if the sharing is safe. It might become possible in a future build system that obsessively tracks all dependencies.
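For readers arriving later: the "convenience / private library" idea can be sketched with a named internal library. As far as I know this is accepted by Cabal releases that understand cabal-version 2.0 or later; it was not available in the versions discussed in this thread. All names below (`demo`, `demo-internal`, `Shared.Util`, the directories) are hypothetical.

```cabal
cabal-version: 2.2
-- demo.cabal (hypothetical package with executables only, no public library)
name:          demo
version:       0.1
build-type:    Simple

-- Private library: compiled once, not exposed to other packages.
library demo-internal
  hs-source-dirs:   common
  exposed-modules:  Shared.Util
  build-depends:    base
  default-language: Haskell2010

executable tool-a
  hs-source-dirs:   tool-a
  main-is:          ToolA.hs
  build-depends:    base, demo-internal
  default-language: Haskell2010

executable tool-b
  hs-source-dirs:   tool-b
  main-is:          ToolB.hs
  build-depends:    base, demo-internal
  default-language: Haskell2010
```

The sharing here is explicit rather than opportunistic: the shared modules get one set of compile options, which is the property the comment above says implicit sharing would have to verify.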

bos commented 12 years ago

(Imported comment by AnttiJuhaniKaijanaho on 2010-01-28)

For me as a user, this feature request is obvious and it's rather disappointing that Cabal will not support it. A simple makefile-based system, or even running ghc --make by hand!, does better on this count.

What about allowing specifying a list of modules common to all executables? That would help me. (Or, of course, if the "private libraries" are lightweight enough that I just have to list the modules that are included in it, that's good enough.)

bos commented 12 years ago

(Imported comment by @dcoutts on 2010-01-28)

Replying to AnttiJuhaniKaijanaho:

For me as a user, this feature request is obvious and it's rather disappointing that Cabal will not support it. A simple makefile-based system, or even running ghc --make by hand!, does better on this count.

With the makefile you are explicitly sharing modules and you specify the compile options once for each source file. For ghc --make you simply get wrong results (it does not track when you change compile options).

As a design choice (one made long ago) Cabal lets you specify different compile options for the same source file when used in different components.

What about allowing specifying a list of modules common to all executables? That would help me. (Or, of course, if the "private libraries" are lightweight enough that I just have to list the modules that are included in it, that's good enough.)

That would be similar, though I think I prefer the private library approach; it's a bit more flexible. The "common modules" approach does not work for sharing modules between a library and an executable with ghc because the modules need to be compiled differently.

bos commented 12 years ago

(Imported comment by AnttiJuhaniKaijanaho on 2010-01-31)

Replying to @dcoutts:

For ghc --make you simply get wrong results (it does not track when you change compile options).

Which is relevant in sophisticated cases. I'm concerned that cabal makes simple things more complicated than they need to be.

As a design choice (one made long ago) Cabal lets you specify different compile options for the same source file when used in different components.

Which is a nice thing to have when you need it.

What about allowing specifying a list of modules common to all executables? That would help me. (Or, of course, if the "private libraries" are lightweight enough that I just have to list the modules that are included in it, that's good enough.)

That would be similar, though I think I prefer the private library approach; it's a bit more flexible. The "common modules" approach does not work for sharing modules between a library and an executable with ghc because the modules need to be compiled differently.

Similarly, I'd like a way to specify compilation options (such as warnings) common to all executables.

... Heh. Should I be opening a separate ticket? :)

bos commented 12 years ago

(Imported comment by @dcoutts on 2010-02-01)

Replying to AnttiJuhaniKaijanaho:

Replying to @dcoutts:

For ghc --make you simply get wrong results (it does not track when you change compile options).

Which is relevant in sophisticated cases. I'm concerned that cabal makes simple things more complicated than they need to be.

Just to note, this is not a model we ever intend to support in the "Simple" build system. Build systems should be purely functional.

The one concession we will need to make to a purely functional description is the ability to specify exceptions that some changed input will not trigger another function to be recalculated (eg while hacking, temporarily specifying that stage1 of a compiler changing will not cause rebuild of stage2).

... Heh. Should I be opening a separate ticket? :)

Probably. A common section is not a bad thing. Note that this would not give any sharing of build results. It's just a shortcut to putting the same options in each section.
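A sketch of what such a shortcut can look like follows. It uses the `common` stanza and `import` field that, to my knowledge, later Cabal releases accept from cabal-version 2.2 onward; nothing like it existed when this comment was written, and the package and stanza names are hypothetical.

```cabal
cabal-version: 2.2
-- demo.cabal (hypothetical names)
name:          demo
version:       0.1
build-type:    Simple

-- Options written once ...
common shared-options
  ghc-options:      -Wall
  build-depends:    base
  default-language: Haskell2010

-- ... and pulled into each section that wants them.
executable tool-a
  import:  shared-options
  main-is: ToolA.hs

executable tool-b
  import:  shared-options
  main-is: ToolB.hs
```

Consistent with the comment above, this only avoids repeating the fields; each executable is still built separately and no object files are shared.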

bos commented 12 years ago

(Imported comment by AnttiJuhaniKaijanaho on 2010-02-07)

Replying to @dcoutts:

Replying to AnttiJuhaniKaijanaho:

I'm concerned that cabal makes simple things more complicated than they need to be.

Just to note, this is not a model we ever intend to support in the "Simple" build system. Build systems should be purely functional.

I don't see how sharing build results would break pure functionality.

... Heh. Should I be opening a separate ticket? :)

Probably. A common section is not a bad thing. Note that this would not give any sharing of build results. It's just a shortcut to putting the same options in each section.

Opened as #630.

(I reiterate: not sharing build results is ugly. There is a passable argument against it in the general case, but in simple cases its lack just gives me another reason to NOT use Cabal when I can get away with it.)

bos commented 12 years ago

(Imported comment by guest on 2010-02-07)

Trying to build the example (using Setup.hs):

cabal/tests/PackageTests/BuildDeps/InternalLibrary1

Ends up passing to ghc: -package-id InternalLibrary1-0.1

Failing with: <command line>: cannot satisfy -package-id InternalLibrary1-0.1

Looks like Cabal should pass -package, or find the full hash for InternalLibrary1

I'm on ghc-6.12.1, Cabal-1.8.0.2

Adam Vogt

bos commented 12 years ago

(Imported comment by guest on 2010-02-28)

I've made a patch implementing the hack I outlined in the previous comment (which works for me):

http://code.haskell.org/~aavogt/cabal-1.8-dev/hack-to-correct-build-failure-with-ghc_6_12_1-_issue-89__.dpatch

or,

darcs pull http://code.haskell.org/~aavogt/cabal-1.8-dev

bos commented 12 years ago

(Imported comment by StephenBlackheath on 2010-02-28)

Here's my first attempt at a patch to fix the wrong-package-id/ghc-6.12 issue more or less the right way:

http://code.haskell.org/~StephenBlackheath/ticket89/ticket89-wrong-package-id-v1.darcs-send

Here is my patch for cabal install to fix the other remaining ticket 89 problem where 'cabal install' fails with "internal error: could not construct a valid install plan." on any package that depends on an internally-defined library:

http://code.haskell.org/~StephenBlackheath/ticket89/ticket89-cabal-install-v2.darcs-send

Duncan told me he can't get to this right now, but in the meantime I'd appreciate anyone that wants to test or review these patches.

bos commented 12 years ago

(Imported comment by guest on 2010-03-08)

Stephen's patches work for me.

Adam Vogt

bos commented 12 years ago

(Imported comment by @dcoutts on 2010-03-09)

Inspired by Stephen's patches I've committed these two fixes:

Sat Mar 20 18:21:08 CET 2010  Duncan Coutts <duncan@haskell.org>
- Fix local inplace registration for ghc-6.12
  This is for the case of intra-package deps where the lib has to be
  registered into a local package db. We use "-inplace" suffix for
  the local installed package ids (rather than using an ABI hash).
  
And for cabal-install:
Sat Mar 20 22:53:31 CET 2010  Duncan Coutts <duncan@haskell.org>
- Cope with intra-package deps when constructing install plans
  
The main difference is that instead of trying to work out what the ABI hash of the inplace package is, we just use the package name with an "-inplace" suffix for the local inplace installed package ID. Needs a little testing then I'll push to the stable branches.

bos commented 12 years ago

(Imported comment by StephenBlackheath on 2010-03-21)

Thanks, Duncan, your patches fix everything for me on my project.

bos commented 12 years ago

(Imported comment by @dcoutts on 2010-04-05)

It's been reported that cabal haddock is failing for packages using this feature.

bos commented 12 years ago

(Imported comment by PaulBrauner on 2010-04-13)

It looks like it doesn't solve the problem with a library of mine (cabal-install version 0.8.2 using version 1.8.0.4 of the Cabal library): I declare a library A and two executables B and C which build-depend on A. It compiles, but A gets compiled three times: once for the library, once for B and once for C. Is that "normal"?

bos commented 12 years ago

(Imported comment by StephenBlackheath on 2010-06-01)

Paul - You have to use the hs-source-dirs: option and make sure you are not building your executables in the same directory as your library, because GHC will always choose source files in the local directory in preference to libraries. It would certainly be nice to fix this, but it would need to be fixed in GHC.
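Applied to the layout Paul describes, the advice might look like the sketch below. The package name `demo` and the directory names are assumptions; the point is only that neither executable's `hs-source-dirs` contains the library's module `A`.

```cabal
-- Assumed layout on disk:
--   lib/A.hs      the library module
--   b-src/B.hs    main module of executable B
--   c-src/C.hs    main module of executable C
name:          demo
version:       0.1
cabal-version: >= 1.8
build-type:    Simple

library
  hs-source-dirs:  lib
  exposed-modules: A
  build-depends:   base

executable B
  -- Pointing this at "lib" (or at ".") would let GHC find A.hs as a
  -- local source file and compile it again for this executable.
  hs-source-dirs: b-src
  main-is:        B.hs
  build-depends:  base, demo

executable C
  hs-source-dirs: c-src
  main-is:        C.hs
  build-depends:  base, demo
```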

jsl commented 9 years ago

Based on the comments, it appears that this issue was fixed in 2010. I propose closing this ticket. Please open a new issue if this problem still exists.

/cc @tibbe