snowleopard / hadrian

Hadrian: a new build system for the Glasgow Haskell Compiler. Now merged into the GHC tree!
https://gitlab.haskell.org/ghc/ghc/tree/master/hadrian
MIT License
208 stars 39 forks source link

Generic library rules #571

Closed alpmestan closed 6 years ago

alpmestan commented 6 years ago

This patch implements the suggestion by @angerman in #538 -- to "register" the rules for building libraries (.a, .so, .dylib, etc) without having to parse any Cabal file. This dramatically decreases the "boot" time of hadrian and will, as soon as we address the problem I describe below, allow us to use the -c option again.

Instead of defining a whole bunch of precise rules for <build root>/stage<N>/<path/to/lib/in/ghc/tree>/build/libHS... for each package, we just define catch-all rules for each type of library (extension, really) and use simple parsec parsers (parsec is already a dependency, through Cabal, since ghc 8.4.1 / Cabal 2.2) to deconstruct the path and basically turn it into a suitable library Context.

The remaining issue: I still seem to be able to build a GHC just fine if I run the ./boot && ./configure step myself explicitly, but not if I start from a clean source tree and run hadrian with -c. The error I get in that case is:

...
[hadrian/cfg/system.cfg, settings, mk/config.h]
about to need ["hadrian/cfg/system.config.in","settings.in","mk/config.h.in"]
about to run ./configure
in doWith
builder ready
shakeArgsWith                      0.000s    0%                           
Function shake                     0.272s   65%  =========================
Database read                      0.001s    0%                           
With database                      0.000s    0%                           
Running rules                      0.140s   33%  ============             
Pool finished (14 threads, 4 max)  0.000s    0%                           
Total                              0.414s  100%                           
Build system error - indirect recursion detected:
  Key value 1:  OracleQ (KeyValue ("hadrian/cfg/system.config","host-os"))
  Key value 2:  hadrian/cfg/system.config
  Key value 3:  hadrian/cfg/system.config settings mk/config.h
  Key value 4:  OracleQ (KeyValue ("hadrian/cfg/system.config","project-version"))
Rules may not be recursive

(the first few lines come from debugging output I added -- see the patch)

Does anyone have any idea of where that cycle comes from and why my patch makes this happen?

ndmitchell commented 6 years ago

Try running with -j1 and you might get a better description of the cycle.

alpmestan commented 6 years ago

@ndmitchell Thanks for the tip! In this particular case though it doesn't seem to unlock a more precise description:

$ hadrian/build.sh -c -j1 --trace
Up to date
Up to date
[hadrian/cfg/system.cfg, settings, mk/config.h]
about to need ["hadrian/cfg/system.config.in","settings.in","mk/config.h.in"]
about to run ./configure
in doWith
builder ready
shakeArgsWith                     0.000s    0%                           
Function shake                    0.315s   80%  =========================
Database read                     0.000s    0%                           
With database                     0.000s    0%                           
Running rules                     0.077s   19%  ======                   
Pool finished (1 threads, 1 max)  0.000s    0%                           
Total                             0.393s  100%                           
Build system error - indirect recursion detected:
  Key value 1:  OracleQ (KeyValue ("hadrian/cfg/system.config","host-os"))
  Key value 2:  hadrian/cfg/system.config
  Key value 3:  hadrian/cfg/system.config settings mk/config.h
  Key value 4:  OracleQ (KeyValue ("hadrian/cfg/system.config","project-version"))
Rules may not be recursive

Now, if we carefully look at the error, we see that hadrian wants the config file for the host-os setting, so it then starts looking for a rule to produce it as it doesn't exist on a clean tree, it then finds the rule from Rules.Configure that produces it, along with the settings and mk/config.h files. Finally, that rule apparently ends up requiring the project-version setting from the config file.

The debugging output I added suggests that hadrian errors out when executing this line:

doWith :: (Builder b, ShakeValue c)
       => (b -> BuildInfo -> Action a)
       -> (Target c b -> Action ())
       -> [(Resource, Int)] -> [CmdOption] -> Target c b -> Args c b -> Action a
doWith f info rs opts target args = do
    putLoud "in doWith"
    needBuilder (builder target)
    putLoud "builder ready"
    argList <- interpret target args          -- /!\ HERE /!\ 
    putLoud $ "args: " ++ show argList
    trackArgsHash target
    putLoud "args hash tracked"
    info target
    putLoud "target info printed"
    verbose <- interpret target verboseCommand
    putLoud "target interpreted"
    let quietlyUnlessVerbose = if verbose then withVerbosity Loud else quietly
    quietlyUnlessVerbose $ f (builder target) $
        BuildInfo { buildArgs      = argList
                  , buildInputs    = inputs target
                  , buildOutputs   = outputs target
                  , buildOptions   = opts
                  , buildResources = rs }
snowleopard commented 6 years ago

@alpmestan Thanks for the PR! I'm running from one meeting to another today, will hopefully find more time to have a look at it later today :)

alpmestan commented 6 years ago

Alright, -c is fixed by https://github.com/snowleopard/hadrian/pull/571/commits/502e2c3184278b00dc9ff84953397df3d6c339fc. I switched all the .travis.yml scripts to using it again. I left the appveyor one alone as Windows currently requires that distro toolchain switch to ./configure and I don't think our configure rule currently has that logic. I'm leaving this out, to be fixed whenever we get rid of the need for the distro toolchain flag.

I also changed the call to boot to use --hadrian, which prevents the generation of all the .mk files used by ghc-cabal.

Now, let's see how the travis tests go.

snowleopard commented 6 years ago

@alpmestan Looks like we still have some cyclic rules failure on AppVeyor:

Error when running Shake build system:
* _build/stage1/lib/package.conf.d/integer-gmp-1.0.2.0.conf
* _build/stage1/libraries/integer-gmp/build/HSinteger-gmp-1.0.2.0.o
* _build/stage1/gmp/include/ghc-gmp.h
* _build/stage1/gmp/.libs/libgmp.a
Build system error - key matches multiple rules:
  Key type:       FileQ
  Key value:      _build/stage1/gmp/.libs/libgmp.a
  Rules matched:  2
Modify your rules/defaultRules so only one can produce the above key

Full build log: https://ci.appveyor.com/project/snowleopard/hadrian/build/1.0.1076

ndmitchell commented 6 years ago

This isn't a cyclic value error, it's overlapping rules - way easier to fix. Just find which patterns match and adjust one.

snowleopard commented 6 years ago

@ndmitchell Ah, indeed!

alpmestan commented 6 years ago

@snowleopard Alright, the selftest + the 3 builds that we do on travis all succeed! Modulo that weird error at the end (Lint checking error - 14 values have changed since being depended upon: ...).

@alpmestan Looks like we still have some cyclic rules failure on AppVeyor:

I'll look into this.

alpmestan commented 6 years ago

OK, trying something for the error you mentionned above. See last commit.

snowleopard commented 6 years ago

Alright, the selftest + the 3 builds that we do on travis all succeed!

@alpmestan Indeed! 🎉 After the fix in #574 this will hopefully succeed on AppVeyor as well :)

alpmestan commented 6 years ago

Woooohooooooo, all builds are green on travis. :tada: Let's see what appveyor says.

alpmestan commented 6 years ago

End of the build log on appveyor:

/--------------------------------------------------------------------------\
| Successfully built program 'haddock' (Stage1).                           |
| Executable: _build/stage1/bin/haddock.exe                                |
| Program synopsis: A documentation-generation tool for Haskell libraries. |
\--------------------------------------------------------------------------/
Writing report to -
* This database has tracked 1 runs.
* There are 19178 rules (19178 rebuilt in the last run).
* Building required 4255 traced commands (4255 in the last run).
* The total (unparallelised) time is 59m58s of which 53m47s is traced commands.
* The longest rule takes 1m37s (_build/stage0/compiler/build/HsInstances.o _build/stage0/compiler/build/HsInstances.hi), and the longest traced command takes 1m36s (ghc.exe).
* Last run gave an average parallelism of 1.64 times over 32m50s.
shakeArgsWith                       0.001s    0%                           
Function shake                      0.387s    0%                           
Database read                       0.001s    0%                           
With database                       0.000s    0%                           
Running rules                    1969.551s   99%  =========================
Pool finished (926 threads, 2 max)  0.002s    0%                           
Lint checking                       1.013s    0%                           
Profile report                      0.228s    0%                           
Total                            1971.182s  100%                           
Build completed in 32:52m
cd ..
_build/stage1/bin/ghc -e 1+2
'_build' is not recognized as an internal or external command,
operable program or batch file.
Command exited with code 1

Should we not cd there?

snowleopard commented 6 years ago

Woohoo!

Indeed, looks like we need to fix ApoVeyor script. Would you like to do that?

ndmitchell commented 6 years ago

Windows requires foo\bar for command lines to run for the first component. Shake cmd changes foo/bar bar/baz to foo\bar bar/baz inside, but if you are running on Windows directly you'll need to swap the slash on the executables.

alpmestan commented 6 years ago

Indeed, looks like we need to fix ApoVeyor script. Would you like to do that?

Sure. It's just weird that we have to remove cd .. all of a sudden, isn't it? Anyway, pushed a commit that tries just that.

alpmestan commented 6 years ago

Did the appveyor build get killed?

snowleopard commented 6 years ago

@alpmestan I've left a couple of minor comments, but overall this PR looks great, thank you! I'm looking forward to simplifying other build rules following the same approach :)

alpmestan commented 6 years ago

Hmm, the appveyor build fails with:

_build\stage1\bin\ghc -e 1+2
ghc: could not execute: gcc
Command exited with code 1
snowleopard commented 6 years ago

@alpmestan I guess this failure could be fixed after #564, so let's not bother about it for now.

alpmestan commented 6 years ago

@snowleopard I think I'm done addressing your feedback. Anything else you want me to do on this patch?

snowleopard commented 6 years ago

@alpmestan No, looks great -- merged. Thanks again :)