snowleopard / hadrian

Hadrian: a new build system for the Glasgow Haskell Compiler. Now merged into the GHC tree!
https://gitlab.haskell.org/ghc/ghc/tree/master/hadrian
MIT License

Use Cabal directly in place of ghc-cabal + make build root configurable #531

Closed alpmestan closed 6 years ago

alpmestan commented 6 years ago

This commit implements two significant changes (that were not easy to separate):

The code for this was mostly taken from #445. I have successfully built functional quick/prof/perf/devel2/quick-cross builds of GHC with this branch. I might commit another tweak or two as I'm now testing the test/docs rules. But I'm not expecting major changes there.

I'm also not clear on what we want to do with the install/wrapper rules, with relocatable GHC builds.

izgzhen commented 6 years ago

I found some remaining bits by grepping inplace, are they relevant now?

src/Base.hs:-- | Path to the inplace package database used in 'Stage1' and later.
src/Context.hs:-- | Path to inplace package configuration file of a given 'Context'.
src/Context.hs:    return $ path -/- "inplace-pkg-config"                           
src/GHC.hs:             return (top -/- "inplace/mingw/bin/strip.exe")              
src/Rules/PackageData.hs:    dir -/- "inplace-pkg-config" %> \conf -> do            
src/Rules/Program.hs:          -- Rules for the GHC package, which is built 'inplace'

izgzhen commented 6 years ago

It told me "Warning: No want/action statements, nothing to do" when I ran ./build.sh. Why?

alpmestan commented 6 years ago

It told me "Warning: No want/action statements, nothing to do" when I ran ./build.sh. Why?

It tweaks the toplevel rules and surely changes the default behaviour, you can try "stage2" as a target if you want. We'll probably want to change this a little.

izgzhen commented 6 years ago

It tweaks the toplevel rules and surely changes the default behaviour, you can try "stage2" as a target if you want. We'll probably want to change this a little.

Thanks. You can try updating the README to reflect the changes that will affect users as well :)

alpmestan commented 6 years ago

Thanks. You can try updating the README to reflect the changes that will affect users as well :)

Oh yes, I definitely plan to augment the README with a section about --build-root, and possibly more depending on what we decide for the toplevel rules. I would just like to fix at least the docs (done a few minutes ago) & test rules before that.

izgzhen commented 6 years ago

a weird directory appeared in ghc after building, what is that?

-> % tree path
path
└── to
    └── ghc-split

izgzhen commented 6 years ago

Regarding the docs target:

-> % hadrian/build.sh -j docs
Old pre cabal 2.1 version detected. Falling back to legacy 'cabal sandbox' mode.
Preprocessing executable 'hadrian' for hadrian-0.1.0.0..
Building executable 'hadrian' for hadrian-0.1.0.0..
Running hadrian...
| Run Sphinx Html: docs/users_guide => _build/docs/html/users_guide
| Run Sphinx Latex: utils/haddock/doc => /tmp/extra-dir-48015799677772
| Run Sphinx Latex: docs/users_guide => /tmp/extra-dir-48015799677773
| Run Tar Create: _build/docs/html/Haddock => _build/docs/archives/Haddock.html.tar.xz
Warning: libraries/text/text.cabal:4:1: The field "bug-reports" is specified
more than once at positions 4:1, 42:1
Warning: libraries/terminfo/terminfo.cabal:4:1: The field "category" is
specified more than once at positions 4:1, 10:1
| Run Xelatex: Haddock.tex => /tmp/extra-dir-48015799677772
This is XeTeX, Version 3.14159265-2.6-0.99992 (TeX Live 2015/Debian) (preloaded format=xelatex)
 restricted \write18 enabled.
entering extended mode
(./Haddock.tex
LaTeX2e <2016/02/01>
Babel <3.9q> and hyphenation patterns for 3 language(s) loaded.
(./sphinxmanual.cls
Document Class: sphinxmanual 2009/06/02 Document class (Sphinx manual)
(/usr/share/texlive/texmf-dist/tex/latex/base/report.cls
Document Class: report 2014/09/29 v1.4h Standard LaTeX document class
(/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo)))
(/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty

Package inputenc Warning: inputenc package ignored with utf8 based engines.

)
! Undefined control sequence.
<recently read> \DeclareUnicodeCharacter 

l.5 \DeclareUnicodeCharacter
                            {00A0}{\nobreakspace}
No pages of output.
Transcript written on Haddock.log.
shakeArgsWith   0.000s    0%                           
Function shake  8.277s   77%  =========================
Database read   0.393s    3%  =                        
With database   0.027s    0%                           
Running rules   2.041s   19%  ======                   
Total          10.737s  100%                           
Error when running Shake build system:
* docs
* _build/docs/pdfs/Haddock.pdf
user error (Development.Shake.cmd, system command failed
Command: /usr/bin/xelatex -halt-on-error Haddock.tex
Current directory: /tmp/extra-dir-48015799677772
Exit code: 1
Stderr:
)

alpmestan commented 6 years ago

Regarding the docs target: [...]

Does the docs rule work with the tip of master? I don't think I've changed anything major for xelatex but I'll double-check.

snowleopard commented 6 years ago

Does the docs rule work with the tip of master?

@alpmestan Unfortunately there is no way to check since Hadrian is currently broken. However, it used to work just fine on my machine around a month ago.

izgzhen commented 6 years ago

@alpmestan update on docs: the previous error is probably caused by my outdated sphinx-build. It is fixed with a newer version. But I bumped into another error:

 Run Ghc CompileCWithGhc Stage0: utils/unlit/unlit.c => _build/stage0/utils/unlit/build/c/unlit.o
shakeArgsWith   0.000s    0%                           
Function shake  8.225s   47%  =======================  
Database read   0.360s    2%  =                        
With database   0.026s    0%                           
Running rules   8.790s   50%  =========================
Total          17.401s  100%                           
Error when running Shake build system:
* docs
* _build/docs/archives/libraries.html.tar.xz
* _build/docs/html/libraries/index.html
* _build/docs/html/libraries/ghc-prim/ghc-prim.haddock
* _build/docs/html/libraries/ghc-prim/haddock-prologue.txt
* OracleQ (ConfiguredCabalFile (Context {stage = Stage1, package = Package {pkgLanguage = Haskell, pkgType = Library, pkgName = "ghc-prim", pkgPath = "libraries/ghc-prim"}, way = v}))
* _build/stage1/libraries/ghc-prim/setup-config
* _build/stage1/lib/package.conf.d/rts-1.0.conf
* OracleQ (ConfiguredCabalFile (Context {stage = Stage1, package = Package {pkgLanguage = C, pkgType = Library, pkgName = "rts", pkgPath = "rts"}, way = v}))
* _build/stage1/rts/setup-config
* _build/stage0/bin/ghc
* OracleQ (ConfiguredCabalFile (Context {stage = Stage0, package = Package {pkgLanguage = Haskell, pkgType = Program, pkgName = "ghc-bin", pkgPath = "ghc"}, way = v}))
* _build/stage0/ghc/setup-config
* _build/stage0/lib/package.conf.d/ghc-8.5.conf
* OracleQ (ConfiguredCabalFile (Context {stage = Stage0, package = Package {pkgLanguage = Haskell, pkgType = Library, pkgName = "ghc", pkgPath = "compiler"}, way = v}))
* _build/stage0/compiler/setup-config
* _build/stage0/lib/package.conf.d/binary-0.8.5.1.conf
* _build/stage0/lib/package.conf.d/base-4.10.1.0.conf
* _build/stage0/lib/package.conf.d/rts.conf
user error (Development.Shake.cmd, system command failed
Command: /usr/local/bin/ghc-pkg --global-package-db _build/stage0/lib/package.conf.d register -v0 -
Exit code: 1
Stderr:
rts-1.0: package(s) with this id already exist: rts
rts-1.0: package rts-1.0 is already installed
rts-1.0: Package names may be treated case-insensitively in the future.
Package rts-1.0 overlaps with: rts-1.0 (use --force to override)
)

angerman commented 6 years ago

@izgzhen nuke _build and try again. There's probably a check missing to not re-register the rts in the bootstrap database.

izgzhen commented 6 years ago

I deleted the _build and restarted. Looks like docs builds well :)

angerman commented 6 years ago

We could also probably just --force register the packages. We'd be hosed in any case if someone switched the bootstrap compiler between runs.

Not even trying to register the package in the bootstrap db if it's already present might be the cleaner choice.
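The second option (skip registration when the package is already there) can be sketched as a pure decision function. This is only an illustration, not Hadrian's code: `PkgId`, `RegisterAction` and `planRegistration` are hypothetical names, and the list of already registered ids is assumed to have been collected elsewhere.

```haskell
-- | A package identifier such as "rts-1.0" (hypothetical alias, for illustration).
type PkgId = String

-- | What to do with a package we want to have in the bootstrap database.
data RegisterAction
  = Register PkgId        -- not present yet: register it
  | AlreadyPresent PkgId  -- present: leave the database untouched
  deriving (Eq, Show)

-- | Decide, for each wanted package, whether registration is needed,
-- given the ids that are already registered.
planRegistration :: [PkgId] -> [PkgId] -> [RegisterAction]
planRegistration registered wanted =
  [ if pkg `elem` registered then AlreadyPresent pkg else Register pkg
  | pkg <- wanted
  ]
```

With `rts-1.0` already registered, `planRegistration ["rts-1.0"] ["rts-1.0"]` yields `[AlreadyPresent "rts-1.0"]`, so the failing `ghc-pkg register` call from the log would never be issued on a second run.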

angerman commented 6 years ago

Regarding stages

In general there are only two sets of files we generate: those compiled by the bootstrap compiler, and those compiled by the stage1 compiler.

For proper stage separation, we'd want to keep the artifacts built with the same compiler in the same stage.

Stage 0 (bootstrap)

This stage contains the packages and utilities built by the bootstrap compiler that are necessary to build the final compiler. Where we do not build a package ourselves, we rely on the one shipped with the bootstrap compiler. That is: bootpackages + packages from the bootstrap compiler make up the package set that is used to build the next stage compiler.

Now where do we place those? They are built artifacts of the bootstrap compiler and end up in stage0/bin and stage0/lib. For packages that are not in the bootpackages, we clone (copy & register) the packages from the bootstrap compiler.

Stage 1 (the final stage)

This stage contains packages and utilities built by the compiler from the previous stage.


Now we could just bump the stage by (+1). However, it also means that it doesn't make sense to build stage2/bin/haddock, and it should be stage1/bin/haddock.
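A minimal sketch of the numbering described above, where a stage is simply "everything produced by one compiler". The types and function names (`Stage`, `Compiler`, `compilerFor`, `programPath`) are made up for this illustration and are not taken from Hadrian's source.

```haskell
-- | Stage numbering as proposed above: a stage groups everything that was
-- produced by one particular compiler.
data Stage = Stage0 | Stage1
  deriving (Eq, Show)

-- | Which compiler produces the artifacts of a stage.
data Compiler
  = BootstrapGhc      -- the GHC already installed on the machine
  | GhcBuiltIn Stage  -- the GHC executable produced in the given stage
  deriving (Eq, Show)

-- | Stage 0 is built by the bootstrap compiler, stage 1 by the compiler
-- that stage 0 produced.
compilerFor :: Stage -> Compiler
compilerFor Stage0 = BootstrapGhc
compilerFor Stage1 = GhcBuiltIn Stage0

-- | Directory of a stage, relative to the build root.
stageDir :: Stage -> FilePath
stageDir Stage0 = "stage0"
stageDir Stage1 = "stage1"

-- | Where an executable built in a given stage ends up.
programPath :: Stage -> String -> FilePath
programPath stage name = stageDir stage ++ "/bin/" ++ name
```

Under this model `programPath Stage1 "haddock"` is `"stage1/bin/haddock"`: Haddock is compiled by the GHC sitting in stage0/bin, so it belongs to stage 1 and there is no stage 2 directory at all.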


I also trust @alpmestan to have tried to minimize the diff and extract only the necessary bits from #445. I also believe this has already taken a lot more time than initially anticipated.

snowleopard commented 6 years ago

@angerman Thank you! This looks good to me, but I don't understand why Hadrian is different from Make in terms of which GHC is used to build Haddock: Make uses ghc-stage2, i.e. the final GHC we build, whereas you propose for Hadrian to use ghc-stage1, i.e. the intermediate version of GHC. My understanding was that ghc-stage1 just didn't have enough features (or latest libraries) to build Haddock.

Could you clarify?

alpmestan commented 6 years ago

FWIW, this is what I get when running the "test" target with this branch:

SUMMARY for test run started at Wed Mar 21 07:23:03 2018 CET
 0:11:02 spent to go through
    6277 total tests, which gave rise to
   15457 test cases, of which
    9349 were skipped

      39 had missing libraries
    5059 expected passes
     170 expected failures

      44 caused framework failures
     108 caused framework warnings
      28 unexpected passes
     809 unexpected failures
       0 unexpected stat failures

Does anyone have a recent test run from a master build to compare this to? If not I'll spin one up myself later today.

snowleopard commented 6 years ago

@alpmestan I suggest we ignore the test results for now, let's focus on making CI bots happy. At the moment CI fails with:

Running hadrian...
dieVerbatim: user error (hadrian: Error Parsing: file "compiler/ghc.cabal" doesn't exist. Cannot
continue.

See https://travis-ci.org/snowleopard/hadrian/jobs/356207814.

izgzhen commented 6 years ago

another error from CI:

checking for ghc... /opt/ghc/8.0.2/bin/ghc
checking version of ghc... 8.0.2
configure: error: GHC version 8.2 or later is required to compile GHC.
The command "./boot --hadrian && ./configure" exited with 1.

Consider removing the first os:linux entry, where GHC is too old.

izgzhen commented 6 years ago

Installing library in /home/travis/build/snowleopard/hadrian/ghc/_build/stage1/lib/../lib/x86_64-linux-ghc-8.5.20180321/rts-1.0
/home/travis/build/snowleopard/hadrian/ghc/_build/stage1/rts/build/libHSrts-1.0_debug.a: copyFile: does not exist (No such file or directory)
shakeArgsWith     0.000s    0%                           
Function shake   13.346s    1%                           
Database read     0.000s    0%                           
With database     0.000s    0%                           
Running rules  1210.730s   98%  =========================
Total          1224.076s  100%                           
Error when running Shake build system:
* stage2
* _build/stage1/lib/package.conf.d/rts-1.0.conf
ExitFailure 1

alpmestan commented 6 years ago

Yeah, don't worry I've been looking at those and I'm actively trying different things to get a green CI (disabled dynamic ways for now, updated ghc/cabal versions because GHC HEAD doesn't support 8.0.x anymore as a boot compiler, etc).

snowleopard commented 6 years ago

@alpmestan Thanks for your efforts! Hope you don't mind that I'm waiting until we have green CI before I start to properly review the PR :)

alpmestan commented 6 years ago

Alright, so I changed the versions of ghc/cabal-install to shift everything by one release, as ghc HEAD won't accept 8.0.x as a boot compiler anymore.

CI is green for GHC 8.2 & cabal-install 2.0 (which uses sandboxes to build hadrian). I had to disable the dynamic-enabled ways in Settings.Default for the build to complete (otherwise we get the error from this comment). @angerman and I are looking into this.

For GHC 8.4 & cabal-install 2.2, we're blocked on having a release of alex that ships with this commit. I asked @simonmar about this, hopefully we will get a new release soon. In the meantime this particular combination will just keep failing -- should we comment it out for now?

Not sure what's up with the OS X CI though, this indicates we should probably update the way we install python3 when we put together the build environment, but I know very very little about brew.

snowleopard commented 6 years ago

@alpmestan I think we need at least one green CI with a full build, where we build _build/stage1/bin/ghc and can run it with -e 1+2. I don't mind which OS/bootstrapping compiler is used for it.

alpmestan commented 6 years ago

@snowleopard We have this now, with GHC 8.2.2/cabal-install 2.0, which as I said above required disabling dynamic ways (I'm investigating). See here -- the first entry is green. I can't get the same result on appveyor though, there is a weird python-related error there that escapes my understanding at the moment.

snowleopard commented 6 years ago

@alpmestan This is not a complete build: you only build _build/stage0/bin/ghc, so all Stage1 libraries are skipped, as well as Stage2 GHC.

P.S.: Don't worry about AppVeyor or CircleCI yet.

alpmestan commented 6 years ago

Haaa, sorry, I was a bit slow. Just pushed a commit to build a complete stage2 compiler with this setup.

snowleopard commented 6 years ago

@alpmestan Looks like it's now failing with:

/home/travis/build/snowleopard/hadrian/ghc/_build/stage1/rts/build/libHSrts-1.0_debug.a: copyFile: does not exist (No such file or directory)

izgzhen commented 6 years ago

The "Installing library in" message is probably printed from libraries/Cabal/Cabal/Distribution/Simple/Install.hs. And I could see that the ways in the quickest flavour are not consistent with what that command would ask for (the debug flavour is not in quickest, but is required).

angerman commented 6 years ago

@izgzhen good find! Indeed the rts.cabal file specifies that the following flavours are supposed to be built:

extra-library-flavours: _debug _l _thr _thr_debug _thr_l

and in profiling mode the following as well:

    if flag(profiling)
      extra-library-flavours: _p _thr_p

Here's the suffix explanation from the mk/config.mk file:

# In addition, the RTS is built in some further variations.  Ways that
# make sense here:
#
#   thr           : threaded
#   thr_p         : threaded + profiled + eventlog
#   debug         : debugging + eventlog
#   thr_debug     : debugging + threaded, + eventlog
#   l             : eventlog
#   p             : profiled + eventlog
#   thr_l         : threaded + eventlog

Indeed, setting BuildFlavour = quickest yields:

$ make show! VALUE=rts_WAYS
rts_WAYS="v  dyn l debug thr thr_debug thr_l debug_dyn thr_dyn thr_debug_dyn l_dyn thr_l_dyn"

So we probably have two issues at hand:

I do however see the benefit of just building the vanilla library. So maybe we want to modify rts.cabal to be more fine-grained and, say, allow the following flags: threaded eventlog dynamic profiling

where threaded and eventlog are on by default, and dynamic is controlled by the dynamic flag, similar to what profiling is right now. You could then configure the rts with -threaded -eventlog to get just the vanilla way?
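The mismatch behind the copyFile failure can be made concrete with a small check. This is a sketch under stated assumptions: the flavour lists are copied from the rts.cabal excerpt quoted above, the archive name pattern is taken from the failing path in the CI log, and the function names are hypothetical. What a given build flavour actually builds has to be supplied by the caller.

```haskell
import Data.List ((\\))

-- | Flavours listed under extra-library-flavours in the rts.cabal excerpt
-- quoted above, written without the leading underscore. The second group
-- is only requested when the profiling flag is on.
requiredFlavours :: Bool -> [String]
requiredFlavours profiling =
  ["debug", "l", "thr", "thr_debug", "thr_l"]
    ++ (if profiling then ["p", "thr_p"] else [])

-- | Archive name looked for at install time, e.g. libHSrts-1.0_debug.a
-- (pattern taken from the error message in the CI log).
archiveName :: String -> FilePath
archiveName flavour = "libHSrts-1.0_" ++ flavour ++ ".a"

-- | Flavours that the install step will look for but that were never built.
missingFlavours :: Bool -> [String] -> [String]
missingFlavours profiling built = requiredFlavours profiling \\ built
```

For a build that only produced the threaded RTS, `map archiveName (missingFlavours False ["thr"])` starts with `"libHSrts-1.0_debug.a"`, which is exactly the file the install step could not find.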

angerman commented 6 years ago

This (the last comment) is probably also why dynamic doesn't work.

angerman commented 6 years ago

Lint checking error - value has changed since being depended upon:
  Key:  /usr/bin/gcc
  Old:  Just File {mod=0x6FF74940,size=0xBD6D0,digest=0x97CD0DD7} recomputed
  New:  File {mod=0xBA440010,size=0xBD6D0,digest=0x22E1F908}

Has anyone seen this?

angerman commented 6 years ago

Apart from that, it built linux and mac to the end.

snowleopard commented 6 years ago

Right, as far as I can see we have the following on Travis now:

angerman commented 6 years ago

No. The separate selftest is no omission. selftest+build exceeded the allowed time on Travis. The selftest alone took ~10min. As such I split the selftest+build into selftest and build.

The Linux builds fail due to that detected gcc file change as mentioned above. They do however complete the build prior to bailing out.

Adding the 1+2 test is certainly within reach.

snowleopard commented 6 years ago

@angerman Ah, I see, thanks for clarifying!

The lint error makes no sense to me:

Lint checking error - value has changed since being depended upon:
  Key:  /usr/bin/gcc
  Old:  Just File {mod=0x6FF74940,size=0xBD6D0,digest=0x97CD0DD7} recomputed
  New:  File {mod=0xBA440010,size=0xBD6D0,digest=0x22E1F908}

This essentially says that /usr/bin/gcc has changed during the build.

But how? We are not writing to /usr/bin, or are we?
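To see which parts of the record differ, the two values from the lint message can be compared field by field. The record below is a stand-in written for this comparison, not Shake's own type, and the values are copied verbatim from the message; what exactly Shake stores under mod and digest is not assumed here.

```haskell
-- | Stand-in for the three fields shown in the lint message
-- (hypothetical type, not Shake's).
data FileInfo = FileInfo
  { modField    :: Integer
  , sizeField   :: Integer
  , digestField :: Integer
  } deriving (Eq, Show)

-- | The "Old" value reported for /usr/bin/gcc.
oldGcc :: FileInfo
oldGcc = FileInfo 0x6FF74940 0xBD6D0 0x97CD0DD7

-- | The "New" value reported for /usr/bin/gcc.
newGcc :: FileInfo
newGcc = FileInfo 0xBA440010 0xBD6D0 0x22E1F908

-- | Names of the fields whose values differ between two records.
changedFields :: FileInfo -> FileInfo -> [String]
changedFields a b =
  [ name | (name, get) <- fields, get a /= get b ]
  where
    fields =
      [ ("mod", modField)
      , ("size", sizeField)
      , ("digest", digestField)
      ]
```

`changedFields oldGcc newGcc` gives `["mod","digest"]`: the size is identical in both records, while the mod and digest values differ.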

alpmestan commented 6 years ago

Mac OS: we run the complete build (hurrah!), but we do not test the resulting executable via -e 1+2.

Fixed in latest commit.

And I have no idea of what's going on with this /usr/bin/gcc change. None whatsoever.

alpmestan commented 6 years ago

The OS X build failed with:

The job exceeded the maximum time limit for jobs, and has been terminated.

snowleopard commented 6 years ago

@alpmestan OK, I think I'm done with a first review round. Fantastic work, many thanks!

Hope my comments will not take too much effort to address.

snowleopard commented 6 years ago

I've spotted a couple of changes which seem to be unrelated to the main purpose of this PR. I think it would be best to remove them from the PR and discuss/implement separately.

The above changes appear to only add unnecessary noise to the review and also obscure this PR.

Let's focus on switching to the Cabal library from ghc-cabal and making build root configurable.

izgzhen commented 6 years ago

A few more notes on @snowleopard's final comment:

Modification of the quickest flavour.

This is related to https://github.com/snowleopard/hadrian/pull/531#issuecomment-375530927. And the latest comment is https://github.com/snowleopard/hadrian/pull/531#discussion_r176635302.

Commenting out installation rules.

If we plan to use a relocatable build, which could simplify away 80% of the current install rules, then my suggestion would be to delete this rule entirely instead of commenting it out. We should avoid commented-out code.

snowleopard commented 6 years ago

@izgzhen Thanks for the pointer. I agree with changing the quickest flavour.

I also agree that it's better to avoid committing commented-out code. Better to create an issue linking to this PR and explaining what needs to be done.

alpmestan commented 6 years ago

Alright, I addressed a whoooooole bunch of feedback. But I'm a bit stuck with the Settings.Packages refactoring, see https://github.com/snowleopard/hadrian/pull/531#discussion_r177827946.

snowleopard commented 6 years ago

@alpmestan Huge thanks for implementing all the changes! I hope you are not completely exhausted by this PR yet -- we are getting really close :)

alpmestan commented 6 years ago

@snowleopard Well, as you can see I'm taking the liberty to call for some help from you or Moritz when appropriate, so it's not all just me :-) And yes, we're getting there!

alpmestan commented 6 years ago

I just pushed a commit that implements your suggestion for fixing -c, and I tweaked the script that boots with 8.2.2 to not ./boot && ./configure but use -c instead. Let's see how that goes :)

snowleopard commented 6 years ago

@alpmestan Looks like it worked -- we've got a successful build!

alpmestan commented 6 years ago

Looks like all builds did the explicit ./boot && ./configure though, despite me changing one of them to use -c instead? Maybe it's just late and my eyes are not working properly...

snowleopard commented 6 years ago

@alpmestan Oh, good catch, I didn't spot this. Let's retry with all ./boot --hadrian && ./configure dropped?

alpmestan commented 6 years ago

Pushed.

alpmestan commented 6 years ago

@snowleopard Looks like that didn't do the trick: https://travis-ci.org/snowleopard/hadrian/jobs/359630388#L1193