snowleopard / hadrian

Hadrian: a new build system for the Glasgow Haskell Compiler. Now merged into the GHC tree!
https://gitlab.haskell.org/ghc/ghc/tree/master/hadrian
MIT License
208 stars 39 forks source link

_build/stage0/bin/ghc.exe misses dependency #654

Closed ndmitchell closed 5 years ago

ndmitchell commented 5 years ago

To reproduce, modify Parser.y to change one of the productions to undefined. Observe that _build/stage0/bin/ghc.exe doesn't rebuild.

The reason is the dependency chain reads:

When I modify Parser.y the build trickles all the way up so libHSghc-8.7.a changes, but because the ABI doesn't change, ghc-8.7.conf is regenerated by doesn't change. Because ghc.exe only depends on ghc-8.7.conf (and lots of other stuff that isn't relevant) it doesn't rebuild. Solution is that ghc.exe should also depend on _build/stage0/compiler/build/libHSghc-8.7.a directly.

ndmitchell commented 5 years ago

(I expect it also happens with at least the stage1/stage2 GHC binaries, and probably lots of other things...)

angerman commented 5 years ago

ghc-8.7.conf is regenerated by doesn't change.

Doesn’t that change the modification time? Is that insufficient to trigger invalidation?

—-

What we need to do then is to actually read the conf file and depend on all the referenced libraries from within?

ndmitchell commented 5 years ago

@angerman Hadrian explicitly sets ChangeModtimeAndDigest in https://github.com/snowleopard/hadrian/blob/master/src/Main.hs which means it is only considered changed if the contents change.

angerman commented 5 years ago

@ndmitchell I see. Just to inline the code for everyone to see on github here we go: https://github.com/snowleopard/hadrian/blob/4265e3aab7df92722b81148cf8bf3954ebfc2d21/src/Main.hs#L34-L40

Do we know what the impact on hadrian is when we allow modification time to trigger rebuilds?

All that said, I'm not much a fan of adding explicit dependencies to all targets. If we would automate that by reading the conf files and computing (discovering) the dependencies like that on the fly, that does seem to me like the better solution here.

snowleopard commented 5 years ago

@angerman The improvements thanks to using digests instead of modtimes are significant, e.g. see the last section of the Hadrian paper.

Parsing conf sounds plausible but complicated. Adding explicit dependencies is a simple baseline solution that, I think, should be preferred. There aren't too many of such dependencies anyway, are there?

ndmitchell commented 5 years ago

If you don't work with digests you aren't going to work with cloud caching - so it's worth getting it right as is.

angerman commented 5 years ago

I’ll try to get around to this later today. Let’s do this right and parse the dependencies automatically.

Alternatively I could imagine adding additional fields to he config file that more closely trac the contents of the package.

snowleopard commented 5 years ago

@angerman Before you write a lot of new code, could you show an example of the kind of dependency information you'd like to parse? Why is it not available statically? Is it really worth turning it into dynamic?

Hadrian already got a lot more complexity compared to the Make build system, which makes it hard to understand and maintain. We need to be careful about adding more.

snowleopard commented 5 years ago

To clarify the static vs dynamic bit: it's much easier to understand/debug build rules containing

need [x, y, z]

compared to

need =<< parse x
ndmitchell commented 5 years ago

I was assuming there was a 1-1 mapping between foo.conf and libHSfoo.a - if so parsing seems overkill. Where is it more complex than that?

angerman commented 5 years ago

Neil, that’s basically what I’m trying to go for. I believe this might get more complicated once cabals multi lib support lands. I’m not sure how fast that would be used within ghc though.

Andrey, I prefer not to have to write any code at all. I do prefer generic rules over hardcoded values as they provide the least surprise when the packages change. I think hadrian excels at what it does if it completely faded into the background for anyone working on ghc.

snowleopard commented 5 years ago

I do prefer generic rules over hardcoded values as they provide the least surprise when the packages change

I'm not suggesting we hard-code dependencies for each package separately. What we can do now is to simply depend on pkgLibraryFile, which we already defined and use elsewhere in the codebase.

Later, if/when Hadrian needs to support multiple libraries per package, we'll just switch to pkgLibraryFiles. Sounds simple and generic enough to me. How could this go wrong?

snowleopard commented 5 years ago

I reproduced and will attempt to fix today.

snowleopard commented 5 years ago

@ndmitchell It turned out that we did have all necessary dependencies, but due to a recently introduced bug they were accidentally dropped. To prevent such regressions in future I added some tests.

657 solves the issue for me.