Closed alpmestan closed 6 years ago
Sounds good @alpmestan, thank you!
Hmm... Circle CI fails with:
Command: _build/stage0/bin/ghc -Wall -hisuf hi -osuf o -hcsuf hc -static -hide-all-packages -no-user-package-db '-package-db _build/stage1/lib/package.conf.d' '-this-unit-id ghc-boot-th-8.7' '-package-id base-4.12.0.0' -i -i_build/stage1/libraries/ghc-boot-th/build -i_build/stage1/libraries/ghc-boot-th/build/autogen -ilibraries/ghc-boot-th/. -Iincludes -I_build/generated -I_build/stage1/libraries/ghc-boot-th/build -I/Users/distiller/hadrian/ghc/_build/stage1/lib/x86_64-osx-ghc-8.7.20181009/base-4.12.0.0/include -I/Users/distiller/hadrian/ghc/_build/stage1/lib/x86_64-osx-ghc-8.7.20181009/rts-1.0/include -I_build/generated -optc-I_build/generated -optP-include -optP_build/stage1/libraries/ghc-boot-th/build/autogen/cabal_macros.h -optc-fno-stack-protector -odir _build/stage1/libraries/ghc-boot-th/build -hidir _build/stage1/libraries/ghc-boot-th/build -stubdir _build/stage1/libraries/ghc-boot-th/build -Wnoncanonical-monad-instances -optc-Wno-unknown-pragmas -c libraries/ghc-boot-th/GHC/ForeignSrcLang/Type.hs -o _build/stage1/libraries/ghc-boot-th/build/GHC/ForeignSrcLang/Type.o -O0 -H64m -XHaskell2010 -XNoImplicitPrelude -ghcversion-file=/Users/distiller/hadrian/ghc/_build/generated/ghcversion.h -Wno-deprecated-flags
Exit code: 1
Stderr:
libraries/ghc-boot-th/GHC/ForeignSrcLang/Type.hs:1:1: error:
Bad interface file: /usr/local/Cellar/ghc/8.4.3/lib/ghc-8.4.3/integer-gmp-1.0.2.0/GHC/Integer/Type.hi
mismatched interface file versions (wanted "80720181009", got "8043")
|
1 | {-# LANGUAGE DeriveGeneric #-}
| ^
)
So does AppVeyor which shouldn't have been affected by this patch, looks like the stage0 compiler that stack is installing is broken?
Well, Circle CI uses the cabal script I think? And installs ghc from homebrew.
Usually when this happens there's some kind of ABI mismatch. I assume the builds aren't cached? The linux build doesn't seem to suffer the same issue. why would also AppVeyor be affected? this only changes the CircleCI script no?
Yes, and I don't know why appveyor is affected.
When I try to reproduce the problem I get... other failures. It's just one of those days/weeks...
Edit: now I cannot reproduce my own errors anymore.
I think I know where the circle ci failure comes from: --integer-simple
... I'll remove it temporarily to see how things go. I wonder if https://github.com/ghc/ghc/commit/fc2ff6dd7496a33bf68165b28f37f40b7d647418 has anything to do with this.
Haha! New failure:
| Run Ghc CompileHs Stage1: libraries/ghci/GHCi/Message.hs => _build/stage1/libraries/ghci/build/GHCi/Message.o
| Run Ghc LinkHs Stage1: _build/stage1/utils/hpc/build/Main.o (and 11 more) => _build/stage1/bin/hpc
ld: library not found for -lgmp
clang: error: linker command failed with exit code 1 (use -v to see invocation)
`gcc' failed in phase `Linker'. (Exit code: 1)
shakeArgsWith 0.000s 0%
Function shake 0.432s 0%
Database read 0.002s 0%
With database 0.000s 0%
Running rules 932.053s 99% =========================
Total 932.487s 100%
Error when running Shake build system:
* _build/stage1/bin/hpc
user error (Development.Shake.cmd, system command failed
Command: _build/stage0/bin/ghc -Wall -hisuf hi -osuf o -hcsuf hc -static -hide-all-packages -no-user-package-db '-package-db _build/stage1/lib/package.conf.d' '-package-id array-0.5.2.0' '-package-id base-4.12.0.0' '-package-id containers-0.6.0.1' '-package-id directory-1.3.3.0' '-package-id filepath-1.4.2.1' '-package-id hpc-0.6.0.3' -i -i_build/stage1/utils/hpc/build -i_build/stage1/utils/hpc/build/hpc/autogen -iutils/hpc/. -Iincludes -I_build/generated -I_build/stage1/utils/hpc/build -I/Users/distiller/hadrian/ghc/_build/stage1/lib/x86_64-osx-ghc-8.7.20181015/unix-2.7.2.2/include -I/Users/distiller/hadrian/ghc/_build/stage1/lib/x86_64-osx-ghc-8.7.20181015/time-1.8.0.2/include -I/Users/distiller/hadrian/ghc/_build/stage1/lib/x86_64-osx-ghc-8.7.20181015/bytestring-0.10.8.2/include -I/Users/distiller/hadrian/ghc/_build/stage1/lib/x86_64-osx-ghc-8.7.20181015/base-4.12.0.0/include -I/Users/distiller/hadrian/ghc/_build/stage1/lib/x86_64-osx-ghc-8.7.20181015/integer-gmp-1.0.2.0/include -I/Users/distiller/hadrian/ghc/_build/stage1/lib/x86_64-osx-ghc-8.7.20181015/rts-1.0/include -I_build/generated -optc-I_build/generated -optP-include -optP_build/stage1/utils/hpc/build/hpc/autogen/cabal_macros.h -optc-fno-stack-protector -odir _build/stage1/utils/hpc/build -hidir _build/stage1/utils/hpc/build -stubdir _build/stage1/utils/hpc/build -no-auto-link-packages -rtsopts -optl-lgmp -Wnoncanonical-monad-instances -optc-Wno-unknown-pragmas _build/stage1/utils/hpc/build/Main.o _build/stage1/utils/hpc/build/HpcParser.o _build/stage1/utils/hpc/build/HpcCombine.o _build/stage1/utils/hpc/build/HpcDraft.o _build/stage1/utils/hpc/build/HpcFlags.o _build/stage1/utils/hpc/build/HpcLexer.o _build/stage1/utils/hpc/build/HpcMarkup.o _build/stage1/utils/hpc/build/HpcOverlay.o _build/stage1/utils/hpc/build/HpcReport.o _build/stage1/utils/hpc/build/HpcShowTix.o _build/stage1/utils/hpc/build/HpcUtils.o _build/stage1/utils/hpc/build/Paths_hpc_bin.o -o _build/stage1/bin/hpc -O0 -H64m -XHaskell2010 -ghcversion-file=/Users/distiller/hadrian/ghc/_build/generated/ghcversion.h
Exit code: 1
Stderr:
ld: library not found for -lgmp
clang: error: linker command failed with exit code 1 (use -v to see invocation)
`gcc' failed in phase `Linker'. (Exit code: 1)
)
@snowleopard Could we arrange for libgmp to be available? At least to make CI work, then I can investigate the --integer-simple
problem.
@alpmestan Thank you for investigating this further! I'm back to Newcastle and hope to make some progress on Hadrian this week.
Could we arrange for
libgmp
to be available?
But Hadrian is supposed to build it if it's not available. So, it looks like the build is currently broken.
Note that we use --integer-simple
on some CI instances to make sure this mode is not broken.
If I take the tip of ghc's master branch of today, revert https://github.com/ghc/ghc/commit/fc2ff6dd7496a33bf68165b28f37f40b7d647418 and build with --integer-simple
, it works.
So... maybe
src/Settings/Builders/Ghc.hs:116: , libraryPackage ? arg ("-this-unit-id " ++ pkgId)
should be updated to special-case integer-gmp/integer-simple? But shouldn't the ghc-options
changes from both cabal files in the GHC commit I linked to take care of that?
But shouldn't the
ghc-options
changes from both cabal files in the GHC commit I linked to take care of that?
Yes, that would be my expectation. Since there are no changes to the Make build system, I suppose there shouldn't be any changes in Hadrian either... Oh wait, there is one change in compiler/ghc.mk
. Do you think it's relevant?
I have no idea. My nightmare of a week (last week, where I investigated why Cabal won't install my additional dynamic RTS flavours, to no avail) managed to convince me I don't know anything, it's all just an illusion and some occasional luck. =)
More seriously, the change in question removes 2 lines from the Config.hs
file generated by the make build system. Searching for what got removed, cIntegerLibrary
, in that commit's diff, doesn't yield anything. I suspect this is not the problem (might be wrong though, cf last week).
The commit message says:
This patch adds a new
DynFlag
that configures whichinteger
library to use. This flag is initialized bycIntegerLibraryType
(as before), and is only used inCorePrep
to decide whether to useS#
or not.The other code paths that were varying based on
cIntegerLibraryType
are no now longer varying: The trick is to useinteger-wired-in
as the-this-unit-id
when compiling eitherinteger-gmp
orinteger-simple
.
And given the ghc-options
we mentionned earlier, which should arrange for hadrian to add the right -this-unit-id
to the command line, I think we're good on that front. The only reason for this to still fail that I see is that the Settings.Builders.Ghc
snipped I mentionned above also adds a -this-unit-id
, but a different one, and I suppose GHC just picks it up instead of integer-wired-in
.
The only reason for this to still fail that I see is that the
Settings.Builders.Ghc
snipped I mentionned above also adds a-this-unit-id
, but a different one, and I suppose GHC just picks it up instead ofinteger-wired-in
.
@alpmestan I think you may be right. Could you try adding a special case to use integer-wired-in
instead of pkgId
?
I checked and indeed GHC receives two -this-unit-id
args when building integer-simple modules and registering the package. One with just the typical package id, and the one from the ghc-options
field of the .cabal file. So I made this patch:
diff --git a/src/Settings/Builders/Ghc.hs b/src/Settings/Builders/Ghc.hs
index 8212b5f..082e6c8 100644
--- a/src/Settings/Builders/Ghc.hs
+++ b/src/Settings/Builders/Ghc.hs
@@ -113,7 +113,12 @@ packageGhcArgs = do
mconcat [ arg "-hide-all-packages"
, arg "-no-user-package-db"
, packageDatabaseArgs
- , libraryPackage ? arg ("-this-unit-id " ++ pkgId)
+ , libraryPackage ?
+ -- those two packages specify 'integer-wired-in' as the unit id
+ -- in their .cabal files (ghc-options field), so we need to omit
+ -- this argument just for them.
+ not (package `elem` [integerSimple, integerGmp]) ?
+ arg ("-this-unit-id " ++ pkgId)
, map ("-package-id " ++) <$> getContextData depIds ]
includeGhcArgs :: Args
but it apparently isn't enough, I've rebuilt everything from scratch but still see this error:
$ hadrian/build.sh -j4 --flavour=quickest -o_quickest --integer-simple
[...]
Error when running Shake build system:
* _quickest3/stage1/lib/package.conf.d/ghc-boot-th-8.7.conf
* _quickest3/stage1/libraries/ghc-boot-th/build/libHSghc-boot-th-8.7.a
* _quickest3/stage1/libraries/ghc-boot-th/build/GHC/Lexeme.o
* _quickest3/stage1/libraries/ghc-boot-th/build/GHC/Lexeme.o _quickest3/stage1/libraries/ghc-boot-th/build/GHC/Lexeme.hi
user error (Development.Shake.cmd, system command failed
Command: _quickest3/stage0/bin/ghc -Wall -hisuf hi -osuf o -hcsuf hc -static -hide-all-packages -no-user-package-db '-package-db _quickest3/stage1/lib/package.conf.d' '-this-unit-id ghc-boot-th-8.7' '-package-id base-4.12.0.0' -i -i_quickest3/stage1/libraries/ghc-boot-th/build -i_quickest3/stage1/libraries/ghc-boot-th/build/autogen -ilibraries/ghc-boot-th/. -Iincludes -I_quickest3/generated -I_quickest3/stage1/libraries/ghc-boot-th/build -I/nix/store/0d2ajw9gi57cc1iv2hzfi07d54zfrfq8-ghc-build-environment/include -I/home/alp/WT/ghc-diffs/_quickest3/stage1/lib/x86_64-linux-ghc-8.7.20181015/base-4.12.0.0/include -I/home/alp/WT/ghc-diffs/_quickest3/stage1/lib/x86_64-linux-ghc-8.7.20181015/rts-1.0/include -I_quickest3/generated -optc-I_quickest3/generated -optP-include -optP_quickest3/stage1/libraries/ghc-boot-th/build/autogen/cabal_macros.h -optc-fno-stack-protector -odir _quickest3/stage1/libraries/ghc-boot-th/build -hidir _quickest3/stage1/libraries/ghc-boot-th/build -stubdir _quickest3/stage1/libraries/ghc-boot-th/build -Wnoncanonical-monad-instances -optc-Werror=unused-but-set-variable -optc-Wno-error=inline -c libraries/ghc-boot-th/GHC/Lexeme.hs -o _quickest3/stage1/libraries/ghc-boot-th/build/GHC/Lexeme.o -O0 -H64m -XHaskell2010 -XNoImplicitPrelude -ghcversion-file=/home/alp/WT/ghc-diffs/_quickest3/generated/ghcversion.h -Wno-deprecated-flags
Exit code: 1
Stderr:
ghc: panic! (the 'impossible' happened)
(GHC version 8.7.20181015 for x86_64-unknown-linux):
lookupGlobal
Bad interface file: /nix/store/g5ggsbqkwqiiq3qkpw8ji42axrna117y-ghc-8.4.3/lib/ghc-8.4.3/integer-gmp-1.0.2.0/GHC/Integer/Type.hi
mismatched interface file versions (wanted "80720181015", got "8043")
Call stack:
CallStack (from HasCallStack):
callStackDoc, called at compiler/utils/Outputable.hs:1160:37 in ghc:Outputable
pprPanic, called at compiler/typecheck/TcEnv.hs:132:32 in ghc:TcEnv
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
)
mismatched interface file versions (wanted "80720181015", got "8043")
Looks like we somehow manage to mix up interface files built in different stages -- I have no idea how...
Well, keep in mind that reverting https://github.com/ghc/ghc/commit/fc2ff6dd7496a33bf68165b28f37f40b7d647418 is sufficient for making everything work again. So we're probably not too far from a fix, I'm just failing to see what else we should change right now.
I figured it out. Will create an issue explaining the problem and the potential solutions.
Ping.
@alpmestan I lost track of this -- have you submitted the findPtr
patch you mentioned? In any case, we should probably close off all leftover PRs here...
@Mistuke Oh right, the other fix got merged only recently. I suppose I now have to create a differential for https://github.com/alpmestan/ghc/commit/4c37000bd65befcfd8331ed19dd46eb478ecf683 -- and... that's it, I think. Hadrian already passes the right flag.
@Mistuke Could you try https://phabricator.haskell.org/D5356? I made you a reviewer.
@alpmestan thanks for making the diff! the patch will run through validation tonight with my other stuff, I'll post a reply tomorrow. Cheers.
@Mistuke reported that the Windows tests got fixed with the patch, woohoo!
Awesome, many thanks 🎉
This is an alternative solution to https://github.com/snowleopard/hadrian/issues/614.
@Mistuke pointed out here that the solution I went with (always include findPtr/_findPtr) has its own problems, and suggested a better one. I implemented just that in https://github.com/alpmestan/ghc/commit/f78e0ae443f48bb2c58819f04d781a3d27c2d528.
Note that the hadrian build should "just" pick up the fix, since we already pass the
debug
flag whenever appropriate for ourextra-library-flavours
business. Let's see what the OS X builds say. When I get them to work, I'll submit the GHC patch, and when it's merged, we can just close this PR I suppose, since there's nothing to change in hadrian land. :-)