Open bgamari opened 4 years ago
The symptom that I was seeing is this: when building with a GHC 8.10 alpha, eg using a stack file like this, you see a lot of "Failed to decode" warnings, one per .hi file. Eg:
~/src/hledger$ stack --stack-yaml=stack-ghc8.10.yaml build --fast hledger-lib
Stack has not been tested with GHC versions above 8.6, and using 8.10.0.20191210, this may fail
Stack has not been tested with Cabal versions above 2.4, but version 3.1.0.0 was found, this may fail
Warning: Failed to decode module interface: /Users/simon/src/PLAINTEXTACCOUNTING/hledger/hledger-lib/.stack-work/dist/x86_64-osx/Cabal-3.1.0.0/build/Text/Megaparsec/Custom.hi Decoding failure:
Invalid magic: e49ceb0f
Warning: Failed to decode module interface: /Users/simon/src/PLAINTEXTACCOUNTING/hledger/hledger-lib/.stack-work/dist/x86_64-osx/Cabal-3.1.0.0/build/Hledger.hi Decoding failure:
Invalid magic: e49ceb0f
...
Warning: Failed to decode module interface: /Users/simon/src/PLAINTEXTACCOUNTING/hledger/hledger-lib/.stack-work/dist/x86_64-osx/Cabal-3.1.0.0/build/Text/Tabular/AsciiWide.hi Decoding failure:
Invalid magic: e49ceb0f
WARNING: Ignoring hledger-lib's bounds on base (>=4.9 && <4.14); using base-4.14.0.0.
Reason: allow-newer enabled.
~/src/hledger$
Observed on mac catalina and on ubuntu 18.04.3, using stack 2.1.3. I wrongly assumed these were coming from the new element in the toolchain (GHC), so I went and bothered #ghc.
@haitlahcen probably you could add your 5 cents here as the author of #4545 ?
@qrilka
The problem was that Stack was trying to extract modules dependencies from the dumped .txt
hi file (which was obviously not a machine readable format, causing various issues). We reversed the interface format (not documented by GHC...) to directly read the binary file.
I believe that the proposal @bgamari linked is pretty much what was missing from GHC side to avoid doing what we have done here. Thanks for having noticed the link. I guess that we will be able to extract everything for the generated format as it seems to contain all the informations we need (@snoyberg confirmation would be great here).
Finally, removing the dependency to hi-file-parser
is (probably) just a matter of updating this function to directly extract the dependencies from the GHC's generated machine readable file.
If all you need is intra-package module dependencies then it's also possible that GHC's -M
mode will suffice. This produces Makefile-style output which encodes the dependency graph. The extended dependency proposal is intended to provide a more build-system agnostic format with information that the -M
output currently lacks (e.g. inter-module dependencies, required language extensions, etc.)
@hussein-aitlahcen, could you describe precisely which information you extract from the interface file? I see mentions of dependency file names in the implementation but it's hard to say precisely what kind of dependencies we are talking about here (e.g. native CPP dependencies, Haskell modules within a package, inter-package Haskell dependencies, etc.).
@bgamari I believe it is intra-package module deps with native TH/CPP dependencies, see the previous version of the parsing where we extract addDependentFile
and the module dependencies
. I've an example of the dumped hi file we were parsing here given this kind of source.
Alright, it would be appreciated if you could engage on the GHC proposal, raising these points so we have a record of precisely what we are designing towards.
Is there a workaround for this? I'm getting errors like this when trying build Liquid Haskell:
Warning: Failed to decode module interface:
/home/omf/code/haskell/liquid/liquidhaskell/.stack-work/dist/x86_64-linux-tinfo6/Cabal-3.2.0.0/build/Language/Haskell/Liquid/UX/Tidy.hi Decoding failure:
Invalid magic: e49ceb0f
Warning: Failed to decode module interface:
/home/omf/code/haskell/liquid/liquidhaskell/.stack-work/dist/x86_64-linux-tinfo6/Cabal-3.2.0.0/build/Language/Haskell/Liquid/WiredIn.hi Decoding failure:
Invalid magic: e49ceb0f
Warning: Failed to decode module interface: /home/omf/code/haskell/liquid/liquidhaskell/.stack-work/dist/x86_64-linux-tinfo6/Cabal-3.2.0.0/build/LiquidHaskell.hi
Decoding failure: Invalid magic: e49ceb0f
-- While building package liquidhaskell-0.8.6.0 using:
/home/omf/.stack/setup-exe-cache/x86_64-linux-tinfo6/Cabal-simple_mPHDZzAJ_3.2.0.0_ghc-8.10.1 --builddir=.stack-work/dist/x86_64-linux-tinfo6/Cabal-3.2.0.0 build lib:liquidhaskell exe:liquid --ghc-options " -fdiagnostics-color=always"
Process exited with code: ExitFailure 1
I tried specifying a different resolver but it wasn't happy (I'm not sure which one to pick).
There is no workaround right now. The best solution is to improve hi-file-parser to support the new binary format.
The interface magic change was a mistake (see https://gitlab.haskell.org/ghc/ghc/issues/18180) but hi-file-parser
needs to support S/ULEB128 decoding anyway.
I was pointed to this ticket recently. To summarize the current state of affairs from the GHC side:
Going forward the plan for GHC itself (starting with 8.10.2) is to:
As a workaround for 8.10.2 hi-file-parser will need to:
I regret having broken this for stack. But ghc can't reasonably guarantee that the iface format won't change again in the future. There has been discussion about trying to apply compression to them as well as encoding certain values as sub-byte sequences of bits. If any of those turn out to be beneficial the format might change again radically in 8.14.
I would reiterate that it would be helpful if stack
could comment on the extended dependency generation proposal so we can confirm that it is indeed sufficient to solve the underlying issue here. It would be great to move stack off of hi-file-parser
.
One of the reasons we have hi-file-parser is because processing the text based hi-dump files caused massive slowdown both from GHC and Stack. I don't know if that problem would reemerge with JSON. We would also still need hi-file-parser for earlier versions of GHC. Finally, I don't know how Cabal would expose access to these files.
Most likely we'd be able to move over to parsing the new format, but I'm not certain.
@AndreasPK no concerns from my side. Changes in the hi binary file format is absolutely expected, we just need to update hi-file-parser to handle it. My only request would be if you know the format is changing and you remember to give a ping on that repo. But as you can see, we'll notice pretty quickly due to the warnings.
We had these problems before with the hi-dump format too, and moved to explicit and scary-looking warnings just to make sure it didn't silently get ignored.
That library supports different parsing for different GHC versions, so updating as you mention ideally won't be too difficult.
Looking at the code on this just a bit: it doesn't seem like it's really possible to continue using the hi-file-parser approach based on the changes you're planning. Ideally there would be a reliable way to tell which version of the format is being parsed immediately. However, if the magic number is changing back, we have no way of telling whether it's the new variable length encoding or fixed length encoding in the future.
@haitlahcen do you have any thoughts on all of this?
For context of the original change to use hi-file-parser, see #4545.
What is the status on this? I just upgraded to stackage nightly with GHC 8.10 and seeing a lot of "invalid magic" warnings. Why has stackage moved to 8.10 if it cannot cope with the 8.10 hi files yet - or maybe this is not a real problem besides printing a warning?
It's harmless, for now at least; but annoying/alarming for users. From snoyberg's last comment it sounds like the version of hi files format needs to be managed and exposed a bit more clearly on the GHC side.
Is there a way to ignore just this warning? Using GHC 8.10 with Stack outputs a lot of noise to my terminal. I want all of the warnings from Stack and GHC in general, but I'd like to disable this one because it's not actionable for me and everything seems to work regardless.
Same concern here :-)
I think I have a good solution to this warning issue. We want the warnings to appear to Stack devs, but no one else. What I think I'm going to do is set up a new config flag for "developer-mode" or something like that. The default for the official binaries will be to set it to off. The default when you build from Git will be for it to be on. When the flag is on, warnings like this will be set to WARN level. When the flag is off, these messages will be DEBUG level. That will allow Stack developers to hear about these issues, and users to ignore them.
@bgamari I don't know what other comments I can provide to you on the GHC proposal. From what I can see, it's not addressing the concerns that originally led to hi-file-parser to be written. I'd also rather not have this separate library, but it doesn't seem like we have any choice given the lack of a consistent, multi-GHC-version interface for determining TH dependent files.
Sounds good to me @snoyberg! I'm not a Stack developer, but the code in #5325 looks good to me.
In the meantime, for those going blind with ghc-8.10 development:
stack build --test --file-watch 2>&1 | sed '/^Warning:/,/Invalid magic: e49ceb0f$/d'
If you like a bit of color in your terminal, Tony's command can be done this way:
stack build --test --file-watch --color=always 2>&1 | sed -r '/^(\x1b\[[0-9;]*m)?Warning:(\x1b\[[0-9;]*m)? Failed to decode/,/e49ceb0f.*?$/d' 2>&1
@snoyberg With a few months of reflection, is this still an approach you consider good? I noticed that this is still an issue in stack 2.5.1. If the approach is one you'd still feel comfortable with, it sounds like it is a reasonably low touch change, that I could take a stab at. It sounds simple from your description, but this will be my first foray into the stack codebase, so I'm not sure what I'm in for exactly. I'm guessing the only reason it hasn't been done is due to not being prioritized given that 8.10.x isn't in the stackage LTS set.
Sorry, a few different things have come up in this thread. Which approach are you talking about here?
Overall, I'd support any implementation for this that lets Stack continue to parse the .hi files. If that comes from an upstream GHC proposal, that's great, but I don't think the one mentioned above will help.
For specificity, this one: https://github.com/commercialhaskell/stack/issues/5134#issuecomment-647091312
But your response just now indicates that this falls under the generalization you're amenable to.
Is this issue still valid? I encountered this today when building a project on ghc 8.10 but upon upgrading stack
(from 2.3.1
to 2.5.1
) it seems like these warnings are no longer appearing.
The change made in #5325 hides this warning from typical users. The problem still exists behind the scenes. You should be able to set stack-developer-mode: true
in your stack.yaml
to see the warnings.
Can be closed from a users perspective.
I didn't observe the warnings with stack 2.5.1, but they appeared after a stack upgrade
that got me to 2.5.1.1:
$ stack --version
Version 2.5.1.1 x86_64 hpack-0.33.0
This now blocks Stack from being used with GHC 9.0.1: #5486 #5445
I'd never encountered this issue in GHC <9, but today I encountered this with GHC 8.10.4 on macOS, in which stack silently dies and --verbose
flag reveals it. Is there any workaround, or plan for the fix in the near future?
The short term issue for GHC 9.0 should be addressed by #5508. I don't believe we have a long-term solution for getting the dirty file information out of GHC effectively yet, but with this workaround in place, it simply means that dirty file checking will not work correctly with addDependentFile
(the current status quo).
@bgamari I don't know what other comments I can provide to you on the GHC proposal. From what I can see, it's not addressing the concerns that originally led to hi-file-parser to be written. I'd also rather not have this separate library, but it doesn't seem like we have any choice given the lack of a consistent, multi-GHC-version interface for determining TH dependent files.
@snoyberg, I'm afraid I don't follow. The proposal does provide information about TH dependent files (via the dependencies
key). Can you clarify?
I think we're saying the same thing. Stack has a need here: learn about TH dependent files. That information is available in .hi
files. Stack needs to get that information. GHC currently doesn't provide a method for doing that directly. Previously, we use .hi-dump
files, but that caused issues because in some cases those files were massive. So someone contributed a binary parser, which is now broken due to changes in GHC's file format. You mentioned a GHC proposal, which I think we now both agree is irrelevant to the issue at hand. We're left with the original challenge: how do we efficiently determine dependent files inside of Stack?
For now, the "solution" is to simply fail on newer GHC versions, and hope that someone cares enough about this to add additional parsing capabilities to hi-file-parser
.
You mentioned a GHC proposal, which I think we now both agree is irrelevant to the issue at hand. We're left with the original challenge: how do we efficiently determine dependent files inside of Stack?
I'm trying to suggest that the proposal is relevant: it provides precisely the information that you are currently parsing out of the .hi
file.
You're right. I completely misread your previous line. Apologies. Yes, this should address the needs hi-file-parser
is filling going forward. We'd still need hi-file-parser
for existing GHC versions, but that shouldn't be an issue.
I've fixed hi-file-parser
(cf https://github.com/commercialhaskell/hi-file-parser/pull/2) so that it supports 8.10 and 9.0
Thank you @hsyl20! I've merged and released your PR, and will open up a PR against Stack shortly to use it.
Seems to work like a charm, I no longer see Invalid magic
messages! (stack 2.6.0
built from master
via stack upgrade --git
).
Seems like this can be closed now.
Seems like this can be closed now.
Sadly the underlying issue remains so that seem premature.
@hsyl20, would you be able to do what you did for GHC 9.0, but for GHC 9.4? See https://github.com/commercialhaskell/hi-file-parser/issues/5. I think the .hi format may have changed.
@mpilgrem Sure. I've responded in the ticket.
After an extended debugging session with @simonmichael it came to my attention that Stack directly parses GHC's interface files via the
hi-file-parser
library. This is an unfortunate state of affairs. Interface files are interface to GHC and consequently come with no particular stability guarantees.To make everyone's life easier, it would be better if we can find another way to expose the information that you need from interface files. Can you describe what information Stack needs that isn't currently available from other means? Also note that we have an open GHC proposal, the extended dependency information proposal, which may be helpful. We would love to have feedback on this.
I have opened GHC #17620 to track the GHC side of this.