Open david-christiansen opened 1 year ago
I think it's worth it.
Agreed, it's a great idea and we'd love for somebody to integrate this with cabal.
Let's discuss how and where it could be integrated. E.g., cabal now consists of 4 packages: Cabal, Cabal-syntax, cabal-install, cabal-install-solver. I'd assume the Cabal package errors may need to be somehow linked to GHC errors, especially if GHC may emit them when it calls Cabal functions. Solver errors are particularly large and, if represented well, useful. Perhaps another split than per-package makes more sense, or no split at all.
For GHC and GHC tools (ghc-pkg, anything else?), I think right now cabal just let's GHC display its error messages, not intervening in any way, and saving the errors and warnings in a log file together with cabal's own messages. If GHC additionally outputs errors.haskell.org links, I don't think anything would need to change, except a few places where the GHC errors are recorded in test output and possibly many more places where warnings are.
Another angle would be adding all cabal errors to the index. That would require a design and an overhaul of at least the die
or die'
machinery, optimistically assuming we don't ever error
instead.
Any other angles?
For GHC I think there's nothing that needs to be done -- this is just an independent improvement to GHC's messages.
It's more that we could attempt to refactor the cabal system along the same lines as ghc -- cleaning up the current ad-hoc mechanism and moving to representing cabal errors as a sum type with a mapping from constructors to error codes, such that those could also be documented on the site. I.e. I think the "another angle" is all this proposal is really about. I think even absent the error index, the cabal error reporting mechanism really could use a full overhaul regardless. But also, this is a fairly large project, and would need an interested person to take it up and push it forward -- perhaps as a future GSoC project.
Stack done it swiftly! Here's their:
Stack was in a markedly better position because they have structured errors (we have a recent ticket for it and no volunteers on the horizon: #8618). They already assigned all the codes to the corresponding data constructors for their errors (see their docs). Then, they updated the error messages by sticking those codes, and the messages are handily localized in the Show instances for those type constructors.
What remains there is to populate errors.haskell.org. But this can be done continuously in the coming future.
I added more details on how Stack did it in the previous comment. This may serve as a possible plan for us.
Nothing blocking this ticket, so I remove the "blocked: ghc" label.
This would very useful perhaps especially as Haskell has less documentation of errors on stackoverflow than other more popular languages, so having a centralised collection of them would be great.
I read through the ticket on how Stack did this, and it seems like for them at least it was relatively straightforward, but their errors were already represented in a way that made it more suitable to implement which cabal doesn't really do (I think?). I would definitely be interested in learning more about this in the upcoming weeks and perhaps doing it as part of a google summer of code project though.
Amazing! @gbaz: I forgot the details of our next GSCO that you initiated. Is there a webpage about that already?
Yay. So cool.
This is great to see!
I have been looking into how Stack made the transition to have structured errors. It seems that they already had datatypes for some of their exceptions, which they expanded by grouping similar errors as constructors of these datatypes, with a nice show
instance for every exception. After that they added unique error codes to every show
instance. Their errors are now integrated with the error index, although actually documenting is not going super quickly yet: only two errors are documented right now. So perhaps as part of this project at the end it would be good to also actually get the documenting of cabal errors started.
For cabal, making this transition would take more work as I believe that all errors are done using die'
and exceptions don't have specific types to distinguish them, so there is no underlying structure for errors yet.
I would guess that the biggest part of this project is moving away from die'
to creating types for exceptions that can be re-used and assigned a specific error code.
While reading through the die'
function, I also came across this note:
"When we die'
, we must raise an IOError. This a backwards
compatibility consideration, because that's what we've raised
previously, and if we change to any other exception type,
exception handlers which match on IOError will no longer work.
One case where it is known we rely on IOError being catchable
is 'readPkgConfigDb' in cabal-install; there may be other
user code that also assumes this."
This might be an issue if we would want to move to different datatypes for exceptions. Another consideration might be how to handle Verbosity
with different exception types, as this is now simply passed as a parameter to die'
. Overall, it seems like quite an overhaul would be required, especially compared to what Stack had to change. Once it is done and working though, it would give a nicer representation of errors that can easily be extended in the future when new errors arrive. What do others think?
This might be an issue if we would want to move to different datatypes for exceptions.
How do other tools do it? GHC, stack? And what is the requirement for the error messages index? Does it require a particular stderr output with numbers in square brackets or is there an expectation that a particular exception type value can be caught and the payload inspected?
From what I gather, only a stderr output with the particular code is strictly required for the integration with the error index. Could you perhaps confirm this @david-christiansen ? However, bringing more structure to errors in cabal would help with recognising similar errors and assigning codes to them, whilst also being generally beneficial. GHC also moved to structured error data types with constructors from semi-structured strings especially so that other tools like HLS that work with GHC diagnostics could take advantage of the structure instead of parsing the output of the error, and something similar for cabal could be useful.
Stack now implements exceptions as data constructors of sum types, which are created in any module that needs to export exceptions, so generally one type per module with several constructors and arguments to each constructor, and they import these in other modules as needed. They then derive a Show
instance for each exception, and a Control.Exception.Exception
instance. They use functions from Control.Exception.Exception
to throw their errors, mostly throwIO
, which returns `IO a'. They also have a 'PrettyException' for different cases, but I think their plain exceptions are a good start to take inspiration from.
I think in cabal it should also be possible to move from die
, which returns IO a
as well, to the functions given by Control.Exception.Exception
. The first requirement would be to document the errors in cabal and see if we could fit them in sum datatypes per module as well, and then see how we could implement the showing of the exception together with verbosity
For the error message index, the requirements and recommendations for tool developers are documented here: https://github.com/haskellfoundation/error-message-index/blob/main/tool-developers.md
In summary, the requirements are:
There are no requirements on the API that the tool provides to other developers - the error message index is strictly a user-facing tool.
@BasLaa is mostly correct - there is no requirement that it be on stderr
, it's fine to emit things to stdout
or a GUI window or whatever makes sense for a given application.
In that case, while using bespoke exceptions is certainly natural, we don't have to do this or to focus on this. We can have a sum type of errors or several such sum types, per package or per component in our repo. We can assign starting codes to each of these and number independently. Eventually, we can either throw a sum type exception, or a dumb exception with a message generated by inspecting the sum type. What I'm saying, using exceptions is an implementation detail that should not hinder coming up with an elegant design.
Yeah I definitely agree, and like you said, the starting point would be documenting all the existing errors and mapping them onto several sum types.
I read that Haskell sadly wasn't accepted as an organisation for Google summer of code this year, but they let me know that they are planning to organise a summer of Haskell separately, so if that works out well I could write a proposal for this project as part of that instead.
Since Haskell.org is organising a separate summer of haskell, I have started to write a proposal for the project of integrating cabal errors with the haskell error index. The deadline for sending in proposals is April 20th, but I will try to get a draft proposal done for some feedback hopefully in around a week. As I have exams this week I don't have as much time to work on it in the next couple of days.
Amazing. Please keep us posted. The IRC/Matrix channel #hackage may also be a good place to coordinate. Good luck with your exams. :)
i have written a draft proposal on google docs here. I believe anyone should be able to leave comments. I am curious to hear your opinions, especially what you think of the implementation.
In my proposal I proposed creating instances of the Exception
typeclass for each sum type representing some errors and then using the throwIO
functions for throwing errors, in a similar way to how Stack implemented their errors. This requires a Show
instance for the type as well, but the show method for an error would be dependent on the verbosity, so this would need to be added to the data constructor of the error. However, @david-christiansen noted that verbosity is more a feature of the context of the error than the error itself, so it would not really make sense to have this be part of the constructor.
Perhaps an alternative would be to have every error type implement a function errorToString :: E -> Verbosity -> String
, and then use the userError
and ioError
functions, that are currently used as well, to throw the errors so that an instance of Exception isn't needed.
Has anyone looked into how GHC did it? Its code base is much closer to what cabal has than the stack code base, i think. (In that the starting point was completely unstructured errors.)
I believe this article outlines how they created a new structure for their errors: https://well-typed.com/blog/2021/08/the-new-ghc-diagnostic-infrastructure/. I'm not sure if I'm understanding it correctly, but I believe they are not working with printing dependent on some verbosity parameter here.
@BasLaa correct, thanks for confirming. I was more curious weather they went through the Exception
typeclass and throwIO
or something else.
Also, while it's true that they don't seem to print depending on verbosity, they have a dynamic parameter Severity
(see the blog post) that form a part of a "message" that also holds the actual error/warning-encoding constructor. You could imagine a message type that records the current verbosity level...
Also, GHC does different vebosity levels (e.g. -v0
). I wonder how it fits with their new framework.
I wanted to give an update on this: My proposal for summer of Haskell ended up being accepted, however in the end I decided to accept a different opportunity for over the summer, so I will not be working on this project over the summer. However, I would still be interested in being involved in working on this after the summer in my spare time when I have more time, if no one else has done this project at this point. Hope you all understand!
@BasLaa no worries and have a great summer!
That's fine. Let's keep in touch and happy hacking!
Hi, I am working on this project as part of summer of Haskell.
I am new to the Haskell community and just setting up the Environment and parallely looking into the relevant tickets and the codebase. This ticket has lots of good discussion on the project.
Excited to collaborate! Thanks.
@SuganyaAK: Hi! Welcome!
Please also consider joining the #hackage channel via IRC or Matrix and joining our fortnightly video (or sound only) chat. If you send me your email to mikolaj@well-typed.com, I will add you to an invitation to the chat.
Hi, I am the other David from the error-message-index. I have seen that in some of the already merged PRs (e.g. https://github.com/haskell/cabal/pull/9018/files?show-viewed-files=true&file-filters%5B%5D=) there are already errors which use the [Cabal-XXXX]
format. Are these error codes already emitted in a released version of cabal? If so, I can add the Cabal-
prefix to the list in our README-md
Hi David! Not in a release, no. It will likely appear in Cabal 3.12. Which will be released maybe right after the upcoming 3.10.2, but that one won't have it either.
Is anyone watching this issue able to review https://github.com/haskellfoundation/error-message-index/pull/536?
Review, yes. Review that counts, no: I don't have write access in that repo.
Are these error codes already emitted in a released version of cabal?
@BinderDavid I checked Cabal-7070 and it was only cabal-install-3.12.1.0
that included it. Earlier releases in the 3.10.*
series didn't.
Thanks for the review @geekosaur.
Yes, they are released as of 3.12. Updating the error index with them was left as future work, though (hence this PR).
Yup, I am still watching the error-message repo. I am just winding down from a week of ICFP and a stressful few weeks prior :) I unfortunately didn't have as much time as I wanted to review PRs in that period. But I will have more time now. If there are more PRs planned to add Cabal error messages to the index I would also be grateful for any help reviewing those.
Happy to review cabal submissions on the error index repo but don't have the permissions to formally approve.
Happy to review cabal submissions on the error index repo but don't have the permissions to formally approve.
Thanks! I gave you necessary permissions :)
Thanks!
Should this issue be closed as completed btw?
I wouldn't think so: this is only one of the many Cabal and cabal-install errors.
Well, the issue doesn't request to create the error-index part for Cabal, it only asks to "integrate" with the error index. And following its own definition of "integration", I think, the work is done.
As for populating the error index, I'm not sure what good it will do to leave this open...
But I'm not against leaving it open either. Just wanted to understand the current purpose of it.
errors.haskell.org presently documents error messages and warnings from GHC, starting with version 9.6.1 once it's released. The design, however, is intended to accommodate any Haskell tooling whatsoever. Would Cabal be interested in integrating with the site?
What it entails:
[C-12345]
, where "C" is a namespace (we can talk about what it would be) and "12345" is a number that's unique in that namespaceAs I see it, there'd be the following benefits:
The costs would likely be:
What do you think? Is this interesting?