Closed DamienCassou closed 2 years ago
I also have the feeling that, for maximum compatibility with tools, the first line should contain the word "error" and the error's position. Does that sound correct ?
We also want error messages to be readable on a standard 80-char terminal, so we usually have a challenge with long lines.
Perhaps it's not necessary to mention the program name.
And of course eventually we'd like to show a pretty, coloured human-friendly rendering of the error region.
Some mockups:
$ hledger check payees
Error at "/Users/simon/src/hledger/examples/sample.journal" (line 27):
Undeclared payee "income"
> 2018-01-01 income
assets:bank:checking $1
income:salary $-1
$ hledger check --strict
Error at "/home/cassou/Documents/configuration/ledger/2021.ledger" (line 12):
Undeclared account "debt"
2020-01-01
> debt 1
assets
$ hledger check
Error in transaction at "/home/cassou/Documents/configuration/ledger/2021.ledger" (lines 16-18):
Could not balance this transaction:
real postings' sum should be 0 but is: -1.00€
> 2021-01-01 Misc
> asset:account -11.00€
> expense:food 10.00€
We also have an cosmetic issue where some errors include text like "*** Exception:" or "ExitFailure 1" and others don't, depending on where they are handled.
More notes:
Emacs defines a bunch of regular expressions in compilation-error-regexp-alist-alist
to match many types of error messages. This could be a starting point.
Another way to report errors is to use several-line long messages as in clang. Your mockups above are a good step in that direction.
My main requirement would be to use the same error structure for all messages and to document it.
If you want to go fancy, I suggest using a parseable structured format such as JSON (e.g., JSON-RPC error format).
I also have the feeling that, for maximum compatibility with tools, the first line should contain the word "error" and the error's position. Does that sound correct ?
It doesn't matter to me much because all tools are different and require different error-handling support anyway.
We also want error messages to be readable on a standard 80-char terminal, so we usually have a challenge with long lines.
Not really important to me either.
Perhaps it's not necessary to mention the program name.
indeed
And of course eventually we'd like to show a pretty, coloured human-friendly rendering of the error region.
That would be nice to have. For my use-case, it's not important though because flycheck will highlight the problematic region of the buggy file: I only care about the position (start and end lines as well as the start and end columns) and the error message.
To me there are clearly 2 use cases:
Which brings me to my idea that there is something wrong with the way that hledger's checking functions are implemented. First, the check engine stops at the first error it encounters even though it is not blocking (e.g., an undeclared payee shouldn't prevent a transaction balance check on the next transaction). Another problem is that the code is harder to understand: instead of collecting validation results, the code throws exceptions which has an impact on the calling code. Finally, because each checking function is responsible for formatting error messages, the 2 use cases above can't be implemented at the same time.
I suggest reading replaceThrowWithNotification refactoring and applying it if possible. This will result in each checking function being responsible for returning (and not throwing) validation results. Then, the calling code is responsible for taking the decision regarding the validation results: e.g.,
hledger balance
could decide that the validation results shouldn't prevent the balance report from being calculated and displayed. It might even decide to show the validation results as warnings or don't show them at all;hledger check --output-format json
may format all validation results in a JSON object (e.g., with the JSON-RPC error format);hledger check
may format all validation results as a human-readable string.There is also the case that a user might want some validation results (e.g., undeclared accounts) to be considered errors (and thus prevent hledger balance
from computing any report) whereas other users might consider them warning or just ignore them.
We also have an cosmetic issue where some errors include text like "*** Exception:" or "ExitFailure 1" and others don't, depending on where they are handled.
My suggestion above would fix that by stopping the checking functions from being responsible for quitting the application.
best to include a brief error description on the first line as well, even if it makes lines wide ?
I'm not sure that is important and I would leave them on the next lines.
and/or include a unique searchable short error code ? Maybe our errors are simple enough this is overkill.
That would be a nice bonus. flycheck includes support for error IDs and links to error descriptions on the web as is used in shellcheck errors for example.
I don't see that as very important right now though.
also be compatible with LSP servers, eg for hledger-vscode
I don't know much about LSP but I guess that a compatible syntax can't hurt (either LSP or JSON-RPC or something else).
Thanks for the input. I agree that reporting all errors (where possible) rather than just one would be better. The check command is brand new, but we'll get there.
Probably the next action for this issue is to start gathering examples of all our current errors for review. [Starting points:
]
Here is a list of matchers I implemented for all errors I found:
(;; Used for an unbalanced transaction:
(error line-start "hledger: \"" (file-name) "\" (lines " line "-" end-line ")\n"
(message (zero-or-more line-start (zero-or-more not-newline) "\n")) "\n")
;; Used for invalid balance assertion:
(error line-start "hledger: balance assertion: \"" (file-name) "\" (line " line ", column " column ")\n"
"transaction:\n"
(message (zero-or-more line-start (zero-or-more not-newline) "\n")) "\n")
;; Used for invalid regular expression:
(error line-start "hledger: " (message "this regular expression" (zero-or-more not-newline)) "\n")
;; Used for an undeclared payee:
(error line-start "Error: " (message) "\n"
"at: \"" (file-name) "\" (lines " line "-" end-line ")\n")
;; Used for unordered dates:
(error line-start "Error: " (message) "\n"
"at \"" (file-name) "\" (lines " line "-" end-line "):\n")
;; Used for duplicate leaf names:
(error line-start "Error: " (message) "\n")
;; Used for an undeclared account:
(error line-start "hledger: " (message) "\n"
"in transaction at: \"" (file-name) "\" (lines " line "-" end-line ")\n")
;; Used for parse errors and invalid dates:
(error line-start "hledger: " (file-name) ":" line ":" column ":\n"
(message (zero-or-more line-start (zero-or-more not-newline) "\n")) "\n"))
Some explanation:
error
to indicate the level of the problem.error
starts the regular expression (expressed in the readable but unfamiliar syntax of rx).(message)
means "match all text to the end of the line and consider that the error message";(message REGEXP)
means "match REGEXP and consider that the error message";(file-name)
means "match all text to the end of the line and consider that the filename where the error appears";line
means "match a number and consider that the line where the error appears";end-line
means "match a number and consider that the last line where the error appears";column
means "match a number and consider that the column where the error appears".I added 2 more errors in my message above:
Please note that some error messages have no filename or line number.
Error formats changed a bit in hledger 1.25, and are still pretty inconsistent. I sent https://github.com/DamienCassou/flycheck-hledger/pull/10 to update hledger-flycheck.
Getting consistent high-quality errors and accurate flycheck region highlighting (not to mention LSP support) for all of our journal errors is a big project. But (once we converge on a standard format) it's quite crowd-sourceable, tackling each error separately, and any progress means immediate practical benefits.
https://github.com/simonmichael/hledger/tree/master/hledger/test/errors is a collection of test journals for reproducing journal errors, and documentation of current error message status.
is there a utility function that would take the line, column and message as argument and would generate an error message?
Error messages are now gathered at https://github.com/simonmichael/hledger/tree/master/hledger/test/errors#readme for overview and brainstorming a standard format.
is there a utility function
Currently just error'
and usageError
in Hledger.Utils and a few special purpose "error" functions here and there, I think.
Here's a proposal for a standard format similar to megaparsec's error output (see first three examples):
Error: [ID] FILE:LOCATION # begin with the word "Error", optional error id (in brackets ?),
# LOCATION is LINE[-ENDLINE][:COLUMN[-ENDCOLUMN]]
EXCERPT # short visual snippet whenever possible, with error highlighted,
# line numbers, optional colour; must be easy for flycheck to ignore
SUMMARY # one line summary of the problem
[DETAILS] # more details/advice when needed
In the past I tried to include the summary on the first line, but that can make things long/inconsistent.
Inspiration for error ids, if we do those: here are some of ShellCheck's, showing the id (two letter prefix and four digit number), summary (note: constant text, no data interpolated in these), and details page on the web (hosted in project's github wiki, named like the id, headed by the summary, containing standard subsections, editable by user/contributors at any time).
Currently, our error summaries try to provide specific contextual information, to be most clear and useful. But having unchanging summary text may also have advantages (more memorable, more suitable for docs, more standardised & easier error implementation..)
Also, here is a Rust compiler error:
error[E0532]: expected tuple struct or tuple variant, found unit variant `FarmAnimal::Cow`
--> $DIR/issue-84700.rs:15:9
|
| Cow,
| --- `FarmAnimal::Cow` defined here
...
| FarmAnimal::Cow(_) => "moo".to_string(),
| ^^^^^^^^^^^^^^^^^^ help: use this syntax instead: `FarmAnimal::Cow`
For more information about this error, try `rustc --explain E0532`.
Ie:
error[ID] SUMMARY # dynamic summary
LOCATION # starting line and column
EXCERPT # no line numbers, main error region highlighted, advice inserted in multiple locations
DETAILSCOMMAND # shows a 'rustc --explain ID' command to print generic explanation of this error type
Rust Compiler Error Index collects all the error explanations on a single web page. Rustc dev guide > Errors and Lints Rust RFC 1567 long error codes explanations
https://github.com/DamienCassou/flycheck-hledger/pull/13 is an update for hledger 1.26 and up.
https://github.com/simonmichael/hledger/pull/1885 is a continuation of this issue, improving hledger's error messages. Journal and timeclock related errors have been catalogued and improved. CSV related errors have been catalogued but not yet improved. CSV reading errors are a little more complex for tools to deal with as they can involve the CSV file or the CSV rules file(s).
I propose to close this issue, merge #1885, and start a new PR for improving CSV errors: #1886.
Thank you very much @simonmichael for the excellent work!
Right now,
hledger check
error messages are hard to parse which makes writing tools around hledger hard. Here is an example of an error output when an account is undeclared:and here is an error output when a transaction doesn't balance:
As you can see, the structures of the error messages are different, requiring different error parsers.