Closed alexrp closed 12 years ago
Yeah, something that calculate the line and column, but only for failures. I cut away code for calculating this for all parse results, but doing it for failures is a good idea. OK, I can add that.
I pushed a commit with a position
function. Its documented in the code, not in the .md files.
I also changed the way parse trees are printed to give the failure location.
It's still primitive and inefficient, in that if when a leaves fails, all its ancestors in the tree will recalculate their position from the beginning. It should build upon its children failure.
The error reporting is actually looking pretty good. It's slightly off, though. Given a source like
use foo::bar
record foo {
}
if the grammar specifies rec
and not record
, Pegged will say expected bar, but got record foo
, but should say expected rec but got record ...
.
Should I try to make a reduced test case?
On Mon, Oct 8, 2012 at 1:22 PM, Alex Rønne Petersen < notifications@github.com> wrote:
The error reporting is actually looking pretty good. It's slightly off, though. Given a source like
use foo::bar
record foo { }
if the grammar specifies rec and not record, Pegged will say expected bar, but got record foo, but should say expected rec but got record ....
Should I try to make a reduced test case?
Yes please.
$ cat test2.d
module test;
import std.stdio;
import pegged.grammar;
mixin(grammar(`Test:
program <- decl* :eoi
useDecl < "use" qualifiedIdentifier
recDecl < "record" identifier "{" "}"
decl < useDecl/recDecl`));
void main()
{
writeln(Test(`use foo.bar
rec foo {
}`));
}
$ rdmd test2.d
Test (failure)
+-Test.program failure at line 2, col 18, after "... " expected foo.bar, but got "rec foo {
..."
+-Test.decl [0, 31]["use", "foo.bar"]
+-Test.useDecl [0, 31]["use", "foo.bar"]
It should say "expected record ..., but got rec" or something like that.
OK, I think I got it.
Btw, the error message is "expected end of input, but got rec foo" (because in this case, you have one valid decl and then a failure, so decl*
succeeds and it's :eoi
which fails.
To see the error message you wanted, use this (where I changed decl*
to decl decl
to force 2 decls in a row):
import std.stdio;
import pegged.grammar;
mixin(grammar(`Test:
program <- decl decl :eoi # Here, artificial change
useDecl < "use" qualifiedIdentifier
recDecl < "record" identifier "{" "}"
decl < useDecl/recDecl`));
void main()
{
writeln(Test(`use foo.bar
rec foo {
}`));
}
Looks good. One tiny suggestion for improvement (though I'm not sure if it's generic enough to be in Pegged): When printing the "but got ..." text, it may be worth stopping once a newline is hit but only if some text has already been printed; if not, just skip past all white space and rinse, repeat. Does that make sense?
Yes, that makes sense.
std.string.strip
seems to do the trick. I just added it and push it to Github.
Note that the error reporting is not perfect yet (it should say
expected "record" or "use"
). A nasty bug by Val (issue #60) forced
me to change the way rules are named and gave me the occasion to build
a real naming infrastructure. I think I can build on that to get a
real expected(rule)
function. That will take a few days.
Then, I can go back to unit-testing the hell out of pegged.peg and pegged.grammar. And then, I can finally code that left-recursion code I want to add.
Sounds good. Thanks!
(Closing this; I'll open a new issue for other possible error handling improvements.)
It would be nice to have a utility function to get the location of a fatal failure during parsing, at least as a temporary measure until we work out a proper error handling mechanism.
(This might exist already and I just can't find it...)