dlang-community / Pegged

A Parsing Expression Grammar (PEG) module, using the D programming language.

Parsing Keywords #137

Closed pontifechs closed 10 years ago

pontifechs commented 10 years ago

So I saw the earlier issue (#10 I think) concerning this, and you posted this snippet to allow identifiers to start with keywords.

import std.stdio;
import pegged.grammar;

mixin(grammar(`
TEST:
 recordDeclaration < :"rec" identifier '{' '}'

 keywords <- "rec" / "foo" / "bar"
 identifier <~ !(keywords Spacing) [a-zA-Z_] [a-zA-Z_0-9]*
`));

void main()
{
    string input =  "rec record { }";
    writeln(TEST(input));
}

And currently this outputs:

TEST (failure)
 +-TEST.recordDeclaration (failure)
    +-TEST.identifier (failure)
       +-negLookahead!(and!(keywords, named)) failure at line 0, col 4, after "rec " expected anything but "rec", but got "record { }"

Has something changed about the way negative lookahead works? I would've expected this to work.

PhilippeSigaud commented 10 years ago

You're right, that should work. I'll have a look.

PhilippeSigaud commented 10 years ago

OK, I got it. It's the Spacing rule which was changed: zero space is a valid parse for Spacing. Hence, in "record", "rec" was recognized by keywords, and then the (non-existent) space gave a valid parse for (keywords Spacing). From there, !(...) failed.

I propose you just change Spacing to blank+ to have it fail when there is no space after a keyword.
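To make the difference concrete, here is a minimal sketch in plain D that models the two lookaheads by hand. It does not use Pegged: `keyword`, `spacingStar`, `blankPlus` and `notKeywordThen` are hypothetical stand-ins written for illustration, and whitespace is simplified to spaces and tabs.

```d
import std.algorithm.searching : startsWith;

// Hypothetical hand-written stand-ins for the grammar rules (not Pegged's API).
// Each matcher consumes from `s` on success and returns whether it matched.

bool keyword(ref string s)
{
    foreach (kw; ["rec", "foo", "bar"])
    {
        if (s.startsWith(kw))
        {
            s = s[kw.length .. $];
            return true;
        }
    }
    return false;
}

// Number of leading spaces/tabs (simplifying assumption: no newlines).
size_t countBlanks(string s)
{
    size_t n = 0;
    while (n < s.length && (s[n] == ' ' || s[n] == '\t'))
        ++n;
    return n;
}

// Models a zero-or-more Spacing rule: always succeeds, even on zero blanks.
bool spacingStar(ref string s)
{
    s = s[countBlanks(s) .. $];
    return true;
}

// Models blank+: fails unless at least one blank is present.
bool blankPlus(ref string s)
{
    immutable n = countBlanks(s);
    if (n == 0)
        return false;
    s = s[n .. $];
    return true;
}

// Models !(keywords tail): succeeds only when the sequence does NOT match.
// `s` is a by-value copy, so the caller's input is never consumed.
bool notKeywordThen(alias tail)(string s)
{
    return !(keyword(s) && tail(s));
}

void main()
{
    // "rec" matches and zero-width spacing succeeds, so "record" is rejected.
    assert(!notKeywordThen!spacingStar("record { }"));
    // blank+ fails on the 'o' after "rec", so "record" is accepted.
    assert(notKeywordThen!blankPlus("record { }"));
    // A real keyword followed by a blank is still rejected.
    assert(!notKeywordThen!blankPlus("rec { }"));
}
```

The only change between the two variants is whether the rule after the keyword can match the empty string; when it can, every identifier that merely begins with a keyword is rejected by the lookahead.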

This works:

import std.stdio;
import pegged.grammar;

mixin(grammar(`
TEST:
 recordDeclaration < :"rec" identifier '{' '}'

 keywords <- "rec" / "foo" / "bar"
 identifier <~ !(keywords blank+) [a-zA-Z_] [a-zA-Z_0-9]*
`));

void main()
{
    string input =  "rec record { }";
    writeln(TEST(input));
}

pontifechs commented 10 years ago

Thanks! That did the trick.