Closed AlectronikForge closed 3 years ago
Debugging parsers is tricky, but NPeg can help you out a bit with this: if you compile your code with `-d:npegTrace`, NPeg will dump a diagnostic trace of the parser running, which shows you what it is doing. In your case it shows this:
```
0| 0| 0|var hello; |program |choice 3 |
1| 0| 0|var hello; |program | call statement:4 |*
4| 0| 0|var hello; |statement |jump var_decl:6 |*
6| 0| 0|var hello; |varC |chr "v" |*
7| 0| 1|ar hello; |varC |chr "a" |*
8| 0| 2|r hello; |varC |chr "r" |*
9| 0| 3| hello; |Alpha |set {'A'..'Z','a'..'z'} |*
15| 0| 3| hello; | |opFail (backtrack) |*
3| 0| 0|var hello; |program |return |
```
The 4th column shows you a slice of the subject string currently getting parsed, and to the right of that the state of the parser and what it is trying to do.
Here you can see that your parser is able to parse the `var` string, but then fails when trying to match `Alpha`, because the next character is a space. Looking at your grammar this makes sense, as it says that directly after the `var` string it should expect one or more `Alpha` characters.
The solution is to make the white space explicit in the grammar. This is inherent to PEGs: they do not have a separate tokenization stage, so they handle white space just like any other character in your subject.
This is your fixed grammar:
```nim
let parser = peg("program", d: Dict):
  program <- *statement
  space <- +" "
  statement <- var_decl
  word <- +Alpha
  varC <- "var"
  semi <- ';'
  var_decl <- varC * space * >word * semi
```
It adds a rule `space` that matches one or more space characters, and inserts this rule between `varC` and `>word`.
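For reference, here is a minimal, self-contained sketch of the fixed grammar above that can be compiled and run on its own. Two things in it are assumptions and not part of the reply: the unused `d: Dict` user-data argument is dropped, and `!1` is appended to the `program` rule so that the match only succeeds when the whole input is consumed.

```nim
import npeg

# Same rules as the fixed grammar, with the explicit `space` rule between
# the "var" keyword and the captured word. The trailing `!1` (an addition
# made for this sketch) asserts that no input is left over.
let parser = peg "program":
  program   <- *statement * !1
  statement <- var_decl
  space     <- +" "
  word      <- +Alpha
  varC      <- "var"
  semi      <- ';'
  var_decl  <- varC * space * >word * semi

let r = parser.match("var hello;")
doAssert r.ok
doAssert r.captures == @["hello"]

# Without the space the match fails, because `space` needs at least one ' '.
doAssert not parser.match("varhello;").ok
```

Compiling this with `-d:npegTrace` should produce a trace similar to the one above, but one that continues past the space.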
Let me know if this solves your problem!
Thanks! Indeed this was the problem, and quite a beginner one: I had the same problem (forgetting whitespace) before, but forgot about it again because e.g. ANTLR and textX have special rules to 'forget' about whitespace characters and comments.
Maybe such a feature could be added? Unfortunately I'm too new to Nim to be a candidate to write it yet, but maybe later...
Thanks! The debugging feature indeed is very helpful.
Yeah, I did some experiments with ignoring white space in the past, but I was not really happy with some of the subtle effects it brings. Also, there is a run time cost to it, because for every possible match the parser also has to check for whitespace, which I didn't find worth the gains at that time.
I'll keep this in mind though for future improvements!
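Until something like that exists, a common PEG convention is to let every token rule consume the whitespace that follows it, so the higher-level rules stay free of whitespace handling. The sketch below is only an illustration of that convention and not an NPeg feature: the rule names `S` and `name` are made up for this example, and `Space` is NPeg's built-in whitespace character class.

```nim
import npeg

# Each token rule eats its own trailing whitespace via the helper rule S.
# The keyword requires at least one whitespace character after it, so
# "varhello;" is still rejected.
let parser = peg "program":
  S         <- *Space
  program   <- S * *statement * !1
  statement <- var_decl
  name      <- +Alpha
  word      <- >name * S
  varC      <- "var" * +Space
  semi      <- ';' * S
  var_decl  <- varC * word * semi

doAssert parser.match("var hello;").ok
doAssert parser.match("  var   hello ;\nvar world;").ok
doAssert not parser.match("varhello;").ok
```

The cost mentioned above is still there, since `S` is tried after every token, but it is paid only where the grammar asks for it.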
Hi,
I found npeg just now and tried it out with the following example.
```nim
import npeg, strutils, tables

type Dict = Table[string, int]

let parser = peg("program", d: Dict):
  program <- *statement
  statement <- var_decl
  word <- +Alpha
  varC <- "var"
  semi <- ';'
  var_decl <- varC * >word * semi

var words: Table[string, int]
discard parser.match("var hello;", words).ok
echo words
```
But it doesn't match - what am I doing wrong? Thanks for any hint!