Closed: kfsone closed this issue 2 years ago
I think we can't live without 'noise' with PEG. Other PEG libraries provide a similar 'white space' handling feature, but I don't think any of them provides a perfect solution. So I am not pursuing this, but a pull request is always welcome!
Could you make a very subtle change to how %whitespace is handled so that, instead of looking for it like a regular token, you only test it against unmatched tokens (outside <>)?
Problem: Most parser generators either (1) totally ignore whitespace, so you can't easily require it, or (2) clutter your definition with all the places whitespace is allowed.
(2) is really awful when you have a grammar that has levels of whitespace: space/tab between terminals, space/tab/comment/newline between "statements".
Given:
this grammar is complicated by 'LineComment', because we probably don't want to allow
(the grammar I wrote has a problem because Action's "Spacing*" and IfThen's "Newline" will conflict)
Peglib's own .peg grammar is hard to read because there is so much of it focused on whitespace allowances.
Making this change to %whitespace would also allow this grammar to work: you're not looking for %whitespace, you're just not raising an error when you do encounter it outside <>s.
(I should probably have used Space+ for efficiency)
It also matches "hello \t\t\t 1", but it would require at least one space/tab between 'hello' and '1'.
This makes writing a "newline" rule much simpler, e.g. for pseudo-go, where we have to allow for nestable one-line block comments as well as whitespace, since we don't have to worry about Spacing in the grammar. And in scenarios where we require an explicit space, we'd still be able to check for it.
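To make the proposed semantics a bit more concrete, here is a rough sketch in Python (not peglib's C++, and not how peglib is actually implemented). It assumes "noise" is just spaces and tabs, treats a rule as a flat list of regex elements, and ignores <> token boundaries and backtracking entirely. The names `NOISE`, `matches`, `REQUIRED_SPACE` and `OPTIONAL_SPACE` are made up for illustration. The idea is only that noise is never searched for up front; it is skipped once when an element fails to match at the current position.

```python
import re

# Assumption: "noise" is spaces and tabs only, standing in for %whitespace.
NOISE = re.compile(r"[ \t]+")


def matches(elements, text):
    """Match regex elements in order, skipping noise only when an element fails."""
    pos = 0
    for element in elements:
        pattern = re.compile(element)
        m = pattern.match(text, pos)
        if m is None:
            # The element did not match here: tolerate noise once, then retry.
            noise = NOISE.match(text, pos)
            if noise is None:
                return False
            m = pattern.match(text, noise.end())
            if m is None:
                return False
        pos = m.end()
    # Trailing noise is tolerated as well.
    trailing = NOISE.match(text, pos)
    if trailing is not None:
        pos = trailing.end()
    return pos == len(text)


# Hypothetical "rules": one requires a space/tab, one never mentions whitespace.
REQUIRED_SPACE = ["hello", r"[ \t]+", "1"]
OPTIONAL_SPACE = ["hello", "1"]

print(matches(REQUIRED_SPACE, "hello \t\t\t 1"))  # True
print(matches(REQUIRED_SPACE, "hello1"))          # False
print(matches(OPTIONAL_SPACE, "hello   1"))       # True
```

With the explicit space element, "hello1" is rejected because neither the element nor the noise pattern matches at that position. Without it, whitespace between the two tokens is tolerated but not required.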