dlang-community / Pegged

A Parsing Expression Grammar (PEG) module, using the D programming language.

Dgrammar space additions #127

Closed Shorttail closed 10 years ago

Shorttail commented 10 years ago

Having tinkered with my own grammar for D, I came to realize that "Spacing" is a magical construct that overrides whatever Pegged already uses for spacing. Copying the one in the example grammar made all my testcases fail. After adding a few extra symbols from the D language specification, it worked. Specifically:

Space <- "\u0020" / "\u0009" / "\u000b" / "\u000c" / eol

Those are space, horizontal tab, vertical tab and form feed (plus end of line), whereas in the example grammar Space was simply space. I realize now that it's likely because the writer uses spaces exclusively.

PhilippeSigaud commented 10 years ago

On Wed, Dec 25, 2013 at 2:55 PM, Casper Faergemand <notifications@github.com> wrote:

Having tinkered with my own grammar for D, I came to realize that "Spacing" is a magical construct that overrides whatever Pegged already uses for spacing.

Spacing is explained here:

https://github.com/PhilippeSigaud/Pegged/wiki/Extended-PEG-Syntax#space-arrow-and-user-defined-spacing

Copying the one in the example grammar made all my testcases fail. After adding a few extra symbols from the D language specification, it worked. Specifically:

Space <- "\u0020" / "\u0009" / "\u000b" / "\u000c" / eol

Those are space, horizontal tab, vertical tab and form feed (plus end of line), whereas in the example grammar Space was simply space. I realize now that it's likely because the writer uses spaces exclusively.

Hmm, I thought the inner spacing parser already munched tabs and such. I'll add them, thanks for the heads-up.

PhilippeSigaud commented 10 years ago

Looking at pegged/peg.d: it already contains space, tab, vertical tab and all end of line chars (\r, \n...). AFAICT, Pegged already does what you want, no?

Shorttail commented 10 years ago

This works:

Spacing <- (Space / Comment)*

Space <- "\u0020" / "\u0009" / "\u000b" / "\u000c" / eol

This fails all my testcases:

Spacing <- (space / Comment)*
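To illustrate why the narrower rule could fail on tab- or newline-separated input, here is a minimal Python sketch of `Spacing <- (Space / Comment)*`. It is a hypothetical model, not Pegged's implementation: the helper names are invented, `Comment` is simplified to `//` line comments, and it assumes the built-in `space` matches only a single U+0020 character.

```python
# Hypothetical model of "Spacing <- (Space / Comment)*"; not Pegged's code.
WIDE_SPACE = (" ", "\t", "\v", "\f", "\r\n", "\n", "\r")  # Space rule from the thread (eol expanded)
NARROW_SPACE = (" ",)                                      # assumption: built-in `space` is one space


def match_comment(text, pos):
    """Match a `// ...` line comment (newline excluded); return new position or None."""
    if text.startswith("//", pos):
        end = pos
        while end < len(text) and text[end] not in "\r\n":
            end += 1
        return end
    return None


def match_space(text, pos, alternatives):
    """Ordered choice over literal alternatives; return new position or None."""
    for literal in alternatives:
        if text.startswith(literal, pos):
            return pos + len(literal)
    return None


def skip_spacing(text, pos, alternatives):
    """(Space / Comment)*: repeat until neither alternative matches."""
    while True:
        nxt = match_space(text, pos, alternatives)
        if nxt is None:
            nxt = match_comment(text, pos)
        if nxt is None:
            return pos
        pos = nxt


source = "int\tx; // note\n\tint y;"
print(skip_spacing(source, 3, WIDE_SPACE))    # tab after "int" is skipped
print(skip_spacing(source, 3, NARROW_SPACE))  # tab is not skipped
```

With the wide set the skipper reaches the next token; with the single-space set it stops at the first tab or newline, so any rule that expects a token there would fail.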

PhilippeSigaud commented 10 years ago

On Thu, Dec 26, 2013 at 10:15 PM, Casper Faergemand <notifications@github.com> wrote:

Spacing <- (Space / Comment)*

Space <- "\u0020" / "\u0009" / "\u000b" / "\u000c" / eol

This works.

Spacing <- (space / Comment)*

This fails all my testcases.

And with the built-in spacing instead?

Shorttail commented 10 years ago

Spacing <- (spacing / Comment)*

23/66 passed. The failures are all the ones with comments.

Spacing <- (Comment / spacing)*

All passing.

Well, lesson learned. Your Dgrammar uses space before Comment, though I haven't tested if that is a problem. I'm guessing something inside spacing eats up bits and pieces of the comments.
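One possible explanation for the order dependence, not confirmed in this thread, is PEG ordered choice: if the built-in `spacing` is a zero-or-more rule, it always succeeds (possibly consuming nothing), so in `spacing / Comment` the `Comment` alternative is never tried. The Python sketch below models that behaviour. All names are hypothetical, `Comment` is simplified to `//` line comments, and the repetition is assumed to stop when an iteration consumes no input.

```python
# Hypothetical model of PEG ordered choice; not Pegged's implementation.
def spacing(text, pos):
    """Zero-or-more whitespace: always succeeds, may consume nothing (assumption)."""
    while pos < len(text) and text[pos] in " \t\v\f\r\n":
        pos += 1
    return pos


def comment(text, pos):
    """Match a `// ...` line comment (newline excluded); None on failure."""
    if not text.startswith("//", pos):
        return None
    while pos < len(text) and text[pos] not in "\r\n":
        pos += 1
    return pos


def choice(*parsers):
    """Ordered choice: the first alternative that succeeds wins."""
    def parse(text, pos):
        for parser in parsers:
            result = parser(text, pos)
            if result is not None:
                return result
        return None
    return parse


def zero_or_more(parser):
    """Repeat until failure or until an iteration makes no progress (assumption)."""
    def parse(text, pos):
        while True:
            result = parser(text, pos)
            if result is None or result == pos:
                return pos
            pos = result
    return parse


spacing_first = zero_or_more(choice(spacing, comment))  # (spacing / Comment)*
comment_first = zero_or_more(choice(comment, spacing))  # (Comment / spacing)*

text = "  // c\n  x"
print(spacing_first(text, 0))  # stops in front of the comment
print(comment_first(text, 0))  # skips whitespace and the comment
```

In this model the spacing-first variant stops right before `//`, because `spacing` succeeds with an empty match and hides `Comment`; putting `Comment` first lets both alternatives contribute.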