dlang-community / Pegged

A Parsing Expression Grammar (PEG) module, using the D programming language.

Keeping node result in wrong tree struct #203

Open Domain opened 8 years ago

Domain commented 8 years ago

Nodes kept with ^ end up in the wrong place in the parse tree.

Input:

while(true){continue;break;}

Grammar:

Zero:

Program < Statement* :eoi
Statement < BlockStmt / WhileStmt / ^ContinueStmt / ^BreakStmt
BlockStmt < :"{" Statement* :"}"
WhileStmt < :"while" :"(" Expr :")" Statement
ContinueStmt < :"continue" :";"
BreakStmt < :"break" :";"
Expr < "true"

Output:

Zero [0, 29]["true"]
 +-Zero.Program [0, 29]["true"]
    +-Zero.Statement [0, 29]["true"]
       +-Zero.WhileStmt [0, 29]["true"]
          +-Zero.Expr [6, 10]["true"]
          +-Zero.ContinueStmt [12, 21][]
          |  +-and!(discard, discard) [12, 21][]
          +-Zero.BreakStmt [21, 27][]
             +-and!(discard, discard) [21, 27][]

What I want is the result of this grammar, which drops ^ and keeps the "continue" and "break" literals:

Zero:

Program < Statement* :eoi
Statement < BlockStmt / WhileStmt / ContinueStmt / BreakStmt
BlockStmt < :"{" Statement* :"}"
WhileStmt < :"while" :"(" Expr :")" Statement
ContinueStmt < "continue" :";"
BreakStmt < "break" :";"
Expr < "true"

Output:

Zero [0, 29]["true", "continue", "break"]
 +-Zero.Program [0, 29]["true", "continue", "break"]
    +-Zero.Statement [0, 29]["true", "continue", "break"]
       +-Zero.WhileStmt [0, 29]["true", "continue", "break"]
          +-Zero.Expr [6, 10]["true"]
          +-Zero.Statement [11, 29]["continue", "break"]
             +-Zero.BlockStmt [11, 29]["continue", "break"]
                +-Zero.Statement [12, 21]["continue"]
                |  +-Zero.ContinueStmt [12, 21]["continue"]
                +-Zero.Statement [21, 27]["break"]
                   +-Zero.BreakStmt [21, 27]["break"]

Note that WhileStmt should have two children (Expr and Statement), not three as in the first tree. See also #200

veelo commented 8 years ago

To clarify: In the first example, BlockStmt, its Statement parent, and its Statement children are all dropped, which is not what you'd expect. The kept ContinueStmt and BreakStmt nodes then appear directly under WhileStmt, which is the wrong parent. From the first example I would expect:

Zero [0, 29]["true"]
 +-Zero.Program [0, 29]["true"]
    +-Zero.Statement [0, 29]["true"]
       +-Zero.WhileStmt [0, 29]["true"]
          +-Zero.Expr [6, 10]["true"]
          +-Zero.Statement [11, 29][]
             +-Zero.BlockStmt [11, 29][]
                +-Zero.Statement [12, 21][]
                |  +-Zero.ContinueStmt [12, 21][]
                |     +-and!(discard, discard) [12, 21][]
                +-Zero.Statement [21, 27][]
                   +-Zero.BreakStmt [21, 27][]
                      +-and!(discard, discard) [21, 27][]

On a side note: Forcing these nodes to be kept produces a lot of noise (apparently all child nodes resulting from built-in rules in a kept node are kept as well, which I think is undocumented):

import std.stdio;
import pegged.grammar;

mixin(grammar(`
    Zero:

    Program         < Statement* :eoi
    Statement       < ^BlockStmt / WhileStmt / ^ContinueStmt / ^BreakStmt
    BlockStmt       < :"{" Statement* :"}"
    WhileStmt       < :"while" :"(" Expr :")" ^Statement
    ContinueStmt    < :"continue" :";"
    BreakStmt       < :"break" :";"
    Expr            < "true"
`));

void main()
{
    writeln(Zero(`while(true){continue;break;}`));
/*
Zero [0, 28]["true"]
 +-Zero.Program [0, 28]["true"]
    +-Zero.Statement [0, 28]["true"]
       +-Zero.WhileStmt [0, 28]["true"]
          +-Zero.Expr [6, 10]["true"]
          +-Zero.Statement [11, 28][]
             +-or!(keep!(wrapAround!(spacing, Zero.BlockStmt, spacing)), wrapAround!(spacing, Zero.WhileStmt, spacing), keep!(wrapAround!(spacing, Zero.ContinueStmt, spacing)), keep!(wrapAround!(spacing, Zero.BreakStmt, spacing))) [11, 28][]
                +-keep!(wrapAround!(spacing, Zero.BlockStmt, spacing)) [11, 28][]
                   +-Zero.BlockStmt [11, 28][]
                      +-and!(discard, zeroOrMore, discard) [11, 28][]
                         +-zeroOrMore!(wrapAround!(spacing, Zero.Statement, spacing)) [12, 27][]
                            +-Zero.Statement [12, 21][]
                            |  +-or!(keep!(wrapAround!(spacing, Zero.BlockStmt, spacing)), wrapAround!(spacing, Zero.WhileStmt, spacing), keep!(wrapAround!(spacing, Zero.ContinueStmt, spacing)), keep!(wrapAround!(spacing, Zero.BreakStmt, spacing))) [12, 21][]
                            |     +-keep!(wrapAround!(spacing, Zero.ContinueStmt, spacing)) [12, 21][]
                            |        +-Zero.ContinueStmt [12, 21][]
                            |           +-and!(discard, discard) [12, 21][]
                            +-Zero.Statement [21, 27][]
                               +-or!(keep!(wrapAround!(spacing, Zero.BlockStmt, spacing)), wrapAround!(spacing, Zero.WhileStmt, spacing), keep!(wrapAround!(spacing, Zero.ContinueStmt, spacing)), keep!(wrapAround!(spacing, Zero.BreakStmt, spacing))) [21, 27][]
                                  +-keep!(wrapAround!(spacing, Zero.BreakStmt, spacing)) [21, 27][]
                                     +-Zero.BreakStmt [21, 27][]
                                        +-and!(discard, discard) [21, 27][]
*/
}

NB: The OP seems to have found a grammar that does what he wants (second example).
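
Regarding the noise in the side note above, here is a minimal sketch of a post-processing step that hoists the built-in combinator nodes (or!, keep!, and!, zeroOrMore!, ...) out of the tree after parsing. hoistBuiltins is a hypothetical helper, not part of Pegged. It only relies on the name and children fields of the parse tree, and it assumes that every node whose name does not start with "Zero." comes from a built-in rule. It does not address the misplaced-parent behaviour reported in this issue.

```d
import std.algorithm.searching : startsWith;
import std.stdio : writeln;
import pegged.grammar;

mixin(grammar(`
    Zero:

    Program         < Statement* :eoi
    Statement       < ^BlockStmt / WhileStmt / ^ContinueStmt / ^BreakStmt
    BlockStmt       < :"{" Statement* :"}"
    WhileStmt       < :"while" :"(" Expr :")" ^Statement
    ContinueStmt    < :"continue" :";"
    BreakStmt       < :"break" :";"
    Expr            < "true"
`));

/// Hypothetical helper (not part of Pegged). Returns a copy of `node` in which
/// every descendant whose name does not start with `prefix` is replaced by its
/// own (already cleaned) children. The root node itself is always kept.
T hoistBuiltins(T)(T node, string prefix)
{
    typeof(node.children) kept;
    foreach (child; node.children)
    {
        auto cleaned = hoistBuiltins(child, prefix);
        if (cleaned.name.startsWith(prefix))
            kept ~= cleaned;          // a grammar rule: keep the node
        else
            kept ~= cleaned.children; // a built-in: splice in its children
    }
    node.children = kept;             // `node` is a by-value copy of the argument
    return node;
}

void main()
{
    auto tree = Zero(`while(true){continue;break;}`);
    writeln(hoistBuiltins(tree, "Zero."));
}
```

Filtering on the name prefix keeps the helper independent of which operators the grammar uses, at the cost of hard-coding the grammar name. Childless built-in nodes such as and!(discard, discard) simply disappear, because they have no children to splice in.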