Closed Shorttail closed 6 years ago
Okay, I found out how to keep nodes, but it's a little bothersome if I have to add ^ to almost every grammar rule.
Decl < StorageClasses Decl
/ BasicType Declarators :";" / BasicType Declarator FunctionBody { derp }
It unfortunately applies derp to FunctionBody and not Decl. How do I apply it to Decl without touching all other possible Decl? What derp does it change the name to "function". Perhaps there's a better alternative? Adding an extra child to Decl with the name function? That too would require me to touch Decl and not just its children.
To apply derp
to the result of Decl
you can either use a special arrow:
<{ derp }
, or enclose your entire rule in parenthesis:
Decl <{derp}
StorageClasses Decl
/ BasicType Declarators :";"
/ BasicType Declarator FunctionBody
or
Decl < (StorageClasses Decl
/ BasicType Declarators :";"
/ BasicType Declarator FunctionBody) { derp }
Also, how do I kill off automatic trimming? For instance:
DeclDef < Declaration
Declaration < Decl
In this case, DeclDef is the name of the node, and both Declaration and Decl are skipped in the tree.
There should not be any trimming there. By default, all parse nodes from the called grammar are kept. Are you calling your grammar from another grammar?
With following rules:
Module < DeclDefs eoi
DeclDefs < DeclDef*
DeclDef < Declaration
Declaration < Decl
D [0, 25]["int", "main", "return", "0"] +-D.Module [0, 25]["int", "main", "return", "0"] +-D.DeclDefs [0, 25]["int", "main", "return", "0"] +-D.DeclDef [0, 25]["int", "main", "return", "0"] +-D.BasicType [0, 4]["int"] | +-D.BasicTypeX [0, 3]["int"] +-D.Declarator [4, 10]["main"] | +-D.Identifier [4, 8]["main"] +-D.FunctionBody [10, 25]["return", "0"] +-D.StatementList [13, 23]["return", "0"] +-D.Statement [13, 23]["return", "0"] snip
With following rules:
Module <^ DeclDefs eoi
DeclDefs <^ DeclDef*
DeclDef <^ Declaration
Declaration <^ Decl
D [0, 25]["int", "main", "return", "0"] +-D.Module [0, 25]["int", "main", "return", "0"] +-D.DeclDefs [0, 25]["int", "main", "return", "0"] +-D.DeclDef [0, 25]["int", "main", "return", "0"] +-D.Declaration [0, 25]["int", "main", "return", "0"] +-D.Decl [0, 25]["int", "main", "return", "0"] +-D.BasicType [0, 4]["int"] | +-D.BasicTypeX [0, 3]["int"] +-D.Declarator [4, 10]["main"] | +-D.Identifier [4, 8]["main"] +-D.BlockStatement [10, 25]["return", "0"] +-D.StatementList [13, 23]["return", "0"] +-D.Statement [13, 23]["return", "0"] snip
Clearly it has some effect to add the ^. The full D grammar has more rules than this, but I still see rules that contain only one other rule (no such in the full grammar) getting trimmed all the time.
I parse the grammar and dump it in a file. There are no other grammars present.
Also, the <{ derp } applies everything to decl, not to just one rule of decl. Are all decl functions? Either way I'd still like to mark nodes without having to do insane tree crawling post creation.
I'll look into your tree-trimming problem.
Also, the <{ derp } applies everything to decl, not to just one rule of decl. Are all decl functions? Either way I'd still like to mark nodes without having to do insane tree crawling post creation.
If you want to apply derp
only to one part of the alternative, then
enclose this part between parenthesis:
Rule <- Rule 1 / (Rule2 Rule3) {action} / Rule4
There action
will be applied on the result of (Rule2 Rule3)
, not on
Rule1
nor Rule4
.
I did some changed in the underlying engine for named rules. That seems to clean the issue for me. Can you pull the latest version and tell me if that's OK with you?
My output for arithmetic expression explodes now, which is great. Thanks.
Semantic actions I can't get to work though. Here's a snip of the output for parsing a function:
+-D.Declaration [0, 25]["int", "main", "return", "0"] +-D.Decl [0, 25]["int", "main", "return", "0"] +-D.BasicType [0, 4]["int"] | +-D.BasicTypeX [0, 3]["int"] +-D.Declarator [4, 10]["main"] | +-D.Identifier [4, 8]["main"] +-D.FunctionBody [10, 25]["return", "0"] +-D.BlockStatement [10, 25]["return", "0"] +-D.StatementList [13, 23]["return", "0"] +-D.Statement [13, 23]["return", "0"]
Now, snipping Decl:
Decl < BasicType Declarator FunctionBody {derp} < removes FunctionBody, its child appearing instead Decl < (BasicType Declarator FunctionBody) {derp} < removes Decl, its children taking its place
My semantic action looks like this:
T derp(T)(T t) { t.name = "lala"; return t; }
FunctionBody {derp} < removes BlockStatement, its child appearing instead Decl < (BasicType Declarator FunctionBody) {derp} < does nothing
I can reproduce it:
Test:
A <- (B C) { foo }
B <- 'b'
C <- 'c'
(B C) { foo }
is correctly parsed by the Pegged grammar, but there is no
action!(foo) in the generated code (whereas it works for B C { foo }
.
That's clearly a bug, thanks for finding it!
So as to track issues more clearly with commits and unit tests, could you open a new issue specifically for this, please?
OK, testing again on another computer, I get the correct input. Here is my test grammar:
Test:
A <- (D C B)
/ ((B C D) { foo })
B <- 'b'
C <- 'c'
D <- 'd'
So the DCB branch has no {foo}, whereas {foo} get invoked with BCD.
With the following foo:
T foo(T)(T t)
{
t.name = "Test.foo";
return t;
}
I get correct parse trees:
With "dcb":
Test [0, 3]["d", "c", "b"]
+-Test.A [0, 3]["d", "c", "b"]
+-Test.D [0, 1]["d"]
+-Test.C [1, 2]["c"]
+-Test.B [2, 3]["b"]
With "bcd":
Test [0, 3]["b", "c", "d"]
+-Test.A [0, 3]["b", "c", "d"]
+-Test.foo [0, 3]["b", "c", "d"]
+-Test.B [0, 1]["b"]
+-Test.C [1, 2]["c"]
+-Test.D [2, 3]["d"]
Where a Test.foo
node was correctly put in the final result.
Note that using t.name == "foo";
does not work. It does create a node with the name "foo", but since nodes external to the grammar are cut from the final output, unless explicitly prefixed by ^
, this node then disappears.
Can you test your own grammar with your derp
using t.name = "GrammarName.RuleName";
?
Hah, that works. D.lala is in the tree alright. I guess there's no bug then.
Hah, that works. D.lala is in the tree alright. I guess there's no bug then.
OK, cool.
Another way to get a specially named rule is simply to create such a rule. Here is what I mean: imagine you have:
Root <- Rule1 Rule2
/ Rule3 Rule4 Rule5 # Here I want to flag that
/ Rule5 Rule6 Rule7
Then a simple way to extract the more significant Rule3 Rule4 Rule5
is to
make it its own rule:
Root <- Rule1 Rule2
/ Flagged
/ Rule5 Rule6 Rule7
Flagged <- Rule3 Rule4 Rule5
I'm trying to type check my parse tree. It's based on a subset of the formal D grammar, and thus it can be very difficult to determine what something is by simply looking at a node's name. I'm trying out semantic actions, since they are applied per rule, and so I will have an easy time determining which rule was taken. Right now I'm trying to determine if a Decl is a function:
Decl < StorageClasses Decl / BasicType Declarators :";" / BasicType Declarator FunctionBody { derp }
It unfortunately applies derp to FunctionBody and not Decl. How do I apply it to Decl without touching all other possible Decl? What derp does it change the name to "function". Perhaps there's a better alternative? Adding an extra child to Decl with the name function? That too would require me to touch Decl and not just its children.
I can do all of this once the tree has been generated, but it seems like a more obvious solution to use semantic actions, since they are part of the grammar.
Also, how do I kill off automatic trimming? For instance:
DeclDef < Declaration
Declaration < Decl
In this case, DeclDef is the name of the node, and both Declaration and Decl are skipped in the tree.