lezer-parser / lezer

Dev utils and issues for the Lezer core packages

Incorrect node nesting #19

Closed: unconed closed this issue 2 years ago

unconed commented 2 years ago

While writing a grammar for WGSL, I found that Lezer produces incorrectly nested nodes for a valid parse. This happens with the latest published versions.

Given the following (reduced) grammar:

@top StructBodyDeclaration { '{' StructMember* '}' }
@skip { space }

StructMember { AttributeList MemberDeclaration ";" }
AttributeList { "@"* }
MemberDeclaration { Identifier ":" Identifier }

@tokens {
  space { std.whitespace+ }
  Identifier { $[a-zA-Z_] $[0-9a-zA-Z] $[0-9a-zA-Z_]* | $[a-zA-Z] $[0-9a-zA-Z_]* }
}

When I parse:

{ intensity: type; }
{ @ intensity: type; }

I get the following ASTs:

(StructBodyDeclaration (AttributeList) (StructMember (MemberDeclaration (Identifier) (Identifier))))
(StructBodyDeclaration (StructMember (AttributeList) (MemberDeclaration (Identifier) (Identifier))))

When AttributeList is empty, it appears outside the associated StructMember, as a preceding sibling. When it is non-empty, it is correctly nested inside StructMember. This seems like a bug?
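For anyone who wants to turn the two outputs above into a regression check, here is a minimal sketch in TypeScript. It only reads the parenthesized notation used in this report. `parseSExpr` and `misplaced` are hypothetical helpers written for this example and are not part of any Lezer package.

```typescript
// Minimal tree shape for a printed tree such as "(A (B) (C))".
interface SNode {
  name: string;
  children: SNode[];
}

// Parse the parenthesized notation used in this report.
// Hypothetical helper for illustration, not a Lezer API.
function parseSExpr(text: string): SNode {
  let pos = 0;
  const skipSpace = (): void => {
    while (pos < text.length && /\s/.test(text.charAt(pos))) pos++;
  };
  const parseNode = (): SNode => {
    skipSpace();
    if (text.charAt(pos) !== "(") throw new Error(`expected "(" at offset ${pos}`);
    pos++;
    const start = pos;
    while (pos < text.length && /[A-Za-z0-9_]/.test(text.charAt(pos))) pos++;
    const name = text.slice(start, pos);
    if (name === "") throw new Error(`expected a node name at offset ${start}`);
    const children: SNode[] = [];
    skipSpace();
    while (text.charAt(pos) === "(") {
      children.push(parseNode());
      skipSpace();
    }
    if (text.charAt(pos) !== ")") throw new Error(`expected ")" at offset ${pos}`);
    pos++;
    return { name, children };
  };
  const root = parseNode();
  skipSpace();
  if (pos !== text.length) throw new Error(`unexpected trailing input at offset ${pos}`);
  return root;
}

// List every node named `child` whose direct parent is not named `parent`.
function misplaced(root: SNode, child: string, parent: string): string[] {
  const found: string[] = [];
  const walk = (node: SNode): void => {
    for (const c of node.children) {
      if (c.name === child && node.name !== parent) {
        found.push(`${c.name} under ${node.name}`);
      }
      walk(c);
    }
  };
  walk(root);
  return found;
}

// The two trees quoted in this report.
const withoutAttr =
  "(StructBodyDeclaration (AttributeList) (StructMember (MemberDeclaration (Identifier) (Identifier))))";
const withAttr =
  "(StructBodyDeclaration (StructMember (AttributeList) (MemberDeclaration (Identifier) (Identifier))))";

console.log(misplaced(parseSExpr(withoutAttr), "AttributeList", "StructMember"));
console.log(misplaced(parseSExpr(withAttr), "AttributeList", "StructMember"));
```

The first call reports the stray AttributeList directly under StructBodyDeclaration; the second returns an empty list.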

marijnh commented 2 years ago

Thanks for finding that! Mutable state was being manipulated in the wrong order in the edge case of a reduction with no depth, causing an incorrect parse stack. Attached patch (released as @lezer/lr 0.15.8) should help.
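To make the "reduction with no depth" case concrete, here is a toy bottom-up tree builder in TypeScript. It is an assumption-laden illustration, not Lezer's actual code: `ToyBuilder`, its `buggy` flag, and the simplified shift/reduce sequence are all invented for this sketch. It shows one way that recording a stack entry's start position in the wrong order can leave an empty node outside its parent.

```typescript
interface TNode {
  name: string;
  children: TNode[];
}

// Toy model only. This is NOT how @lezer/lr is implemented.
class ToyBuilder {
  // Finished nodes not yet adopted by a parent.
  private pending: TNode[] = [];
  // For each parse-stack entry, the index in `pending` where its nodes begin.
  private bases: number[] = [];
  private buggy: boolean;

  constructor(buggy: boolean) {
    this.buggy = buggy;
  }

  // Shift a token: one stack entry; only named tokens produce a node.
  shift(name: string | null): void {
    this.bases.push(this.pending.length);
    if (name !== null) this.pending.push({ name, children: [] });
  }

  // Reduce the top `depth` stack entries into one node called `name`.
  reduce(name: string, depth: number): void {
    // With depth 0 nothing is popped, so the node starts at the end of `pending`.
    let base = this.pending.length;
    if (depth > 0) {
      const entry = this.bases[this.bases.length - depth];
      if (entry === undefined) throw new Error("reduce depth exceeds stack size");
      base = entry;
    }
    this.bases.length -= depth;
    const children = this.pending.splice(base);
    this.pending.push({ name, children });
    // Correct order: the new entry starts at `base` and so covers the new node.
    // Buggy order (depth 0 only): the start is read after the push,
    // so the entry begins past the empty node it should own.
    this.bases.push(this.buggy && depth === 0 ? this.pending.length : base);
  }

  finish(): TNode {
    const root = this.pending[0];
    if (root === undefined || this.pending.length !== 1) {
      throw new Error("expected exactly one root node");
    }
    return root;
  }
}

function print(node: TNode): string {
  return `(${node.name}${node.children.map((c) => " " + print(c)).join("")})`;
}

// Simplified action sequence for "{ intensity: type; }".
function run(buggy: boolean): string {
  const b = new ToyBuilder(buggy);
  b.shift(null);                      // "{"
  b.reduce("AttributeList", 0);       // empty "@"* is a zero-depth reduction
  b.shift("Identifier");
  b.shift(null);                      // ":"
  b.shift("Identifier");
  b.reduce("MemberDeclaration", 3);
  b.shift(null);                      // ";"
  b.reduce("StructMember", 3);
  b.shift(null);                      // "}"
  b.reduce("StructBodyDeclaration", 3);
  return print(b.finish());
}

console.log(run(true));
console.log(run(false));
```

With `buggy` set, the toy model prints the same shape as the first tree in the report, with AttributeList as a sibling of StructMember. With the correct ordering, AttributeList ends up inside StructMember.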