mna / pigeon

Command pigeon generates parsers in Go from a PEG grammar.
BSD 3-Clause "New" or "Revised" License
835 stars 66 forks source link

Build failure only when --optimize-grammar is set; grammar with not referenced rule #70

Open breml opened 6 years ago

breml commented 6 years ago

The following grammar results in build failure, if generated with --optimize-grammar. The same grammar builds with out problem, if the option is not set.

Minimal example:

X ← z:Z
Y ← Z
Z ← ('Z' { return nil, nil })?

Build error:

(*parser).callonZ3 undefined (type *parser has no method callonZ3)
mna commented 5 years ago

Hello Lucas,

I ran into the same issue and created this repo with what I had as minimal reproducible grammar (yours is smaller, though): https://github.com/mna/pigeon-bug-optimize-grammar-2. I tested it with your fix in the pending PR and it does work, is there any side-effect preventing the fix from being merged?

Thanks, Martin

breml commented 5 years ago

Hi @mna If I remember correctly, the reason why I did not yet merge this PR is, that it only solves part of the problem. I had cases where the optimize still failed. That being said, we can merge this one and at least improve the situation, even though there are still failing (edge) cases.

mna commented 5 years ago

Ah, yeah indeed I tried it on my full grammar and it does fail elsewhere with this fix. Might be worth holding up, as it might break grammars that are otherwise working today (for those that didn't run into the issue that it solves). I'll try to find some time to investigate further next week.

Thanks, Martin

breml commented 5 years ago

I found the old commit and my notes about the bigger problem I found when I wrote the PR #71.

The thing is, that the added search algorithm in ast_optimize.go is not able to find unnecessary rules, if these rules do form a cycle, but the entire rules cycle is no longer reachable from one of the entry points. Such cycles can be produced by the grammar optimizer. I found such a case in the go-thrift grammer when I tried to use the optimize option. I reduced the grammar to the following minimal example, extracted out of the optimized grammar from go-thrift:

{
package issue70_cycle
}

TypeDef ← annotations:TypeAnnotations?

FieldType ← SubType

SubType <- typ:FieldType annotations:TypeAnnotations? 

TypeAnnotations ← annotations:TypeAnnotation*

TypeAnnotation ← value:(value:'X' { return value, nil })?

So the bottom line is, that the search algorithm needs to be extended with something like this:

breml commented 5 years ago

Optimizer also needs to detect that rules can be dropped, if they are only referred to by itself (closed cycles), see comment.