Followup to #141, simplifying lazy reference resolution. The key insight here is that either:
Every lazy reference can be resolved to an Expression just from the original grammar definition, or
There is a circular reference chain of lazy refs in the original definition.
What was happening prior to #141, is that lazy references were only being resolved one level deep, which (depending on ordering) stuck more lazy references in child nodes, which then need to be fixed again. So e.g. in Erik's test case reproducing the issue, it contains the lines:
_ = meaninglessness*
meaninglessness = whitespace
Since the are resolved in order, the first rule resolves to _ = <LazyReference to whitespace>*, then the second rule resolves correctly, but it's already too late.
The fix in #141 was to basically check the entire grammar every time a new lazy reference was resolved, which works, but is inefficient and can be simplified. This PR just follows each lazy reference down the chain of references whenever we need to resolve it.
Note
The goal in invalidating grammars with circular reference chains isn't to invalidate all circular references (though that's maybe a good idea if it's possible), just to those whose lazy references cannot be resolved. So e.g. the following grammar is circular, but its lazy references can all be resolved down to concrete expressions (Sequences whose members reference each other):
foo = bar ""
bar = "" foo
However, the lazy references can never be resolved in a grammar like the following:
Followup to #141, simplifying lazy reference resolution. The key insight here is that either:
Expression
just from the original grammar definition, orWhat was happening prior to #141, is that lazy references were only being resolved one level deep, which (depending on ordering) stuck more lazy references in child nodes, which then need to be fixed again. So e.g. in Erik's test case reproducing the issue, it contains the lines:
Since the are resolved in order, the first rule resolves to
_ = <LazyReference to whitespace>*
, then the second rule resolves correctly, but it's already too late.The fix in #141 was to basically check the entire grammar every time a new lazy reference was resolved, which works, but is inefficient and can be simplified. This PR just follows each lazy reference down the chain of references whenever we need to resolve it.
Note
The goal in invalidating grammars with circular reference chains isn't to invalidate all circular references (though that's maybe a good idea if it's possible), just to those whose lazy references cannot be resolved. So e.g. the following grammar is circular, but its lazy references can all be resolved down to concrete expressions (Sequences whose members reference each other):
However, the lazy references can never be resolved in a grammar like the following:
There it's lazy refs all the way down.
cc @righthandabacus