Complex conflicts handling if we need to peek more than 1 token.

DiscreteTom commented 1 year ago

E.g. when we parsing javascript:

f(({a, b}) => a + b);

with the following grammar rules:

exp := '(' '{' identifier (',' identifier)* '}' ')' '=>' exp
exp := '(' exp ')'
exp := object
object := '{' (object_entry (',' object_entry)*)? '}'
object_entry := identifier (':' exp)?

when we digest f(({ a we don't know whether the a is an object entry or an arrow function param. We have to peek maybe many tokens to judge that.

Solution for this issue:

Re-parse, see #19 . Not recommended.
Optimize grammar rules to prevent this to happen. Introducing more intermediate NT. Bad user experience.
Allow grammar rule to do more than reduce AST nodes. E.g. override parser buffer.

DiscreteTom commented 1 year ago

Maybe we can write an algorithm to check if a conflict can be safely resolved by re-parse.

For the above conflict, it can be safely resolved by re-parse without early accept.

DiscreteTom commented 9 months ago

Another idea: subset construction (子集构造法) to turn the NFA into DFA?

DiscreteTom / retsac

Complex conflicts handling if we need to peek more than 1 token. #24