Chevrotain / chevrotain

Parser Building Toolkit for JavaScript
https://chevrotain.io
Apache License 2.0
2.44k stars 200 forks source link

LL(*): Fix ECMA5 prediction and backtracking tests #1740

Closed msujew closed 2 years ago

msujew commented 2 years ago

Does three things:

  1. Adds syntactic predicates to the ECMA5 parsers to correctly work with ambiguities and LL(*)
  2. Fixes backtracking tests by adding a stricter heuristic for ambiguities
  3. Changes the order of predicate evaluation. Instead of evaluating them at the end like Antlr does, we evaluate them at the start and create a distinct DFA for each configuration of predicates.
bd82 commented 2 years ago

please rebase with master-LL-star it should then run the CI flow on github actions. When I ran them locally I had 5 test failures (instead of the six(?) previously)

bd82 commented 2 years ago

Running the performance benchmark versus 9.1.0 on this change gives me much smaller gains than previously seen

image

Now I am running the benchmark on a different machine / chrome version, so maybe this affects something... can you please re-test the performance versus 9.1.0 on your machine as well?

msujew commented 2 years ago

@bd82 Seeing smaller performance gains as well:

image

For reference, the current master-LL-star branch:

image

The only explanation I can think of here, is that the previous ECMA5 benchmark from master-LL-star is faulty in a way that does not lead to a parser error and only "by accident" picks the correct alternative, while also taking some sort of shortcut to the prediction, making it so much faster than the baseline. Especially since the other languages (JSON/CSS) are quite similar to the other PR.

It was quite weird anyway that the ECMA5 parser was that much faster than the baseline, since most decisions can be parsed in 2 or 3 tokens, so there shouldn't be that much of a gain using LL(*). So I guess that makes sense in some way? On the other hand, I'm really curious what kind of shortcut my bugs find that makes ECMA5 20% faster in total 🤔

Regarding the 5 tests failures: Yes, they were previously 6, the purpose of this PR is to fix the ambiguity detection in place of a stricter one (the one also used by Antlr) which fixes the previously failing backtracking test noticed in #1735.

bd82 commented 2 years ago

Feel free to merge PRs to the master-ll-star branch as it will likely be a while before I have the time to dive into LL(*) topic