lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License
4.75k stars 401 forks source link

Earley now uses OrderedSet for better output stability #1327

Closed erezsh closed 11 months ago

erezsh commented 1 year ago

Addresses issue #1325, and others

erezsh commented 1 year ago

@MegaIng @chanicpanic This change makes Earley approx 10% slower. (on a very simple benchmark)

Do you think the stability it provides is worth it? (let's assume it gives us 100% output stability)

MegaIng commented 1 year ago

Maybe have an option that can change it, but default to the slower, stable version?

Also, I didn't check that this actually hits all locations, I assume you just searched for references to set.

chanicpanic commented 1 year ago

I like the idea of having an option to enable/disable it.

Defaulting to the stable version would be more user-friendly. The slowdown is unfortunate, but it is consistent with Lark positioning Earley as the parser for power and LALR as the parser for speed.

On the other hand, in addition to performance, it could be argued that defaulting to the unstable version has the benefit of exposing undesirable ambiguity and forcing grammar authors to remove/resolve it.

codecov-commenter commented 11 months ago

Codecov Report

Patch coverage: 92.98% and project coverage change: +0.02% :tada:

Comparison is base (c2e44ea) 89.23% compared to head (41c28d4) 89.26%.

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #1327 +/- ## ========================================== + Coverage 89.23% 89.26% +0.02% ========================================== Files 51 51 Lines 7591 7612 +21 ========================================== + Hits 6774 6795 +21 Misses 817 817 ``` | [Flag](https://app.codecov.io/gh/lark-parser/lark/pull/1327/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lark-parser) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/lark-parser/lark/pull/1327/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lark-parser) | `89.26% <92.98%> (+0.02%)` | :arrow_up: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lark-parser#carryforward-flags-in-the-pull-request-comment) to find out more. | [Files Changed](https://app.codecov.io/gh/lark-parser/lark/pull/1327?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lark-parser) | Coverage Δ | | |---|---|---| | [lark/load\_grammar.py](https://app.codecov.io/gh/lark-parser/lark/pull/1327?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lark-parser#diff-bGFyay9sb2FkX2dyYW1tYXIucHk=) | `92.60% <50.00%> (ø)` | | | [lark/parsers/earley\_forest.py](https://app.codecov.io/gh/lark-parser/lark/pull/1327?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lark-parser#diff-bGFyay9wYXJzZXJzL2VhcmxleV9mb3Jlc3QucHk=) | `77.94% <87.50%> (+0.19%)` | :arrow_up: | | [lark/utils.py](https://app.codecov.io/gh/lark-parser/lark/pull/1327?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lark-parser#diff-bGFyay91dGlscy5weQ==) | `86.66% <93.75%> (+0.51%)` | :arrow_up: | | [lark/parsers/earley.py](https://app.codecov.io/gh/lark-parser/lark/pull/1327?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lark-parser#diff-bGFyay9wYXJzZXJzL2VhcmxleS5weQ==) | `82.65% <94.73%> (+0.57%)` | :arrow_up: | | [lark/lark.py](https://app.codecov.io/gh/lark-parser/lark/pull/1327?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lark-parser#diff-bGFyay9sYXJrLnB5) | `85.66% <100.00%> (+0.04%)` | :arrow_up: | | [lark/parser\_frontends.py](https://app.codecov.io/gh/lark-parser/lark/pull/1327?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lark-parser#diff-bGFyay9wYXJzZXJfZnJvbnRlbmRzLnB5) | `95.90% <100.00%> (ø)` | | | [lark/parsers/xearley.py](https://app.codecov.io/gh/lark-parser/lark/pull/1327?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lark-parser#diff-bGFyay9wYXJzZXJzL3hlYXJsZXkucHk=) | `100.00% <100.00%> (ø)` | |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

erezsh commented 11 months ago

@MegaIng @chanicpanic take 2