dlang-community / Pegged

A Parsing Expression Grammar (PEG) module, using the D programming language.
534 stars 66 forks source link

Left-recursion improvements. #226

Closed veelo closed 7 years ago

veelo commented 7 years ago

This brings the parsing of a sample Extended Pascal file down from 13 minutes to 17 seconds. The mistake was not to assign the result of std.algorithm.mutation.remove, which caused an array to grow to a massive size and memoization to never be reenabled. (This is now on row 562.)

It also removes quite a bit of complexity and some requirements that I never really understood why were necessary. Now it appears they aren't.

This is worth a patch-level version bump, I think.

PhilippeSigaud commented 7 years ago

OK, done. I merged it and created a new patch release (v0.4.2). I suppose the speed improvements will also be visible for other grammars than EP?

On Sat, Apr 15, 2017 at 2:52 AM, Bastiaan Veelo notifications@github.com wrote:

This brings the parsing of a sample Extended Pascal file down from 13 minutes to 17 seconds. The mistake was not to assign the result of std.algorithm.mutation.remove, which caused an array to grow to a massive size and memoization to never be reenabled. (This is now on row 562.)

It also removes quite a bit of complexity and some requirements that I never really understood why were necessary. Now it appears they aren't.

This is worth a patch-level version bump, I think.

You can view, comment on, or merge this pull request online at:

https://github.com/PhilippeSigaud/Pegged/pull/226 Commit Summary

  • Fix blocking of memoization. The error was in the use of std.algorithm.mutation.remove, which confusingly does not mutate the range, but returns a new range — which was never used.
  • Replace “keys.canFind” with “in”.
  • Comments about stopper rules and blocked memoization.
  • Use the same memo-blocking array for all left-recursive rules.
  • Prevent duplicates.

File Changes

Patch Links:

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/PhilippeSigaud/Pegged/pull/226, or mute the thread https://github.com/notifications/unsubscribe-auth/AAjYpz1WwqovYUmbWKLxEp9aubqyXdGvks5rwBTCgaJpZM4M-NFj .

veelo commented 7 years ago

Thanks.

I suppose the speed improvements will also be visible for other grammars than EP?

If the grammar is left-recursive then yes.

Although much better, 17 seconds is still really slow. This is a file that is compiled by the Prospero compiler in just 99 milliseconds... I am thinking that top-down parsing is not efficient for this language, but I'll see if I can do a bit more profiling.