runtimeverification / k

K Framework Tools 7.0
BSD 3-Clause "New" or "Revised" License
447 stars 147 forks source link

Investigate associative lists slow parsing #2099

Open ehildenb opened 3 years ago

ehildenb commented 3 years ago

Let's try to understand why we need to put so many parens around associative lists. This is annoying, and not doing it is a huge slowdown.

dwightguth commented 2 years ago

I'm going to pick up this task. It's probably going to have to involve a complete rewrite of the parser kernel used for inner parsing. I won't merge anything that's not at least as efficient as the current approach however.

radumereuta commented 2 years ago

To avoid the complete rewrite of the parser kernel, we could just avoid the problem in the grammar generation phase. Instead of sending syntax Exps ::= Exps "," Exp rewrite it to syntax Exps ::= Exp "," Exps and have a post processing step that rearranges the elements in the list.

I postponed this task though because the type inferencer influence on parsing time was far greater than the improvement to parsing here. Try to see if it has an actual benefit in doing this change.

dwightguth commented 2 years ago

It's greater in most cases, but the worst case time is still quite significant for some rules. The goal here isn't to make the overall time better in most cases, it's to guard against pathological rules taking a really long time without making the implementation any slower. It also will allow us to not have to put parentheses around really long associative lists, which is a trap that has ensnared people writing proofs in K many times.

I am not convinced that the solution you mention will really work with arbitrary context free grammars. I would prefer a solution that is not a hack, which means properly handling priority and associativity inside the kernel itself. Besides, the current parser is a truly awful implementation of Earley and it's not necessarily a bad thing to rewrite it. I doubt it would even take terribly long.

radumereuta commented 1 year ago

Some updates after the parser kernel rewrite:

$ kompile --version
K version:    v5.4.52-0-gfa56d11bcc

syntax Exps ::= Exps Int // fast
              | Int
$ seq 0 1 10     | kast - -v --output none // Total = 0.465s
$ seq 0 1 100    | kast - -v --output none // Total = 0.511s
$ seq 0 1 1000   | kast - -v --output none // Total = 0.654s
$ seq 0 1 10000  | kast - -v --output none // Total = 3.645s
$ seq 0 1 100000 | kast - -v --output none // OutOfMemoryError

syntax Exps ::= Int Exps // slow
              | Int
$ seq 0 1 10    | kast - -v --output none // Total = 0.462s
$ seq 0 1 100   | kast - -v --output none // Total = 0.503s
$ seq 0 1 1000  | kast - -v --output none // Total = 1.193s
$ seq 0 1 10000 | kast - -v --output none // OutOfMemoryError
$ seq 0 1 2000  | kast - -v --output none // Total = 2.321s
$ seq 0 1 3000  | kast - -v --output none // Total = 3.917s
$ seq 0 1 4000  | kast - -v --output none // OutOfMemoryError