palle-k / Covfefe

A parser for nondeterministic context free languages
https://palle-k.github.io/Covfefe/
MIT License
62 stars 8 forks source link

Fix Earley parser stack overflow issue #12

Closed mikolasstuchlik closed 2 years ago

mikolasstuchlik commented 3 years ago

This pull request solves partially the https://github.com/palle-k/Covfefe/issues/11 issue. I can not close the issue outright, becaue CYK parser has still stack overflow issue present.

The new algorythm might be slower than the original recursive approach, but is not constrained by the length of the parsed sentence.

Note: tests now take few minutes to complete.

mikolasstuchlik commented 3 years ago

Turns out, the parser stack-overflows in another function.

Snímek obrazovky 2021-11-04 v 14 02 07

The test testErleyStackOverflow was modified to use 10000 long word and failed after 46 minutes 15 seconds of runtime.

mikolasstuchlik commented 3 years ago

I have been able to solve the explode recursion. I have also profiled the program and found out, that the major bottleneck was in linear iteration through a Set instance. I have changed the Set to Array, reducing the run time for 10000 items test from 46 minutes down to < 4 minutes.

I would like to request a review. I have decided not to add an inline documentation. Let me know what pars of the code you would like to see documented. All the machines have the same architecture.

mikolasstuchlik commented 3 years ago

Unfortunately, I have observed another issue related to recursion: all other recursive functions :/

When working on Linux, the syntaxTree.description.count was crashing due to map function stack overflow. I'm not going to solve the issue yet from two reasons:

Since I am now able to prototype in a reasonable time, I don't need to address the issue right now. If the PR is accepted, maybe I'll address it in the future.

mikolasstuchlik commented 2 years ago

@palle-k Hi, can we merge this? :)