patrickhuber / Pliant

MIT License
26 stars 4 forks source link

Req: Multicore/multithread parsing #82

Open ArsenShnurkov opened 6 years ago

ArsenShnurkov commented 6 years ago

2004, Greg Sandstrom, A Parallel Extension of Earley’s Parsing Algorithm 2009, MIT, Parallelizing the CKY and Earley Parsing Algorithms "many general-purpose 16 processor machines exist and the number of processors is rapidly increasing" "the limitations the Earley algorithm places on parallelism allow parallel CKY to surpass it given enough processors" 2010, Aaron Dunlop, Nathan Bodenstab and Brian Roark, Reducing the grammar constant: an analysis of CYK parsing efficiency 2011, Mark Johnson, Parsing in Parallel on Multiple Cores and GPUs

patrickhuber commented 6 years ago

The algorithm seems pretty straight forward. There will need to be some thread safety injected in the chart object.

patrickhuber commented 3 years ago

A new paper was published on this topic LATE Ain’T Earley: A Faster Parallel Earley Parser

It appears to be an adaptation of CYK parallel parsing, but specifically for Earley.

This looks a little easier to implement and addresses my initial issues with parallelization. Namely, I had issues where states were out of sync. Normally predictions and completions need to be processed in a specific order, this paper sets up a dependency chain that allows that order to be preserved while parallelizing independent sequences of states.

It does bring up some duplication, but there are already uniqueness checks in place (mainly for performance).