kjosib / booze-tools

Booze Tools will become the complete programming-language development workbench, all written in Python 3.9 (for now).
MIT License
14 stars 1 forks source link

Improve Parse Table Compaction #44

Open kjosib opened 1 year ago

kjosib commented 1 year ago

See here for deets.

First, it's probably a good idea to split the system into table creation and table compaction..

Reduce-only states need not exist in the ACTION table. Rather, they can be implied by the space after the table. After a shift, check if the state-number >= the height of the table. If so, subtract that height to obtain a production-rule number and reduce immediately. This effectively creates a new kind of table-entry: shift-and-reduce-by-rule-P -- which is just fine, because it means the interactive_step method disappears from the parse-table interface. (If nondeterminism is in play, there may be an interaction to work out.) These new shift-and-reduce instructions can also appear in the GOTO table, so the bounds-check needs to be in a loop.

Do not apply the default-reduction technique (yet). Instead:

Some states reachable by non-terminals contain "don't-care" entries instead of "error" entries. Specifically, if the state can be reached by following a "shift-and-reduce" step, then it needs its error entries. Otherwise, it was reached with look-ahead and so contains don't-care entries. This criterion is tricky but may be calculated thus to improve later techniques.

Compute an error plane (SIGMAP) using graph-coloring along both rows and columns. You need this anyway for the immediate error-detection property which LR(1) promises, but this also means that you no longer need the check vector for the row-displacement scheme. (It's implied in the error plane.) You can probably shrink the error-plane check-vector by numbering the equivalence classes to put most of the non-error items in one corner, and then golfing a rectangular (triangular?) zone where the sense of the check inverts.

Now you can treat the ACTION table as mostly don't-care entries. Row-wise GCS will probably condense it significantly. This may be even more true if you CRS the reductions first. The result may still be sparse for most rows, but RDS with first-fit-decreasing will probably do just fine on the residual. We can then consider the row-displacement as equivalent to an equivalence-class, and recall there's no need for a check-vector since we already checked SIGMAP.

We may see better RDS behavior if we also permute the columns or apply columnar GCS. Or, it may be worth partitioning the rows into raw rows and columnar-GCS rows. Columnar GCS will work much better if CRS is applied to the affected portion of the matrix. In that case, it would also take the place of LES.

We can save a word per state if the GOTO table participates in the same row-wise GCS as the ACTION table. This is unlikely to make trouble, especially if columnar GCS gets involved.