kjosib / booze-tools

Booze Tools will become the complete programming-language development workbench, all written in Python 3.9 (for now).
MIT License
14 stars 1 forks source link

Improve GOTO Table Compaction #4

Closed kjosib closed 5 years ago

kjosib commented 5 years ago

The present (row-equivalence-class) approach to compacting the GOTO table is functional but crude. Significant additional savings may be achieved in typical cases by a "lining-out" approach: looking alternately for columns, then rows, which have but a single remaining significant value; recording this value, and eventually being left with a far smaller and much less sparse "residue" matrix which inherently provides for both row- and column-equivalence classification.

The rows and columns would each be assigned a number; the smaller decides: if it's less than the size of the "lining-out" array, then the value at the corresponding position is the answer. Otherwise, subtract that size from both numbers to obtain coordinates for the residue matrix.

kjosib commented 5 years ago

On May 4th, 2019, the improved GOTO table codec was merged into the master branch, along with a command line flag for visualizing the result.

kjosib commented 5 years ago

After some more time playing with the resulting code: it's a lot tighter than the old encoding in almost every case, but when the grammar is complex enough to have a "residue matrix", it could still be stored a better way. Suppose you sort the columns by population. Then zeros (which are not significant) will tend to congregate at the right edge of the resulting matrix. So now you can find a way to pack the rows into a single vector and, instead of storing row-numbers, you store the offset of the "row" into this vector. This could save another 25% or more for the residue and also shave a few theoretical CPU cycles to probe the table, since you don't have to perform any indirection or multiplication on the row number.

kjosib commented 5 years ago

Code to implement the above has been checked in, but before closing the issue I want to add some stress tests because this module is getting a bit hairy.

kjosib commented 5 years ago

The Pascal example turns out to exercise this code nicely. It works great, or so it would seem.