PAntoine / buildgraph

Silly parser lookup table generator
http://www.peterantoine.me.uk/buildgraph.html
2 stars 0 forks source link

Build lookup graph

This program will generate a look up graph for a set of words.

It will work out the search tree for the search of the input array the list of functions.

The algorithm for generating the tree is not optimised. The prune and tree generation loops could be joined together.

It also generates some code for searching the array.

PS: I know its a tree not a graph, but I thought that a graph was the answer, then realised it was a world of pain for little memory saving so we now have a tree.

Usage:

buildgraph [] [-q] [-i] [-u] [-p ] [-h ] -q Quiet. Only report errors. -h The directory to put the header file in (if different from the source). -i The look-up will ignore the case of the word input. -u The uncompressed table will be exported. -p The \"TST_\" for the enum name will be replaced with .

KNOWN ERRORS:

I have seen it fail to build a valid table with less than 4 items and not handle look-up correctly with less than 8. I assume this is due to the masks used in the new compression step. If you are using this for less that 8 items, you are touched as it is a big overhead for that few items.

Other Things to Do:

There is a further compression. After all the pruning has been done some of the leaf symbols are not used to the symbol table can be made smaller by checking if they are used within the symbol table lookup. If they are not used then remove them from the table and make the whole table smaller.

Ahhh.....

30 years of computing and I did not know that uppercase ASCII and lower case are different by one bit in the high nibble. I can now make this a lot quicker and the table smaller for case independent lookups by using this simple fact. So to optimise the whole thing remove the dup symbols and use the bit twiddle during the search. I assume this is what tolower() does, but it means only one pass over the string saving a loop each search.

(PS: hopefully I did know this once and have forgotten this fact.)