This program will generate a look up graph for a set of words.
It will work out the search tree for the search of the input array the list of functions.
The algorithm for generating the tree is not optimised. The prune and tree generation loops could be joined together.
It also generates some code for searching the array.
PS: I know its a tree not a graph, but I thought that a graph was the answer, then realised it was a world of pain for little memory saving so we now have a tree.
buildgraph
I have seen it fail to build a valid table with less than 4 items and not handle look-up correctly with less than 8. I assume this is due to the masks used in the new compression step. If you are using this for less that 8 items, you are touched as it is a big overhead for that few items.
There is a further compression. After all the pruning has been done some of the leaf symbols are not used to the symbol table can be made smaller by checking if they are used within the symbol table lookup. If they are not used then remove them from the table and make the whole table smaller.
30 years of computing and I did not know that uppercase ASCII and lower case are different by one bit in the high nibble. I can now make this a lot quicker and the table smaller for case independent lookups by using this simple fact. So to optimise the whole thing remove the dup symbols and use the bit twiddle during the search. I assume this is what tolower() does, but it means only one pass over the string saving a loop each search.
(PS: hopefully I did know this once and have forgotten this fact.)