opencog / link-grammar

The CMU Link Grammar natural language parser
GNU Lesser General Public License v2.1

Reduce the size of the string set #1488

Closed: linas closed this 7 months ago

linas commented 7 months ago

After recent fixes, it seems that the size of the string set can be reduced.

linas commented 7 months ago

@ampli if you think this is wrong, let me know. It seemed like the right thing to do.

ampli commented 7 months ago

The recent fixes were in tracon-set. When I last revised the string-set code, I benchmarked it on the English corpus batches and found that 3/8 performs better than even 1/2. Since the string set doesn't take much memory, there is no need to save memory there. Also, if I succeed in my plan to significantly speed up parsing, every CPU saving in the other parts of the library will have a bigger impact. (Regarding string-id: it has only very few items in all its current usages, so I assume its initial table size will not increase anyway. But even if it someday has usages with many items, there is no need to save memory there.)
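To illustrate the memory-versus-CPU trade-off discussed above, here is a minimal sketch in C of a string-interning hash table that grows once it is more than 3/8 full. It assumes that "3/8" refers to the maximum fill ratio of the table, which the thread does not state explicitly. All names (`StrSet`, `strset_add`, `MAX_FILL`, and so on) are hypothetical and are not taken from link-grammar's `string-set.c`; the hash function (djb2) and linear probing are likewise stand-ins chosen for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch; NOT the actual link-grammar string-set code. */
typedef struct {
    char **slots;   /* NULL marks an empty slot */
    size_t size;    /* number of slots, always a power of two */
    size_t count;   /* number of stored strings */
} StrSet;

/* Assumed meaning of "3/8": grow once more than 3/8 of the slots are used. */
#define MAX_FILL(size) ((size) * 3 / 8)

size_t hash_str(const char *s)
{
    size_t h = 5381;                      /* djb2, a stand-in hash */
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* initial_size must be a power of two. */
StrSet *strset_create(size_t initial_size)
{
    StrSet *ss = malloc(sizeof *ss);
    assert(ss != NULL);
    ss->size = initial_size;
    ss->count = 0;
    ss->slots = calloc(ss->size, sizeof *ss->slots);
    assert(ss->slots != NULL);
    return ss;
}

/* Linear probing: index of the slot holding s, or of the empty slot
 * where s would be stored. Terminates because the table is never full. */
size_t find_slot(const StrSet *ss, const char *s)
{
    size_t i = hash_str(s) & (ss->size - 1);
    while (ss->slots[i] != NULL && strcmp(ss->slots[i], s) != 0)
        i = (i + 1) & (ss->size - 1);
    return i;
}

/* Double the table and re-insert the existing string pointers. */
void grow(StrSet *ss)
{
    char **old = ss->slots;
    size_t old_size = ss->size;
    ss->size *= 2;
    ss->slots = calloc(ss->size, sizeof *ss->slots);
    assert(ss->slots != NULL);
    for (size_t i = 0; i < old_size; i++)
        if (old[i] != NULL) ss->slots[find_slot(ss, old[i])] = old[i];
    free(old);
}

/* Intern s: return the single stored copy, adding it if needed. */
const char *strset_add(StrSet *ss, const char *s)
{
    size_t i = find_slot(ss, s);
    if (ss->slots[i] != NULL) return ss->slots[i];   /* already present */
    if (ss->count + 1 > MAX_FILL(ss->size)) {
        grow(ss);
        i = find_slot(ss, s);                        /* slot moved after growth */
    }
    char *copy = malloc(strlen(s) + 1);
    assert(copy != NULL);
    strcpy(copy, s);
    ss->slots[i] = copy;
    ss->count++;
    return copy;
}

void strset_delete(StrSet *ss)
{
    for (size_t i = 0; i < ss->size; i++) free(ss->slots[i]);
    free(ss->slots);
    free(ss);
}
```

With a threshold of 3/8 instead of 1/2, the table is resized earlier and so uses more memory for the same number of strings, but probe sequences stay shorter on average. That is the kind of CPU saving the benchmark remark above refers to, under the assumption stated in the lead-in.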