utk-se / WorldSyntaxTree

Language-agnostic parsing of World of Code repositories

Hash-based keys for node text #14

Closed: robobenklein closed this 3 years ago

robobenklein commented 3 years ago

O(log N) lookup time should be just fine for the WSTText indexes.

Using the SHA-512 hash of the text content as the key takes care of the text uniqueness constraint, and it is a much clearer way to tell entries apart than relying on a text index.

Since a 512-bit digest in hex (128 characters) fits within the max key spec, I added the text length to the key as well, for the infinitesimally small chance we hit a hash collision (pigeonhole principle lol).
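
For anyone reading along, here is a rough sketch of the key scheme described above. It is not the code from this PR: the function name `text_key`, the `<length>-<digest>` layout, the `-` separator, and measuring length in UTF-8 bytes are all assumptions made for illustration.

```python
import hashlib


def text_key(text: str) -> str:
    """Build a key from the text length plus its SHA-512 hex digest.

    Hypothetical layout: '<byte length>-<128 hex chars>'.
    """
    data = text.encode("utf-8")
    # SHA-512 yields 64 bytes, i.e. 128 hexadecimal characters.
    digest = hashlib.sha512(data).hexdigest()
    # Prefixing the length means two texts can only share a key if they
    # have the same length AND the same digest.
    return f"{len(data)}-{digest}"


if __name__ == "__main__":
    key = text_key("def foo(): pass")
    print(len(key), key[:24])
```

The same text always maps to the same key, so inserting a duplicate would hit the existing key instead of needing a uniqueness check on the full text.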

robobenklein commented 3 years ago

nobody likes to review my PRs so oh well hope this is good