Open PallHaraldsson opened 2 years ago
It depends on what operation we need for a string dictionaries. As their paper mentioned, the goal of the xcdat implementation is to support both string-to-unique-index and index-to-string lookup with acceptable memory-speed trade-off, and you also get those operation supported by a Trie (like the common prefix). FWIW it is already faster than the normal DataStructures.Trie with small dataset.
As for string dictionaries in Julia Base/Compiler, I'm not sure what we would need there. If the dataset is really small, DAT would probably allocate more than the Base.Dict
and a single hash could still be faster. Another issue I can think of is this implementation does not support insertion/deletion, so you need the full dataset to construct a DAT and the use case might be limited.
This seems very intriguing, but would it also be helpful for small[er] string dictionaries (e.g. for Julia Base and the Julia compiler)?