Open gh-andre opened 4 years ago
It should be possible to adapt the longest_prefix
method into a new one that return all the encountered prefixes instead of just the longest one.
It may take some time before I start working on it as I have been lacking the time recently. Any PR is welcome though.
@Tessil Thanks for responding. I am test-driving some other trie libraries and, depending on how testing goes, may revisit hat-trie later on. I quite like how the interface is structured and how it is performance-focused.
I am test-driving some other trie libraries
@gh-andre I also need the feature you propose. If you don't mind me asking, which one did you end up using which supports this feature?
@ecorm I ended up using Akamai Radix Tree. It allows searches, tree traversal and longest prefix searches. I did need to write a few wrappers around some of the components, but in general it worked out very well for me.
@gh-andre , @Tessil , submitted PR #52 to add this functionality.
@gh-andre BTW, I checked out Akamai Radix Tree and didn't like that I had to provide things like the desired max depth and the maximum edge length. Choosing a very large max depth is not feasible, because there's a reserve()
in there that would allocate a huge buffer needlessly for small sets of inputs.
@ecorm Thank you very much for the PR! I haven't had much time to implement this. I'll try to review the PR this weekend.
@ecorm Akamai Radix Tree may be a bit tricky indeed to work with. The maximum depth isn't a problem for many areas, such as maintaining IP addresses or region codes. It would present some challenge, as you describe, for arbitrary dictionaries.
It's been too long for me being away from this code to provide meaningful feedback for the changes, but it's good to know that this new functionality will be available in this library. Thank you for the heads-up and your work.
It would be quite useful if it was possible not just get an iterator to the longest prefix, but also a path leading to that prefix from the shortest prefix. Such a method would return a vector of iterators pointing to values that matched the first characters of the longest prefix argument. In other words, if a map contained these values:
, and the longest prefix would be requested as
/A/B/D/E/F
, then in a loop similar to the ones inequal_prefix_range_impl
andfilter_prefix
it would test if keys of nodes that have values would match the beginning of the prefix string, so in the example above, it would return a vector of iterators pointing to/A
and/A/B/D
.It would allow many useful operations for parent paths, which now can only be done via multiple calls with prefixes of different lengths.
A special "parent iterator" hiding the parent path vector and overloading
--
would probably fit nicely in the interface.Sorry if there is a way to achieve what I described and I missed it.