Open Albibek opened 8 years ago
This is a good idea. We just need to think of a way to do it simply and efficiently. One option would be to encode each portion of the key as bytes and store something like a Vec<Vec<u8>>
on each "edge" of the trie (the TrieNode.key
field at the moment). We would have to deal with keys getting cut in half under this scheme, as its quite natural to want a trie keyed by something like Vec<String>
. For example, for URL routing, you could encode a path like "/download/file1.txt"
as vec!["download", "file1.txt"]
, which would split a key if "/download/file2.txt"
were already in the trie.
Another way I've thought about implementing this is to define a separator byte that can be used to delineate the different elements of the vector. For filesystem and URL paths, the byte encoding of /
could be used, as it's guaranteed not to appear in paths.
Well, my thought is that is more of a user's concern to decide the split point of his/her data. Wouldn't it be more natural and useful to ask a library user to provide data already split as Vec or, if possible as an Iterable object(it is possible btw?).
In the example of Vec
Maybe something like the sequence trie API? https://docs.rs/sequence_trie/0.2.1/sequence_trie/
That depends, if the radix trie itself is compatible with iterable structures. The other question is effectiveness of iteration-only structure and it's compatibility with different optimizations done by llvm, like vectorizations etc.
The idea: The most natural thing for the prefix implementation is
Vec<T>
. The Trie is currently implemented through serialization of the key value intoVec<u8>
. Could be great ifVec<T>
was usable as 'plain' key, without making it intoVec<u8>
, probably with some limitations to T. Such limitations could probably be some kind ofTrieKeyItem
trait, meaning that a type can be a single item of a key makingTrieKey
intoTrieKey<Item=Something>
. This brings alot of convenience and univerality to the datastructure. Strings, for example could be stored as bytes or unicode symbols depending of user needs. TheVec<u8>
in this case could be made into some external thing and only be used for types that could not be easily converted into corresponding vector implementation.It seems to me that first step could be to "just" convert u8 to some internal generic type, probably not seen to user, but being replaced to Item type when the Vec(Iterable?, Indexable?) has been provided as a key.
The fallback: In the case the idea below is impossible(in the context of this library of course), we could probably try implementing
TrieKey
forVec<T: TrieKey>
at least. That will probably require some unneeded computation when converting from/into such vectors but will still be very useful.