Closed andrewcsmith closed 8 years ago
Hi Andrew!
I don't think using a traversal trait is right for your use case. Traversals are a really just a low-level primitive I created to avoid code duplication between get/insert/remove. I think you really want something more like this?
trie.get_node(&key).unwrap()
.child_iter()
.fold(0, |acc, child| acc + child.value().unwrap_or(0))
The caveat here is that the child_iter
method is not currently public. I'm concerned over the unsoundness of returning subtries with half their key missing (see #7), so the robust solution as I see it is:
child_iter
method that returns an iterator over the child subtries.Now, depending on how adventurous crazy you feel, you could have a go at implementing this yourself, or you could use the hacky-child-iter branch, where I just made the child_iter
method public.
ONE further insane thing is that due to the use of 4-bit branching at each level, the definition of "child" in this radix trie may not be what you're after. If you take a look at the child_iter.rs example, the computed result is oddly 0, because a bit prefix of 96 (or 0x60) is given its own node above "a" (0x61)
, "b" (0x62)
and "c" (0x63)
. I am unsure how to "fix" this, given that the data structure is inherently binary-focussed (for better and worse).
One thing we could easily do is define a fold
function over the entire trie, but this can already be accomplished using .values().fold()
, and I'm guessing isn't what you want.
In a trie with keys "a", "b", "c"
, the first child is:
Trie {
key: NibbleVec [6],
key_value: None,
length: 3,
child_count: 3,
children: [
None,
Some(
Trie {
key: NibbleVec [1],
key_value: Some(
KeyValue {
key: "a",
value: 5
}
),
length: 1,
child_count: 0,
children: [
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None
]
}
),
Some(
Trie {
key: NibbleVec [2],
key_value: Some(
KeyValue {
key: "b",
value: 6
}
),
length: 1,
child_count: 0,
children: [
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None
]
}
),
Some(
Trie {
key: NibbleVec [3],
key_value: Some(
KeyValue {
key: "c",
value: 50
}
),
length: 1,
child_count: 0,
children: [
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None
]
}
),
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None,
None
]
}
Ah, I see what issue #7 is about now. What about instead of calling it SubTrie
we just call it TrieRoot
, and therefore a "normal" Trie
will just have a prefix of NibbleVec []
?
Another question–in the 4-bit branching, will all the child nodes of a similar length (that is, all the length-3 children of "ab": "abc", "abd", "abe", etc.) be on the same hierarchical level? I need to non-recursively visit all nodes containing kv pairs, prefixed by some search string "ab"
. I think I've actually made some progress on the Traversal
front. Check this out:
fn child_match_fn(trie: &'a Trie<K, u32>, input: Self::Input, nv: NibbleVec) -> Self::Output {
println!("children {:?}, {:?}, {:?}", trie.key(), trie.value(), nv);
// Children code goes here
for key in trie.keys() {
// println!("{:?}", key);
match input.get_node(&key) {
Some(k) => {
Self::run(&input, &k, NibbleVec::from_byte_vec(key.encode()));
}
None => { }
}
}
trie.value().unwrap_or(&0u32).clone()
}
The problem is that this visits some nodes twice, and I need to not recurse. But perhaps it's getting closer, without making anything extra public?
Greetings from Santa Cruz!
Ah, I see what issue #7 is about now. What about instead of calling it SubTrie we just call it TrieRoot, and therefore a "normal" Trie will just have a prefix of NibbleVec [] ?
Good idea!
I think I've got something of a solution for the 4-bit branching problem. We just need a function (say get_raw_ancestor
) that returns the closest ancestor of a given key, regardless of whether or not that ancestor node has a value. I've implemented this on a branch, but I had to change the traversal logic, and it broke most of the unit tests, so I'll have to have a closer look a bit later.
See: https://github.com/michaelsproul/rust_radix_trie/blob/traversal-hacks/src/lib.rs#L385
Don't worry too much about making extra things public - the library hasn't really been used by anyone extensively, so the API is far from complete, and some growing pains are to be expected.
It's nice to hear from someone in Santa Cruz! I really enjoyed living there on exchange during 2014/2015 :smile:
Perhaps I've just been staring at this too long—but I can't figure out where the recursion happens in the RefTraversal::run
method. I only see one invocation of Self::run
, but it seems that recursion is happening whenever the run
method returns a None
value. Where is that called?
Yeah, it's not recursive in the sense that it traverses the whole tree, it just uses recursion to reach the spot where a single key should be located, if that makes sense...?
Ok, I've updated the master
branch with a working get_raw_ancestor
function and a public child_iter
iterator. Let me know if these functions allow you to do what you want to do. There's an example in examples/child_iter.rs
that you might find useful.
It's by no means a perfect fix, so I won't do a release yet, but you can depend upon this version by adding a Git dependency to your Cargo manifest:
[dependencies]
radix_trie = { git = "https://github.com/michaelsproul/rust_radix_trie.git" }
(for more info on Cargo stuff like this: http://doc.crates.io/manifest.html#the-dependencies-section)
Excellent. I'll take a look in a sec. In the meantime, I just finished up a fix of my own that allows for returning a node with the root node intact. Definitely a WIP, but it required minimal changes to the existing code and didn't break any tests.
Ooh! Looks great! I'll dig into it soon (I've got a lot of uni work atm).
Done
I'm looking at your traversal traits, and attempting to write a struct that visits all the immediate children of a node and adds their values. However, given that the
children
field is private, I can't figure out how to make my struct traverse the child nodes. Is there something in the current interface that allows this, or would I have to change the actual crate code?My
RefTraversal
code right now (Trie essentially stored as<&str, u32>
):