michaelsproul / rust_radix_trie

Fast generic radix trie implemented in Rust
https://docs.rs/radix_trie/
MIT License
184 stars 32 forks source link

remove() panics on non-existing element #40

Closed thomaskrause closed 5 years ago

thomaskrause commented 6 years ago

I get a panic when I try to run the following minimal test case with radix_trie 0.1.3:

extern crate radix_trie;
use radix_trie::Trie;

fn main() {
    let mut trie : Trie<String,bool> = Trie::new();
    trie.insert("token_rst".to_owned(), true);

    println!("{:?}", trie.remove("syntax")); // this line crashes
}
thread 'main' panicked at 'attempted access beyond vector end. len is 12, index is 18', /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/nibble_vec-0.0.4/src/lib.rs:74:13
[...]
  7: nibble_vec::NibbleVec::get
             at /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/nibble_vec-0.0.4/src/lib.rs:74
   8: radix_trie::traversal::rec_remove
             at /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/radix_trie-0.1.3/src/traversal.rs:197
   9: radix_trie::traversal::recursive_remove
             at /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/radix_trie-0.1.3/src/traversal.rs:169
  10: radix_trie::traversal::<impl radix_trie::trie_node::TrieNode<K, V>>::remove
             at /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/radix_trie-0.1.3/src/traversal.rs:27
  11: radix_trie::trie::<impl radix_trie::Trie<K, V>>::remove
             at /homtioe/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/radix_trie-0.1.3/src/trie.rs:54
  12: trie_bug::main
             at src/main.rs:8

This is specifically triggered by the combination of the values token_rst and syntax. As far as I understand the documentation and by trying other values , a simple None should be returned as result of removing the non-existant value.

I tried to debug this but can't figure out yet why this combination of insert/remove crashes.

thomaskrause commented 6 years ago

Further debugging leads to a more simplified test case using the values tand s, which panics at check_keys(...) instead. https://github.com/michaelsproul/rust_radix_trie/blob/c34a167f424db35583612b61051d8441e35f5f2a/src/keys.rs#L71

extern crate radix_trie;
use radix_trie::Trie;

fn main() {
    let mut trie : Trie<String,bool> = Trie::new();
    trie.insert("t".to_owned(), true);

    println!("{:?}", trie.remove("s")); // this line crashes
}
thread 'main' panicked at 'multiple-keys with the same bit representation.', /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/radix_trie-0.1.3/src/keys.rs:71:9
stack backtrace:
[...]
6: radix_trie::keys::check_keys
             at ./<panic macros>:3
   7: <radix_trie::trie_node::TrieNode<K, V>>::take_value::{{closure}}
             at /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/radix_trie-0.1.3/src/trie_node.rs:160
   8: <core::option::Option<T>>::map
             at /checkout/src/libcore/option.rs:414
   9: <radix_trie::trie_node::TrieNode<K, V>>::take_value
             at /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/radix_trie-0.1.3/src/trie_node.rs:159
  10: radix_trie::traversal::recursive_remove
             at /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/radix_trie-0.1.3/src/traversal.rs:156
  11: radix_trie::traversal::<impl radix_trie::trie_node::TrieNode<K, V>>::remove
             at /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/radix_trie-0.1.3/src/traversal.rs:27
  12: radix_trie::trie::<impl radix_trie::Trie<K, V>>::remove
             at /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/radix_trie-0.1.3/src/trie.rs:54
  13: trie_bug::main
             at src/main.rs:8

From my understanding of the check_keys(...) function it should not be called in this context?

michaelsproul commented 6 years ago

Thanks for reporting, I can reproduce both crashes locally and will investigate the root cause

michaelsproul commented 6 years ago

I've made a WIP patch that I think solves the two issues you found, if you could try it out and let me know what you think, that'd be much appreciated: https://github.com/michaelsproul/rust_radix_trie/compare/remove-non-existent

I'm tired now, so I've probably missed something important. I'm tempted to clean up the duplication of logic in recursive_remove and rec_remove, but don't have the energy to work it out now

thomaskrause commented 6 years ago

Thank you very much for your help! The changes in the branch make sense as far as I can see. My access pattern to the trie is a bit uncommon because I always attempt to remove a value before I (re-)insert it again (that also means I can change the code to check for existence first to work-around this issue).

I ran the the new branch on my code on a larger data set with strings I need to index and most string sets work fine, but I found another example of strings that will trigger the multiple-keys with the same bit representation panic even in the updated branch:

extern crate radix_trie;
use radix_trie::Trie;

fn main() {
    let mut trie : Trie<String,bool> = Trie::new();
    trie.insert("AA".to_owned(), true);
    trie.insert("AB".to_owned(), true);

    println!("{:?}", trie.remove("BA")); // this line crashes
}

I think the rec_remove function might need an additional check for the key, not just for the length.

michaelsproul commented 6 years ago

Ah yeah, you're right. I thought that was suspicious that recursive_remove didn't call match_keys. I mustn't have been thinking straight when I wrote that. I've hacked the code on the remove-non-existent branch a bit more, to what I think should be right, but I still need to think about it in some more depth...