Conflux-Chain / conflux-rust

The official Rust implementation of Conflux protocol. https://doc.confluxnetwork.org
GNU General Public License v3.0

cfx-storage tests crash when using rust 1.52 #2168

Closed Thegaram closed 2 years ago

Thegaram commented 3 years ago
$ cargo test --release -p cfx-storage --lib
    Finished release [optimized] target(s) in 2.87s
     Running unittests (target/release/deps/cfx_storage-0f492240c4fe8242)

running 56 tests
test impls::delta_mpt::cache::algorithm::tests::removable_heap::test_corner_cases ... ok
test impls::delta_mpt::cache::algorithm::tests::recent_lfu::r_lfu_algorithm_smoke_test ... ok
test impls::delta_mpt::mem_optimized_trie_node::test_mem_optimized_trie_node_size ... ok
test impls::delta_mpt::cache::algorithm::tests::lru::test_lru_algorithm::test_lru_algorithm ... ok
error: test failed, to rerun pass '-p cfx-storage --lib'

Caused by:
  process didn't exit successfully: `/Users/peter/work/repos/conflux/conflux-rust/target/release/deps/cfx_storage-0f492240c4fe8242` (signal: 5, SIGTRAP: trace/breakpoint trap)

These tests pass with Rust 1.51. They also pass in debug mode with 1.52.

According to the changelog, Rust 1.52 "upgraded the default LLVM to LLVM 12"; I am not sure whether that is causing the issue. Apart from this, the release does not seem to contain any relevant changes.

Thegaram commented 3 years ago

This seems to be the minimal example to reproduce the crash:

#[test]
/// When children_count is 0, the table_ptr must be null.
/// If the table_ptr is an "empty array" there could be memory leak.
fn test_no_alloc_in_empty_children_table() {
    let empty_children_table = ChildrenTableDeltaMpt::default();
    empty_children_table.assert_no_alloc_in_empty_children_table();
    let cloned_table = empty_children_table.clone();
    // cloned_table.assert_no_alloc_in_empty_children_table();

    let rlp_bytes = empty_children_table.to_ref().rlp_bytes();
    let rlp_parsed =
        ChildrenTableManagedDeltaMpt::decode(&Rlp::new(rlp_bytes.as_slice()))
            .unwrap();
    // let decoded_table = ChildrenTableDeltaMpt::from(rlp_parsed);
    // decoded_table.assert_no_alloc_in_empty_children_table();

    let index = 3;
    let one_element_table = unsafe {
        ChildrenTableDeltaMpt::insert_child_unchecked(
            empty_children_table.to_ref(),
            index,
            default_children_value(index),
        )
    };
    // let cleared_table = unsafe {
    //     ChildrenTableDeltaMpt::delete_child_unchecked(
    //         one_element_table.to_ref(),
    //         index,
    //     )
    // };
    // cleared_table.assert_no_alloc_in_empty_children_table();
}
Thegaram commented 3 years ago

Confirmed that this error still exists on rust 1.53.0 and on nightly. Moreover, it has been reproduced on Mac (M1) and Linux. To reproduce, run:

$ cargo test --release --package cfx-storage -- --nocapture test_no_alloc_in_empty_children_table

(The crash only happens in release builds.)

$ RUSTFLAGS=-g cargo test --release --package cfx-storage -- --nocapture test_no_alloc_in_empty_children_table
...
    Finished release [optimized] target(s) in 1m 47s
     Running unittests (target/release/deps/cfx_storage-a1365888dc0f80ae)

running 1 test
error: test failed, to rerun pass '-p cfx-storage --lib'

Caused by:
  process didn't exit successfully: `/Users/peter/work/repos/conflux/conflux-rust/target/release/deps/cfx_storage-a1365888dc0f80ae --nocapture test_no_alloc_in_empty_children_table` (signal: 5, SIGTRAP: trace/breakpoint trap)
$ lldb /Users/peter/work/repos/conflux/conflux-rust/target/release/deps/cfx_storage-a1365888dc0f80ae
(lldb) target create "/Users/peter/work/repos/conflux/conflux-rust/target/release/deps/cfx_storage-a1365888dc0f80ae"
Current executable set to '/Users/peter/work/repos/conflux/conflux-rust/target/release/deps/cfx_storage-a1365888dc0f80ae' (arm64).
(lldb) r
Process 33121 launched: '/Users/peter/work/repos/conflux/conflux-rust/target/release/deps/cfx_storage-a1365888dc0f80ae' (arm64)

running 56 tests
test impls::delta_mpt::mem_optimized_trie_node::test_mem_optimized_trie_node_size ... ok
test impls::delta_mpt::cache::algorithm::tests::removable_heap::test_corner_cases ... ok
test impls::delta_mpt::cache::algorithm::tests::recent_lfu::r_lfu_algorithm_smoke_test ... ok
test impls::delta_mpt::cache::algorithm::tests::lru::test_lru_algorithm::test_lru_algorithm ... ok
cfx_storage-a1365888dc0f80ae was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 33121 stopped
* thread #9, stop reason = EXC_BREAKPOINT (code=1, subcode=0x10006153c)
    frame #0: 0x000000010006153c cfx_storage-a1365888dc0f80ae`core::ops::function::FnOnce::call_once::h6d8ae2ac822707ea at snapshot.rs:303:1 [opt]
   300  
   301  #[cfg(test)]
   302  #[test]
-> 303  fn test_delete_all() {
   304      let mut rng = get_rng_for_test();
   305      let keys: Vec<Vec<u8>> = generate_keys(TEST_NUMBER_OF_KEYS)
   306          .iter()
Target 0: (cfx_storage-a1365888dc0f80ae) stopped.
(lldb) bt
* thread #9, stop reason = EXC_BREAKPOINT (code=1, subcode=0x10006153c)
  * frame #0: 0x000000010006153c cfx_storage-a1365888dc0f80ae`core::ops::function::FnOnce::call_once::h6d8ae2ac822707ea at snapshot.rs:303:1 [opt]
    frame #1: 0x000000010010f5d8 cfx_storage-a1365888dc0f80ae`test::__rust_begin_short_backtrace::hb37c1375c9385f04 [inlined] core::ops::function::FnOnce::call_once::h7b6c5682ec416c23 at function.rs:227:5 [opt]
    frame #2: 0x000000010010f5d4 cfx_storage-a1365888dc0f80ae`test::__rust_begin_short_backtrace::hb37c1375c9385f04 at lib.rs:577 [opt]
    frame #3: 0x000000010010e534 cfx_storage-a1365888dc0f80ae`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::h2a1c4dcf93d8a6f6 [inlined] _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h05850620c9990cb5 at boxed.rs:1546:9 [opt]
    frame #4: 0x000000010010e528 cfx_storage-a1365888dc0f80ae`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::h2a1c4dcf93d8a6f6 [inlined] _$LT$std..panic..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h771532d657af2898 at panic.rs:344 [opt]
    frame #5: 0x000000010010e528 cfx_storage-a1365888dc0f80ae`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::h2a1c4dcf93d8a6f6 [inlined] std::panicking::try::do_call::h36d92a6827ccf5f9 at panicking.rs:379 [opt]
    frame #6: 0x000000010010e528 cfx_storage-a1365888dc0f80ae`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::h2a1c4dcf93d8a6f6 [inlined] std::panicking::try::h4f421d269dd4b03c at panicking.rs:343 [opt]
    frame #7: 0x000000010010e528 cfx_storage-a1365888dc0f80ae`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::h2a1c4dcf93d8a6f6 [inlined] std::panic::catch_unwind::h8db96c3d51738648 at panic.rs:431 [opt]
    frame #8: 0x000000010010e528 cfx_storage-a1365888dc0f80ae`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::h2a1c4dcf93d8a6f6 [inlined] test::run_test_in_process::hef66e690fb6e78cd at lib.rs:600 [opt]
    frame #9: 0x000000010010e49c cfx_storage-a1365888dc0f80ae`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::h2a1c4dcf93d8a6f6 at lib.rs:492 [opt]
    frame #10: 0x00000001000ef38c cfx_storage-a1365888dc0f80ae`std::sys_common::backtrace::__rust_begin_short_backtrace::h27345029b4791146 [inlined] test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::hbe144bfedefd819e at lib.rs:519:37 [opt]
    frame #11: 0x00000001000ef24c cfx_storage-a1365888dc0f80ae`std::sys_common::backtrace::__rust_begin_short_backtrace::h27345029b4791146 at backtrace.rs:125 [opt]
    frame #12: 0x00000001000f3760 cfx_storage-a1365888dc0f80ae`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::haa12869dbbc6823a [inlined] std::thread::Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h0b4e2fc6491a03f8 at mod.rs:481:17 [opt]
    frame #13: 0x00000001000f375c cfx_storage-a1365888dc0f80ae`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::haa12869dbbc6823a [inlined] _$LT$std..panic..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h7eed997f2edf2e0a at panic.rs:344 [opt]
    frame #14: 0x00000001000f375c cfx_storage-a1365888dc0f80ae`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::haa12869dbbc6823a [inlined] std::panicking::try::do_call::h48df83c6e66e5f19 at panicking.rs:379 [opt]
    frame #15: 0x00000001000f375c cfx_storage-a1365888dc0f80ae`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::haa12869dbbc6823a [inlined] std::panicking::try::h77529c06b6bc887d at panicking.rs:343 [opt]
    frame #16: 0x00000001000f375c cfx_storage-a1365888dc0f80ae`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::haa12869dbbc6823a [inlined] std::panic::catch_unwind::h564f527558621249 at panic.rs:431 [opt]
    frame #17: 0x00000001000f375c cfx_storage-a1365888dc0f80ae`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::haa12869dbbc6823a [inlined] std::thread::Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h5bff2f0a83ade6b5 at mod.rs:480 [opt]
    frame #18: 0x00000001000f370c cfx_storage-a1365888dc0f80ae`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::haa12869dbbc6823a at function.rs:227 [opt]
    frame #19: 0x0000000100593f30 cfx_storage-a1365888dc0f80ae`std::sys::unix::thread::Thread::new::thread_start::hf57280992ee9aaab [inlined] _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h2ed9c725e692d931 at boxed.rs:1546:9 [opt]
    frame #20: 0x0000000100593f24 cfx_storage-a1365888dc0f80ae`std::sys::unix::thread::Thread::new::thread_start::hf57280992ee9aaab [inlined] _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h136f8928431fcf35 at boxed.rs:1546 [opt]
    frame #21: 0x0000000100593f20 cfx_storage-a1365888dc0f80ae`std::sys::unix::thread::Thread::new::thread_start::hf57280992ee9aaab at thread.rs:71 [opt]
    frame #22: 0x000000019c1c7878 libsystem_pthread.dylib`_pthread_start + 320
(lldb) bt^C
(lldb) ^D
➜  conflux-rust git:(rust-1.52.0) ✗ RUSTFLAGS=-g cargo test --release --package cfx-storage -- --nocapture test_no_alloc_in_empty_children_table

    Finished release [optimized] target(s) in 0.35s
     Running unittests (target/release/deps/cfx_storage-a1365888dc0f80ae)

running 1 test
error: test failed, to rerun pass '-p cfx-storage --lib'

Caused by:
  process didn't exit successfully: `/Users/peter/work/repos/conflux/conflux-rust/target/release/deps/cfx_storage-a1365888dc0f80ae --nocapture test_no_alloc_in_empty_children_table` (signal: 5, SIGTRAP: trace/breakpoint trap)
ChenxingLi commented 3 years ago

This issue originates from incorrect usage of the unsafe function std::slice::from_raw_parts. Its documentation says that the first parameter, data: *const T, must be non-null even for zero-length slices. However, the code passes a null pointer to this function, which triggers undefined behavior.
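
For illustration, here is a minimal sketch of one way to avoid handing a null pointer to std::slice::from_raw_parts. The helper name slice_or_empty is hypothetical and does not come from conflux-rust; it only assumes a table that stores a raw pointer which is null when there are no children, and it returns an empty slice in that case instead of calling from_raw_parts.

```rust
use std::{ptr, slice};

/// Hypothetical helper (not part of conflux-rust): views `len` elements
/// behind `ptr` as a slice, treating a null pointer as an empty table.
///
/// # Safety
/// If `ptr` is non-null, it must be aligned and point to `len` initialized
/// values of `T` that stay valid and unmodified for the lifetime `'a`.
unsafe fn slice_or_empty<'a, T>(ptr: *const T, len: usize) -> &'a [T] {
    if ptr.is_null() {
        // A null pointer must never reach `from_raw_parts`, even when
        // `len == 0`, so return a genuinely empty slice instead.
        debug_assert_eq!(len, 0, "null pointer with non-zero length");
        &[]
    } else {
        // SAFETY: the caller guarantees validity for non-null pointers.
        unsafe { slice::from_raw_parts(ptr, len) }
    }
}

fn main() {
    // Empty table: null pointer and zero length.
    let empty: &[u32] = unsafe { slice_or_empty(ptr::null(), 0) };
    assert!(empty.is_empty());

    // Non-empty table: an ordinary pointer and length pair.
    let data = [1u32, 2, 3];
    let view = unsafe { slice_or_empty(data.as_ptr(), data.len()) };
    assert_eq!(view, &[1, 2, 3]);
}
```

An alternative with the same effect is to keep calling from_raw_parts but substitute std::ptr::NonNull::dangling().as_ptr() for the null pointer when the length is zero, which the standard library documentation suggests for zero-length slices.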