Shnatsel / libdiffuzz

Custom memory allocator that helps discover reads from uninitialized memory
Apache License 2.0
163 stars 8 forks source link

Segfault with `hashbrown` #10

Open michaelsproul opened 2 years ago

michaelsproul commented 2 years ago

Hey @Shnatsel,

I was trying to work out whether msan was giving me false positives when I happened upon libdiffuzz. It segfaulted immediately, but in a completely different part of the code.

I've isolated a small reproduceable test case here that uses toml and hashbrown to trigger the segfault: https://github.com/michaelsproul/hashbrown-crash

Have you seen segfaults like this before when using libdiffuzz? Is this a type of false positive, or is hashbrown really doing something sketchy with uninitialized memory? The fault seems to happen in an unsafe drop_in_place call, so I'm wondering whether hashbrown does contain some optimisation that assumes uninitialized memory to be 0, or something.

The full backtrace is here for reference:

#0  hashbrown::raw::RawIterRange<(alloc::vec::Vec<alloc::borrow::Cow<str>, alloc::alloc::Global>, alloc::vec::Vec<usize, alloc::alloc::Global>)>::new<(alloc::vec::Vec<alloc::borrow::Cow<str>, alloc::alloc::Global>, alloc::vec::Vec<usize, alloc::alloc::Global>)> (ctrl=0x7ffff7e2c0c8, data=..., len=<optimised out>) at /cargo/registry/src/github.com-1ecc6299db9ec823/hashbrown-0.12.0/src/raw/mod.rs:1862
#1  hashbrown::raw::RawTable<(alloc::vec::Vec<alloc::borrow::Cow<str>, alloc::alloc::Global>, alloc::vec::Vec<usize, alloc::alloc::Global>), alloc::alloc::Global>::iter<(alloc::vec::Vec<alloc::borrow::Cow<str>, alloc::alloc::Global>, alloc::vec::Vec<usize, alloc::alloc::Global>), alloc::alloc::Global> (self=0x7fffffffdbd8) at /cargo/registry/src/github.com-1ecc6299db9ec823/hashbrown-0.12.0/src/raw/mod.rs:945
#2  hashbrown::raw::RawTable<(alloc::vec::Vec<alloc::borrow::Cow<str>, alloc::alloc::Global>, alloc::vec::Vec<usize, alloc::alloc::Global>), alloc::alloc::Global>::drop_elements<(alloc::vec::Vec<alloc::borrow::Cow<str>, alloc::alloc::Global>, alloc::vec::Vec<usize, alloc::alloc::Global>), alloc::alloc::Global> (self=0x7fffffffdbd8) at /cargo/registry/src/github.com-1ecc6299db9ec823/hashbrown-0.12.0/src/raw/mod.rs:603
#3  0x0000555555562ca5 in hashbrown::raw::{impl#17}::drop<(alloc::vec::Vec<alloc::borrow::Cow<str>, alloc::alloc::Global>, alloc::vec::Vec<usize, alloc::alloc::Global>), alloc::alloc::Global> (self=0x7fffffffdbd8)
    at /cargo/registry/src/github.com-1ecc6299db9ec823/hashbrown-0.12.0/src/raw/mod.rs:1801
#4  core::ptr::drop_in_place<hashbrown::raw::RawTable<(alloc::vec::Vec<alloc::borrow::Cow<str>, alloc::alloc::Global>, alloc::vec::Vec<usize, alloc::alloc::Global>), alloc::alloc::Global>> ()
    at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ptr/mod.rs:448
#5  core::ptr::drop_in_place<hashbrown::map::HashMap<alloc::vec::Vec<alloc::borrow::Cow<str>, alloc::alloc::Global>, alloc::vec::Vec<usize, alloc::alloc::Global>, std::collections::hash::map::RandomState, alloc::alloc::Global>>
    () at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ptr/mod.rs:448
#6  core::ptr::drop_in_place<std::collections::hash::map::HashMap<alloc::vec::Vec<alloc::borrow::Cow<str>, alloc::alloc::Global>, alloc::vec::Vec<usize, alloc::alloc::Global>, std::collections::hash::map::RandomState>> ()
    at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ptr/mod.rs:448
#7  toml::de::{impl#0}::deserialize_any<hashbrown_crash::_::{impl#0}::deserialize::__Visitor> (self=0x7fffffffdc68, visitor=...) at /home/michael/.cargo/registry/src/github.com-1ecc6299db9ec823/toml-0.5.9/src/de.rs:244
#8  toml::de::{impl#0}::deserialize_struct<hashbrown_crash::_::{impl#0}::deserialize::__Visitor> (self=0x7fffffffdc68, name=..., fields=..., visitor=...)
    at /home/michael/.cargo/registry/src/github.com-1ecc6299db9ec823/toml-0.5.9/src/de.rs:315
#9  0x000055555555fa3e in hashbrown_crash::_::{impl#0}::deserialize<&mut toml::de::Deserializer> (__deserializer=0x7fffffffdc68) at src/main.rs:3
#10 toml::de::from_str<hashbrown_crash::Input> (s=...) at /home/michael/.cargo/registry/src/github.com-1ecc6299db9ec823/toml-0.5.9/src/de.rs:80
#11 0x000055555555e4cd in hashbrown_crash::main () at src/main.rs:11
michaelsproul commented 2 years ago

My rustc version is rustc 1.61.0 (fe5b13d68 2022-05-18)

Shnatsel commented 2 years ago

Thanks a lot for the reproducible testcase!

That kind of failure indicates either a bug in libdiffuzz, or a bug in hashbrown that would be caught by Address Sanitizer. I'll check it out later today.

Does that code trigger a MSAN failure as well? Given their respective usage figures, I'd prefer to fix hashbrown first and investigate the libdiffuzz bug later.

michaelsproul commented 2 years ago

I just tried with both address sanitizer and memory sanitizer and neither report any issues, so perhaps it's a libdiffuzz bug? That would also be more reassuring than a bug in std::collections::HashMap :sweat_smile:

This is how I ran ASan:

env RUSTFLAGS="-Zsanitizer=address -g" cargo +nightly build -Zbuild-std --target "x86_64-unknown-linux-gnu" --release

And MSan:

env RUSTFLAGS="-Zsanitizer=memory -Zsanitizer-memory-track-origins -g" cargo +nightly build -Zbuild-std --target "x86_64-unknown-linux-gnu" --release

Running the binary then reports no issues:

target/x86_64-unknown-linux-gnu/release/hashbrown-crash

The weird thing is, if I then LD_PRELOAD libdiffuzz.so and run the ASan or MSan binary it doesn't crash any more!

env LD_PRELOAD=/home/michael/Programming/libdiffuzz/target/release/libdiffuzz.so target/x86_64-unknown-linux-gnu/release/hashbrown-crash
# is fine
Shnatsel commented 2 years ago

Yeah, this segfault is most likely a libdiffuzz bug. I'll try to look into it in a few days, but the current iteration of the code isn't even written by me, so chasing the issue in unsafe code is going to be challenging.

Your MSAN invocation looks correct, I've added the same thing to cargo-fuzz and had no issues. So I suggest reporting the original Hashbrown issue to its maintainers. The reproducing example has been very helpful, I suggest using the same format for that report :+1: