Closed jonas-schievink closed 3 years ago
When `realloc` is called, an allocation is moved from `old_ptr` to `new_ptr`, but dhat-rs thinks that there is already an allocation at `new_ptr`. This is impossible (assuming the allocator isn't buggy), so there must be a problem with dhat-rs's recording of allocations.
Here's how I would debug this:

1. Add `eprintln!("alloc {:?}", ptr);` in `DhatAlloc::alloc`.
2. Add `eprintln!("realloc {:?}, {:?}", old_ptr, new_ptr);` in `DhatAlloc::realloc`.
3. Add `eprintln!("dealloc {:?}", ptr);` in `DhatAlloc::dealloc`.
4. When the assertion fails, is `new_ptr` an address that has been seen before?

Note: all three of the `eprintln!` statements must be within the `ignore_allocs` closure.
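For anyone who wants to try this kind of tracing outside dhat-rs, here is a minimal standalone sketch. It is not dhat-rs's code: `LoggingAlloc` and `log_once` are hypothetical names, and `log_once` is only a stand-in for the role `ignore_allocs` plays (keeping the logging itself from re-entering the hooks).

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::cell::Cell;

thread_local! {
    // Hypothetical re-entrancy flag, standing in for dhat-rs's `ignore_allocs`.
    static IN_LOG: Cell<bool> = const { Cell::new(false) };
}

/// Runs `f` unless this thread is already inside a logging call.
fn log_once(f: impl FnOnce()) {
    let _ = IN_LOG.try_with(|flag| {
        if !flag.replace(true) {
            f();
            flag.set(false);
        }
    });
}

/// Wrapper around the system allocator that traces every operation to stderr.
pub struct LoggingAlloc;

unsafe impl GlobalAlloc for LoggingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        log_once(|| eprintln!("alloc {:?} ({:?})", ptr, layout));
        ptr
    }

    unsafe fn realloc(&self, old_ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        let new_ptr = unsafe { System.realloc(old_ptr, layout, new_size) };
        log_once(|| {
            eprintln!(
                "realloc {:?} -> {:?} ({:?}, new size = {})",
                old_ptr, new_ptr, layout, new_size
            )
        });
        new_ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // Log before freeing, so the line appears before the address can be reused.
        log_once(|| eprintln!("dealloc {:?} ({:?})", ptr, layout));
        unsafe { System.dealloc(ptr, layout) };
    }
}
```

To trace a whole program you would register it with `#[global_allocator] static ALLOC: LoggingAlloc = LoggingAlloc;`. Note that this only logs; it does no bookkeeping, so it cannot hit the assertion by itself.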
```
ThreadId(1) alloc 0x55e8c4282b10 (Layout { size_: 4, align_: 1 })
ThreadId(1) realloc 0x55e8c4282b10 -> 0x55e8c4282b10 (Layout { size_: 4, align_: 1 })
ThreadId(1) alloc 0x55e8c4282ba0 (Layout { size_: 48, align_: 8 })
ThreadId(1) alloc 0x55e8c4282be0 (Layout { size_: 15, align_: 1 })
ThreadId(1) dealloc 0x55e8c4282be0 (Layout { size_: 15, align_: 1 })
thread '<unnamed>' panicked at 'assertion failed: c.borrow().is_none()', library/std/src/sys_common/thread_info.rs:40:26
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
ThreadId(1) alloc 0x55e8c4282be0 (Layout { size_: 16, align_: 8 })
ThreadId(1) alloc 0x55e8c4282c00 (Layout { size_: 80, align_: 8 })
fatal runtime error: failed to initiate panic, error 5
```
This makes no sense, wow
```
(a lot of unrelated operations)
realloc 0x7f60f00930c0 -> 0x7f60f00b28c0 (Layout { size_: 1536, align_: 8 }, new size = 3072)
(a lot of unrelated operations)
realloc 0x7f60f000ff40 -> 0x7f60f00b28c0 (Layout { size_: 76, align_: 1 }, new size = 152)
thread '<unnamed>' panicked at '0x7f60f00b28c0', /home/jonas/dev/dhat-rs/src/lib.rs:315:17
stack backtrace:
dealloc 0x7f60f00b28c0 (Layout { size_: 3072, align_: 8 })
   0: rust_begin_unwind
             at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/panicking.rs:483
   1: std::panicking::begin_panic_fmt
             at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/panicking.rs:437
   2: dhat::ignore_allocs
   3: <dhat::DhatAlloc as core::alloc::global::GlobalAlloc>::realloc
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
```
It seems that the assertion is failing correctly, but I have no idea why this would happen.
I can reproduce this. There is some randomness to it. Sometimes it asserts within `Dhat::realloc`, sometimes it asserts on the equivalent assertion within `Dhat::alloc`. I did three runs with `eprintln!`s in it. The first and third runs hit the assertion failure after 169,588 and 162,800 lines of output, respectively; the second run didn't have any problems and I interrupted it after 2,202,676 lines of output.
Partial output from the first run, which asserted in `Dhat::realloc`:
```
realloc 0x7f9dc146ee80 Layout { size_: 43, align_: 1 } 86 -> 0x7f9dc143cbf0
...
realloc 0x7f9dc146ee80 Layout { size_: 46, align_: 1 } 92 -> 0x7f9dc143cbf0
thread 'dealloc 0x7f9dc143cbf0 Layout { size_: 86, align_: 1 }
<unnamed>' panicked at 'assertion failed: matches!(old, None)', /Users/njn/moz/dhat-rs/src/lib.rs:317:17
```
Partial output from the third run, which asserted in `Dhat::alloc`:
```
alloc Layout { size_: 81, align_: 1 } -> 0x7fa957d44da0
...
alloc Layout { size_: 100, align_: 1 } -> 0x7fa957d44da0
dealloc 0x7fa957d44da0 Layout { size_thread ': 81, align_: 1 }
main' panicked at 'assertion failed: matches!(old, None)', /Users/njn/moz/dhat-rs/src/lib.rs:270:17
```
In both cases (and in your output above) the panic output is interleaved with output from a `dealloc`. And in all cases, if that `dealloc` took place one operation earlier, then the `alloc`/`realloc` would be ok. This makes me suspicious. Something smells racy, but I'm not sure if it's in dhat-rs, or whether dhat-rs is exposing some pre-existing race...
Hmm, the above may be a red herring, because I put my `eprintln!` statements outside of getting the global lock.
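One way to rule out that kind of confusion is to emit the trace while holding the same lock that guards the bookkeeping, so the order of trace lines always matches the order in which records were made. A minimal sketch of the idea follows; `Tracker` and `State` are hypothetical names, not dhat-rs's real types, and the trace goes into a `Vec` rather than stderr so it can be inspected afterwards.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical bookkeeping state; not dhat-rs's real data structure.
#[derive(Default)]
struct State {
    live: HashMap<usize, usize>, // address -> size
    log: Vec<String>,            // trace lines, in bookkeeping order
}

/// Records allocations and traces them under a single lock.
#[derive(Default)]
pub struct Tracker {
    state: Mutex<State>,
}

impl Tracker {
    pub fn record_alloc(&self, addr: usize, size: usize) {
        let mut s = self.state.lock().unwrap();
        s.live.insert(addr, size);
        // Traced while the lock is held, so no other record can slip in between.
        s.log.push(format!("alloc {:#x} {}", addr, size));
    }

    pub fn record_dealloc(&self, addr: usize) {
        let mut s = self.state.lock().unwrap();
        s.live.remove(&addr);
        s.log.push(format!("dealloc {:#x}", addr));
    }

    /// Number of addresses currently recorded as live.
    pub fn live_count(&self) -> usize {
        self.state.lock().unwrap().live.len()
    }

    /// Copy of the trace collected so far.
    pub fn log(&self) -> Vec<String> {
        self.state.lock().unwrap().log.clone()
    }
}
```

With the trace written outside the lock instead, two threads can record in one order and print in the other, which is exactly what makes interleaved output hard to trust.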
`realloc` is complex and often a cause of bugs, so I tried commenting out `Dhat::realloc` (which is legit because it is a provided method), but I still get assertion failures within `Dhat::alloc`, so it's not a `realloc`-specific bug.
Aha, the data race idea is not a red herring. There is insufficient atomicity within `Dhat`'s methods. So it's possible to have a sequence like this:

1. `System.dealloc` frees `p`
2. `System.alloc` allocates (recycles) `p`
3. dhat-rs records the `alloc`
4. dhat-rs records the `dealloc`

But dhat-rs gets confused and asserts on the third step.

I have a draft patch that fixes things and seems to avoid the problem. I will clean it up and merge it later today.
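To make the sequence above concrete, here is a small single-threaded model that replays the bookkeeping in the racy order and in the intended order. It is only an illustration under assumed names (`LiveMap`, `replay_racy`, `replay_ordered` are hypothetical, and the address is made up); it is not the draft patch and not dhat-rs's actual code. The sizes 81 and 100 are borrowed from the `Dhat::alloc` trace earlier in the thread.

```rust
use std::collections::HashMap;

/// Hypothetical model of a profiler's table of live allocations.
#[derive(Default)]
pub struct LiveMap {
    live: HashMap<usize, usize>, // address -> size
}

impl LiveMap {
    /// Recording an allocation at an address that is still recorded as live
    /// is an error, similar in spirit to the `matches!(old, None)` assertion.
    pub fn record_alloc(&mut self, addr: usize, size: usize) -> Result<(), String> {
        match self.live.insert(addr, size) {
            None => Ok(()),
            Some(old) => Err(format!("{:#x} already live (size {})", addr, old)),
        }
    }

    pub fn record_dealloc(&mut self, addr: usize) -> Result<(), String> {
        match self.live.remove(&addr) {
            Some(_) => Ok(()),
            None => Err(format!("{:#x} was not live", addr)),
        }
    }
}

/// Replays the bookkeeping in the racy order from the list above.
pub fn replay_racy() -> Result<(), String> {
    let mut map = LiveMap::default();
    let p = 0x1000;
    map.record_alloc(p, 81)?; // earlier allocation, recorded normally
    // 1. the system allocator frees p           (not recorded yet)
    // 2. the system allocator hands p out again (not recorded yet)
    map.record_alloc(p, 100)?; // 3. the new alloc is recorded first: error
    map.record_dealloc(p)?; // 4. the old dealloc would be recorded too late
    Ok(())
}

/// Replays the same operations with each record made in allocator order.
pub fn replay_ordered() -> Result<(), String> {
    let mut map = LiveMap::default();
    let p = 0x1000;
    map.record_alloc(p, 81)?;
    map.record_dealloc(p)?;
    map.record_alloc(p, 100)?;
    Ok(())
}
```

`replay_racy()` returns an error at step 3, while `replay_ordered()` succeeds. The system allocator did nothing wrong in either case; only the order in which the records reach the table differs.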
After applying this patch to https://github.com/rust-analyzer/rust-analyzer/commit/3df4b8c1fa4c1686228162bff03e4db3f01b9826:
Running this command:
...results in different panics and assertion failures:
Assertion failure and double free
```
thread '
```
Double (triple?) panic
```
thread '
```
Any idea what I could have unearthed here?