matklad / once_cell

Rust library for single assignment cells and lazy statics without macros
Apache License 2.0
1.84k stars 110 forks source link

Double free bug involving dereferencing sync::Lazy #105

Closed withoutboats closed 4 years ago

withoutboats commented 4 years ago

So, I may just be missing something very obvious, but I have a really confusing double free bug in my code that is related to once-cell. I'm not saying the bug is in once-cell: it's probably in the unsafe code that I've written - but I wanted to highlight it to you, hoping you have a better idea what caused it and can identify the error.

I was using a Lazy to guard both a static and a static mut; the static mut is initialized also in the initialization of the Lazy. The Lazy is accessed exclusively in the constructor for a type called DemoDriver. In the use after free bug, DemoDriver is defined like this:

struct DemoDriver<'a> {
    sq: &'a Mutex<iou::SubmissionQueue<'static>>,
    // ...
}

fn driver() -> DemoDriver<'static> {
    DemoDriver {
        sq: &SQ,
    }
}

I fixed the double free by changing the definition of DemoDriver:

struct DemoDriver<'a> {
    sq: &'a Lazy<Mutex<iou::SubmissionQueue<'static>>>,
}

Here is the code with the double free bug.

Here is the diff to fix it.

(If you want to run the code and demonstrate the double free, it can be demonstrated by running cargo run --example copy-file.)

Do you understand why the first version causes a double free and the second does not? Sorry that I have not reduced the bug or investigated very thoroughly, I'm quite perplexed.

matklad commented 4 years ago

Looking at it right now. Nothing obviously wrong jumps at me. Though, I need to look at the code more to really understand it :-)

How exactly do I understand that a segfault is happening? If I run the example locally, I get some kind of clean error:

14:50:13|~/projects/ringbahn|not-a-bug⚡?
λ cargo run --release --example copy-file
    Finished release [optimized] target(s) in 0.01s
     Running `target/release/examples/copy-file`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 22, kind: InvalidInput, message: "Invalid argument" }', examples/copy-file.rs:13:19
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

14:50:16|~/projects/ringbahn|not-a-bug⚡?
λ git switch bug 
Switched to branch 'bug'

14:50:19|~/projects/ringbahn|bug⚡?
λ git show HEAD -s --oneline
b1ab61a (HEAD -> bug) Simplify demo driver

14:50:21|~/projects/ringbahn|bug⚡?
λ cargo run --release --example copy-file
   Compiling ringbahn v0.1.0 (/home/matklad/projects/ringbahn)
    Finished release [optimized] target(s) in 0.55s
     Running `target/release/examples/copy-file`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 22, kind: InvalidInput, message: "Invalid argument" }', examples/copy-file.rs:13:19
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

14:50:27|~/projects/ringbahn|bug⚡?
λ uname -r
5.4.41

One thing I find slightly suspicious is that initialization of SQ spawns a new thread, and that new thread can try to access a lazy value before it is fully initialized. This should work fine though (the other thread should just block).

withoutboats commented 4 years ago

Probably you don't have Linux 5.6, which is required to run the example :sweat_smile:. That's why you get the clean error instead of completing the IO and then getting a double free.

Yea the whole thing is quite confusing. And I haven't really tried to produce a smaller example, for example one that doesn't use io-uring. Sorry about that. Here are some things I learned though that you might have insight on:

  1. The double free happens after the destructor of 1 Ring, but before the destructor of the other. From this I infer that somehow this example causes something to be freed both by Ring and some other part of my code.
  2. The double free does not happen if I println in the init function. I'm guessing taking the lock on stdout prevents some sort of race condition that is contributing to the double free.

One thing I find slightly suspicious is that initialization of SQ spawns a new thread, and that new thread can try to access a lazy value before it is fully initialized. This should work fine though (the other thread should just block).

The spawned thread should never access SQ, and I don't believe it does. It does access LOCK, but commenting out everything to do with LOCK doesn't prevent the bug from happening, so I'm pretty sure that's not relevant.

matklad commented 4 years ago

ok, I've upped my kernel to 5.6 (might be a good idea to menttion this version in README), and I reproed the bug. I've looked at the code a bit, and I think this might be the problem/fix:

diff --git a/src/ring/engine.rs b/src/ring/engine.rs
index c361672..1899c0b 100644
--- a/src/ring/engine.rs
+++ b/src/ring/engine.rs
@@ -126,17 +126,21 @@ impl Engine {
         unsafe {
             match self.state {
                 ReadPrepared | ReadSubmitted    => {
-                    let data = self.read_buf.buf.as_mut_ptr();
-                    let len = self.read_buf.buf.len();
+                    let mut buf = std::mem::replace(&mut self.read_buf, Buffer::new());
+                    let data = buf.buf.as_mut_ptr();
+                    let len = buf.buf.len();
+                    std::mem::forget(buf);
                     self.completion.as_ref().cancel(Cancellation::buffer(data, len));
-                    self.read_buf = Buffer::new();
+
                     self.state = Inert;
                 }
                 WritePrepared | WriteSubmitted  => {
-                    let data = self.write_buf.buf.as_mut_ptr();
-                    let len = self.write_buf.buf.len();
+                    let mut buf = std::mem::replace(&mut self.write_buf, Buffer::new());
+                    let data = buf.buf.as_mut_ptr();
+                    let len = buf.buf.len();
+                    std::mem::forget(buf);
                     self.completion.as_ref().cancel(Cancellation::buffer(data, len));
-                    self.write_buf = Buffer::new();
+
                     self.state = Inert;
                 }
                 WriteBuffered                   => {
@@ -214,7 +218,7 @@ impl Drop for Engine {

 unsafe fn prepare_read(
     sqe: iou::SubmissionQueueEvent<'_>,
-    ctx: &mut Context<'_>, 
+    ctx: &mut Context<'_>,
     fd: RawFd,
     buf: &mut [u8],
 ) -> NonNull<Completion> {
@@ -230,7 +234,7 @@ unsafe fn prepare_read(

 unsafe fn prepare_write(
     sqe: iou::SubmissionQueueEvent<'_>,
-    ctx: &mut Context<'_>, 
+    ctx: &mut Context<'_>,
     fd: RawFd,
     buf: &[u8],
 ) -> NonNull<Completion> {

Specifically, this looks like a doube free:

let data = self.write_buf.buf.as_mut_ptr();
let len = self.write_buf.buf.len();
self.completion.as_ref().cancel(Cancellation::buffer(data, len));
self.write_buf = Buffer::new();

Cancellation::buffer will free the memory, but we also just overwrite it in this function, so it feels like a douple drop? Note that I don't actually understand the code, it's an observation based solely on te surface shape of things.

matklad commented 4 years ago

Not sure why Lazy change "fixes" the bug, but I have the theory that it just masks it. Here's an experiemnt to verify this:

diff --git a/src/event/cancellation.rs b/src/event/cancellation.rs
index 33b9ac8..363c8e9 100644
--- a/src/event/cancellation.rs
+++ b/src/event/cancellation.rs
@@ -35,6 +35,7 @@ impl Cancellation {

     pub(crate) unsafe fn buffer(data: *mut u8, len: usize) -> Cancellation {
         unsafe fn callback(data: *mut (), len: usize) {
+            eprintln!("A {:?}", data as *mut u8);
             drop(Vec::from_raw_parts(data as *mut u8, len, len))
         }

diff --git a/src/ring/engine.rs b/src/ring/engine.rs
index c361672..d73858a 100644
--- a/src/ring/engine.rs
+++ b/src/ring/engine.rs
@@ -263,6 +263,13 @@ struct Buffer {
     cap: usize,
 }

+impl Drop for Buffer {
+    fn drop(&mut self) {
+        eprintln!("B {:?}", self.buf.as_mut_ptr());
+    }
+
+}
+
 impl Buffer {
     fn new() -> Buffer {
         Buffer {
16:48:25|~/projects/ringbahn|not-a-bug⚡*?
λ cargo run --release --example copy-file
   Compiling ringbahn v0.1.0 (/home/matklad/projects/ringbahn)
    Finished release [optimized] target(s) in 0.59s
     Running `target/release/examples/copy-file`
B 0x556b6163bb60
B 0x556b6163db70
A 0x556b61641b90 *
B 0x556b61641b90 *
B 0x556b6163fb80
B 0x556b6163bb60

The two pointers marked * are equal. Seems bad :-)

withoutboats commented 4 years ago

Thanks! Your analysis looks right, I'll investigate the bug in my cancellation code (I was expecting there to be a bug there somewhere).

matklad commented 4 years ago

Thanks for nerd-sniping me to update the kernel and finally look closer into ringbahn and io_uring! I enjoyed reading the source!

withoutboats commented 4 years ago

You were right that was the double free, and fixing the bug has eliminated it (withoutboats/ringbahn#16). What remains a mystery to me is why inserting the Lazy masked the bug.

matklad commented 4 years ago

The error comes from some sanity check code in glibc. By adding Lazy you are postponing forcing of the lazy value from construction to the first use, and that might change the order in which allocations happen (because SQ ctor allocates). And that can plausibly confuse glibc sanity check.

Adding printfs to global alloc shows the differens

This asserts:

0x55e0ab73dc80 +
0x55e0ab73dc80 -
0x55e0ab73dc80 +
0x55e0ab73dc80 -
0x55e0ab73dca0 +
0x55e0ab73dcd0 +
0x55e0ab73dd10 +
0x55e0ab73dd70 +
0x55e0ab73dc80 +
0x55e0ab73dda0 +
0x55e0ab741f40 +
0x55e0ab741f40 -
0x55e0ab745fb0 +
0x55e0ab7463f0 +
0x7fc770000b60 +
0x7fc770000bf0 +
0x55e0ab7463f0 -
0x55e0ab7463f0 +
0x55e0ab7463f0 -
0x55e0ab7463f0 +
0x55e0ab745fe0 -
0x55e0ab73df20 -
0x55e0ab73ff30 -
0x55e0ab7463f0 -
0x55e0ab743f70 -   <- first free
0x55e0ab743fc0 +
0x55e0ab7463f0 +
0x55e0ab743f70 -   <- second free

This doesn't:

0x55c7adfd3c80 +
0x55c7adfd3c80 -
0x55c7adfd3c80 +
0x55c7adfd3c80 -
0x55c7adfdbd10 +
0x55c7adfd3c80 +
0x55c7adfd3c80 -
0x55c7adfdc150 +
0x55c7adfdc180 +
0x55c7adfdc1c0 +
0x55c7adfdc220 +
0x55c7adfd3c80 +
0x55c7adfdc250 +
0x55c7adfdc3d0 +
0x7fe4ac000b60 +
0x7fe4ac000bf0 +
0x55c7adfdc3d0 -
0x55c7adfdc3d0 +
0x55c7adfdc3d0 -
0x55c7adfdc3d0 +
0x55c7adfdbd40 -
0x55c7adfd3ca0 -
0x55c7adfd5cb0 -
0x55c7adfdc3d0 -
0x55c7adfd9cd0 -   <- (1) I think this is corresponds to first free...
0x55c7adfd9d00 +
0x55c7adfd9cd0 +   <- (3) ... but we've realloced the chunk in meantime 
0x55c7adfd9cd0 -   <- (2) ... and here we are freeing it ...
0x55c7adfd7cc0 -
0x55c7adfd3ca0 -
matklad commented 4 years ago

This is the code used to print the traces:

use std::{
    alloc::{GlobalAlloc, System},
    sync::atomic::{AtomicBool, AtomicUsize, Ordering},
};

#[global_allocator]
static GLOBAL: TraceDealloc = TraceDealloc;

struct TraceDealloc;

static TRACE: AtomicBool = AtomicBool::new(false);

unsafe impl GlobalAlloc for TraceDealloc {
    unsafe fn alloc(&self, layout: std::alloc::Layout) -> *mut u8 {
        let ptr = System.alloc(layout);
        if TRACE.load(Ordering::Relaxed) {
            println!("{:?} {:?} +", std::thread::current().id(), ptr);
        }
        ptr
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: std::alloc::Layout) {
        if TRACE.load(Ordering::Relaxed) {
            println!("{:?} {:?} -", std::thread::current().id(), ptr);
        }
        System.dealloc(ptr, layout)
    }
    unsafe fn alloc_zeroed(&self, layout: std::alloc::Layout) -> *mut u8 {
        System.alloc_zeroed(layout)
    }
    unsafe fn realloc(&self, ptr: *mut u8, layout: std::alloc::Layout, new_size: usize) -> *mut u8 {
        System.realloc(ptr, layout, new_size)
    }
}

fn main() {
    println!("");
    TRACE.store(true, Ordering::Relaxed);
    let mut input: Ring<File> = Ring::new(File::open("props.txt").unwrap());
    let mut output: Ring<File> = Ring::new(File::create("test.txt").unwrap());
    futures::executor::block_on(async move {
        let mut buf = vec![0; 1024];
        let len = input.read(&mut buf).await.unwrap();
        output.write(&mut buf[0..len]).await.unwrap();
        output.flush().await.unwrap();
    });
    TRACE.store(false, Ordering::Relaxed);
}