rust-lang / miri

An interpreter for Rust's mid-level intermediate representation
Apache License 2.0
4.39k stars 340 forks source link

passing self referential struct by value creates strong protector #3456

Closed leddoo closed 5 months ago

leddoo commented 6 months ago

i have a self referential Builder. something like this:

struct Builder {
    data: Vec<u8>,
    // points into `data`, only used internally.
    some_slice: &'static [u8],
}

impl Builder {
    fn new(data: &[u8]) -> Builder {
        let data = Vec::from(data);

        let some_slice = unsafe {
            let the_slice = &data[0..4];
            core::mem::transmute::<&[u8], &[u8]>(the_slice)
        };

        return Builder { data, some_slice };
    }

    fn build(self) {
        drop(self.data);
    }
}

fn main() {
    Builder::new(b"1234hello").build();
}

(the real thing uses a bump allocator and some_slice is a "leaked" allocation in that allocator)

the problem is, dropping self.data in build triggers undefined behavior:

error: Undefined Behavior: not granting access to tag <3378> because that would remove [SharedRead
Only for <3599>] which is strongly protected because it is an argument of call 865
   --> /Users/leddoo/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/
core/src/ptr/mod.rs:514:1
    |
514 | pub unsafe fn drop_in_place<T: ?Sized>(to_drop: *mut T) {
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ not granting access to tag <3378> be
cause that would remove [SharedReadOnly for <3599>] which is strongly protected because it is an a
rgument of call 865
    |
help: <3378> was created here, as the base tag for alloc1476
   --> src/main.rs:9:20
    |
9   |         let data = Vec::from(data);
    |                    ^^^^^^^^^^^^^^^
help: <3599> is this argument
   --> src/main.rs:20:14
    |
20  |     fn build(this: Self) {

(in the real code, build returns a Result, and on error, the bump allocator is dropped, while "some_slice" protects the bump allocator's memory)

i've read stacked borrows section 4.1, so i know what's going on. best thing i've come up with so far would be an UnprotectedRefs wrapper type, that just tells miri to not create stack protectors for references contained in that wrapper (and tells the compiler to not rely on those references being valid for the duration of the call).

right now, the only workaround i've found is boxing that slice, which isn't ideal.

is there anything like UnprotectedRefs or a different workaround on stable right now?

leddoo commented 6 months ago

okay, i've found a wacky workaround: an UnprotectedRefs type can be created using a union:

use std::mem::ManuallyDrop;

union UnprotectedRefs<T> {
    inner: ManuallyDrop<T>,
}

impl<T> UnprotectedRefs<T> {
    pub fn new(inner: T) -> Self {
        Self { inner: ManuallyDrop::new(inner) }
    }
}

struct Builder {
    data: Vec<u8>,
    // points into `data`, only used internally.
    some_slice: UnprotectedRefs<&'static [u8]>,
}

impl Builder {
    fn new(data: &[u8]) -> Builder {
        let data = Vec::from(data);

        let some_slice = UnprotectedRefs::new(unsafe {
            let the_slice = &data[0..4];
            core::mem::transmute::<&[u8], &[u8]>(the_slice)
        });

        return Builder { data, some_slice };
    }

    fn build(self) {
        drop(self.data);
    }
}

fn main() {
    Builder::new(b"1234hello").build();
}

this will probably work for me. feel free to close this issue if you have nothing to add.

saethlin commented 6 months ago

I don't think I've seen this workaround before. Interesting. I probably would have addressed this by using a raw slice and providing an accessor function to get the reference slice back.

leddoo commented 6 months ago

I don't think I've seen this workaround before. Interesting. I probably would have addressed this by using a raw slice and providing an accessor function to get the reference slice back.

oh right, yeah, that's probably the simplest solution for the repro 👍

in the real code, there are a few more slices haha

#[derive(Clone, Debug, Default)]
pub struct Module<'a> {
    pub types:      &'a KSlice<TypeIdx, FuncType<'a>>,
    pub imports:    Imports<'a>,
    pub funcs:      &'a [TypeIdx],
    pub tables:     &'a [TableType],
    pub memories:   &'a [MemoryType],
    pub globals:    &'a [Global],
    pub exports:    &'a [Export<'a>],
    pub start:      Option<FuncIdx>,
    pub elements:   &'a [Element<'a>],
    pub code_section: &'a [u8],
    pub codes:      &'a [Code<'a>],
    pub datas:      &'a [Data<'a>],
    pub customs:    &'a [CustomSection<'a>],
}

i might still end up boxing it though, cause moving something of that size in and out of builder functions doesn't sound like a great idea. though it's not exactly on the hot path, so the UnprotectedRefs works for now.

for anyone else stumbling into this issue, here's my impl:

union UnprotectedRefs<T> {
    inner: core::mem::ManuallyDrop<T>,
}

impl<T> UnprotectedRefs<T> {
    #[inline(always)]
    const fn new(value: T) -> Self {
        UnprotectedRefs { inner: core::mem::ManuallyDrop::new(value) }
    }

    #[inline(always)]
    fn into_inner(self) -> T {
        let mut this = core::mem::ManuallyDrop::new(self);
        unsafe { core::mem::ManuallyDrop::take(&mut this.inner) }
    }
}

impl<T> core::ops::Deref for UnprotectedRefs<T> {
    type Target = T;

    #[inline(always)]
    fn deref(&self) -> &Self::Target {
        unsafe { &*self.inner }
    }
}

impl<T> core::ops::DerefMut for UnprotectedRefs<T> {
    #[inline(always)]
    fn deref_mut(&mut self) -> &mut Self::Target {
        unsafe { &mut *self.inner }
    }
}

impl<T> Drop for UnprotectedRefs<T> {
    #[inline(always)]
    fn drop(&mut self) {
        unsafe { core::mem::ManuallyDrop::drop(&mut self.inner) }
    }
}
RalfJung commented 5 months ago

Miri is working as intended here, your original code has UB.

What you are looking for is something like https://github.com/rust-lang/rfcs/pull/3336. A union also works, indeed.

Closing as there's no Miri issue here.