maciejhirsz / beef

Faster, more compact implementation of std::borrow::Cow
https://crates.io/crates/beef
Apache License 2.0
338 stars 17 forks source link

Beef::owned_ptr is (probably) unsound as written #7

Closed CAD97 closed 4 years ago

CAD97 commented 4 years ago

because there is a semantic difference between &mut self as *mut Self and Self::into_raw(self). The API Beef uses should at the very least be forwards compatible with being implemented in terms of into_raw_parts, which will also help safeguard against misuse. I would suggest functions isomorphic to into_raw(owned: Self::Owned) -> (NonNull<Self>, Option<NonZeroUsize>) and from_raw(ptr: NonNull<Self>, cap: NonZeroUsize) -> Self::Owned.

To be fair, this is probably a lot more benign than the Rc case which used &T as *const T, but this is still problematic. IIUC, the actual UB under Stacked Borrows would come when dropping the owned value, as your pointer no longer actually owns the place it points to, and merely borrows it. Except... mem::forget counts as a use of the pointer per SB, I think. cc @RalfJung if you don't mind?

Roughly, and abusing the SB notation, here's the operations I think go on:

So I've talked myself from "(probably)" to "(potentially)", but I'd still advise changing the API surface forwards compatible with the definitely-not unsound into_raw_parts. At the very least, it avoids a misuse vector for the API.

With apologies to the Stacked Borrow team for my abuse of their notation above.

maciejhirsz commented 4 years ago

So if I understand it all correctly this should be a sufficient fix until into_raw_parts is stabilized?

    #[inline]
    unsafe fn owned_ptr(owned: Vec<T>) -> NonNull<[T]> {
        let mut owned = mem::ManuallyDrop::new(owned);
        let ptr = owned.as_mut_ptr();
        let len = owned.len();

        NonNull::new_unchecked(slice_from_raw_parts_mut(ptr, len))
    }

Edit: mem::forget to mem::ManuallyDrop, though I don't know that it makes any difference.

CAD97 commented 4 years ago

Given the current implementation of .into_raw_parts() uses .as_mut_ptr(), yeah, I think that's the best we can do on stable. I initially worried our version was calling <[_]>::as_mut_ptr, but Vec::as_mut_ptr does exist and is public.

The main advantage of using ManuallyDrop over mem::forget is that it sidesteps the question of whether mem::forget counts as a use. mem::forget is implemented in terms of ManuallyDrop, anyway. In addition, ManuallyDrop seems to be favored for new code over using mem::forget, and it more clearly communicates that you're preparing to tear the value apart when you wrap it in ManuallyDrop.

maciejhirsz commented 4 years ago

Cool, also changing the trait to:

pub unsafe trait Beef: ToOwned {
    fn owned_into_parts(owned: Self::Owned) -> (NonNull<Self>, Option<NonZeroUsize>);

    unsafe fn owned_from_parts(ptr: NonNull<Self>, capacity: NonZeroUsize) -> Self::Owned;
}

capacity and owned_ptr are rolled into owned_into_parts, and rebuild is renamed to owned_from_parts. Functions are symmetric and easily mapped to into_raw_parts / from_raw_parts, so it should be easy to understand what's happening here.

RalfJung commented 4 years ago

Looks like I missed the action. ;)

mem::forget counts as a use of the pointer per SB, I think.

Yes. Passing a value to a function or returning it from a function is a "use" at the given type, such values must be valid for that type.

Also see https://github.com/rust-lang/rust/pull/69618: mem::forget should be avoided when dealing with memory (as opposed to, e.g., deliberately "leaking" a file descriptor).

maciejhirsz commented 4 years ago

Also see rust-lang/rust#69618: mem::forget should be avoided when dealing with memory (as opposed to, e.g., deliberately "leaking" a file descriptor).

Makes sense, thanks for clarifying!