rust-lang / wg-allocators

Home of the Allocators working group: Paving a path for a standard set of allocator traits to be used in collections!
http://bit.ly/hello-wg-allocators
205 stars 9 forks source link

Crazy idea: Raw(Box/Vec/Deque)Impl<T> traits #63

Open sollyucko opened 4 years ago

sollyucko commented 4 years ago

Rationale

  1. The current allocator APIs and proposals require using much more unsafe code than necessary.
  2. The current APIs do not allow for type-specific allocators (e.g. slab allocators).
  3. The current APIs for re-allocation always copy the entire block of data as-is.
  4. The current APIs use a fixed set of metadata associated with the pointer (the pointer itself + its alignment).

Core traits

These traits do not currently include a generic way to create instances; this is expected to be done through methods on allocators. Additionally, these traits do not control zeroing; this is expected to depend on the type of the allocator. (Are there any cases where creation/resizing is sometimes non-zeroing and sometimes zeroing, for the same value?)

Instances of these traits should free their memory on drop.

```rust /// # Safety /// - When converted into raw pointers, the references returned by Deref(Mut) must be valid until: /// - `Drop::drop` is called pub unsafe trait RawBoxImpl: DerefMut {} pub unsafe trait RawBoxSizedImpl: RawBoxImpl + Sized { type OfMaybeUninit: RawBoxMaybeUninitImpl; } pub unsafe trait RawBoxMaybeUninitImpl: RawBoxImpl> + Sized { type Init: RawBoxSizedImpl; fn initialize(self, value: T) -> Self::Init; } /// # Safety /// - When converted into raw pointers, the references returned by Deref(Mut) must be valid until: /// - `Drop::drop` is called /// - **`grow` is called** /// - **`shrink` is called** pub unsafe trait RawVecImpl: RawBoxImpl<[MaybeUninit]> { fn grow( &mut self, min_new_capacity: usize, target_capacity: usize, copier: impl FnOnce(&Cell, &Cell), // old/src, new/dest; caller must catch and re-throw panics ) -> Option<()>; fn grow_in_place( &mut self, min_new_capacity: usize, target_capacity: usize, copier: impl FnOnce(&Cell, &Cell), ) -> Option<()>; fn shrink( &mut self, target_capacity: usize, copier: impl FnOnce(&Cell, &Cell), ) -> Option<()>; fn shrink_in_place( &mut self, target_capacity: usize, copier: impl FnOnce(&Cell, &Cell), ) -> Option<()>; } /// This has identical methods to RawVecImpl. The difference is in the contract: /// calls to `*_in_place` may grow/shrink on both ends. /// /// # Safety /// - When converted into raw pointers, the references returned by Deref(Mut) must be valid until: /// - `Drop::drop` is called /// - `grow` is called /// - `shrink` is called /// - **`grow_in_place` is called** /// - **`shrink_in_place` is called** pub unsafe trait RawVecDequeImpl: RawBoxImpl<[MaybeUninit]> { fn grow( &mut self, min_new_capacity: usize, target_capacity: usize, copier: impl FnOnce(&Cell, &Cell), ) -> Option<()>; fn grow_in_place( &mut self, min_new_capacity: usize, target_capacity: usize, copier: impl FnOnce(&Cell, &Cell), ) -> Option<()>; fn shrink( &mut self, target_capacity: usize, copier: impl FnOnce(&Cell, &Cell), ) -> Option<()>; fn shrink_in_place( &mut self, target_capacity: usize, copier: impl FnOnce(&Cell, &Cell), ) -> Option<()>; } ```

Additional trait

This is used by types like Rc and Arc.

```rust /// # Safety /// - `Foo::unfreeze(foo.freeze())` must be equivalent to `foo`. pub unsafe trait Freeze: Sized { type Frozen: Copy + Sized; fn freeze(self) -> Self::Frozen; /// # Safety /// After calling unfreeze, copies of `frozen` must not be used in any way. unsafe fn unfreeze(frozen: Self::Frozen) -> Self; } #[repr(transparent)] pub struct FrozenCell(UnsafeCell); impl FrozenCell { pub fn from_mut(t: &mut T) -> &Self { // SAFETY: // FrozenCell is a repr(C) wrapper around UnsafeCell, // and UnsafeCell is a repr(C) wrapper around T. unsafe { &*(t as *mut T as *const FrozenCell) } } /// # Safety /// - When calling this function, references resulting from calling `deref` /// on any copy of `self` must not exist. /// - After calling this function, `deref` must not be called on any copy /// of `self`. #[allow(clippy::mut_from_ref)] pub unsafe fn assert_unique(&self) -> &mut T { // SAFETY: // The contract of this method disallows aliasing. &mut *self.0.get() } } impl Deref for FrozenCell { type Target = T; fn deref(&self) -> &T { // SAFETY: // Any number of shared references are okay. `assert_unique` is the only // method that produces a `&mut T`, and its contract disallows aliasing. unsafe { &*self.0.get() } } } ```

Usage examples

```rust pub mod boxed { use crate::alloc::RawBoxImpl; use std::marker::PhantomData; use std::mem; use std::ops::{Deref, DerefMut}; pub struct Box(Raw, PhantomData); impl> Box { pub fn new_from_raw(mut raw: Raw, value: T) -> Self { *raw = value; Self(raw, PhantomData) } pub fn into_raw(mut b: Self) -> *mut T { let result = b.0.deref_mut() as *mut T; mem::forget(b); result } pub fn leak<'a>(b: Self) -> &'a mut T where T: 'a, { let ptr = Box::into_raw(b); // SAFETY: // Since `mem::forget` (via `Box::into_raw`) has been called on `b`, and therefore effectively `b.0`, // no safe code can ever cause `Drop::drop` to be called on it. unsafe { &mut *ptr } } } impl> Deref for Box { type Target = T; fn deref(&self) -> &T { self.0.deref() } } impl> DerefMut for Box { fn deref_mut(&mut self) -> &mut T { self.0.deref_mut() } } } pub mod rc { use crate::alloc::{RawBoxMaybeUninitImpl, RawBoxSizedImpl}; use crate::freeze::Freeze; use std::cell::Cell; use std::intrinsics::abort; use std::marker::PhantomData; use std::ops::Deref; use std::ptr; mod internal { // Allow public-in-private due to type bounds use std::cell::Cell; pub struct RcBox { pub(in crate::rc) strong: Cell, pub(in crate::rc) weak: Cell, pub(in crate::rc) value: T, } } use self::internal::RcBox; pub struct Rc(Raw::Frozen, PhantomData) where Raw::Frozen: Deref>; impl>> Rc where Raw::Frozen: Deref>, { pub fn new_init_from_raw(raw_maybe_uninit: Raw::OfMaybeUninit, value: T) -> Rc { let raw = raw_maybe_uninit.initialize(RcBox { strong: Cell::new(1), weak: Cell::new(1), value, }); Self(raw.freeze(), PhantomData) } } impl Rc where Raw::Frozen: Deref>, { #[inline] pub fn strong(&self) -> usize { self.0.strong.get() } #[inline] fn inc_strong(&self) { let strong = self.strong(); if strong == 0 || strong == usize::MAX { abort(); } self.0.strong.set(strong + 1); } #[inline] fn dec_strong(&self) { self.0.strong.set(self.strong() - 1); } #[inline] pub fn weak(&self) -> usize { self.0.weak.get() } #[allow(dead_code)] #[inline] fn inc_weak(&self) { let weak = self.weak(); if weak == 0 || weak == usize::MAX { abort(); } self.0.weak.set(weak + 1); } #[inline] fn dec_weak(&self) { self.0.weak.set(self.weak() - 1); } } impl Clone for Rc where Raw::Frozen: Deref>, { fn clone(&self) -> Self { self.inc_strong(); Self(self.0, PhantomData) } } impl Drop for Rc where Raw::Frozen: Deref>, { fn drop(&mut self) { self.dec_strong(); if self.strong() == 0 { // SAFETY: // Since the *strong* reference count is now zero, we know that there must not // be any other references to the *value* unsafe { ptr::drop_in_place(&self.0.deref().value as *const T as *mut T); } self.dec_weak(); if self.weak() == 0 { // SAFETY: // Since the *weak* reference count is now zero, we know that there must not // be any other references to the *allocation* unsafe { drop(Raw::unfreeze(self.0)) } } } } } } pub mod vec { use crate::alloc::RawVecImpl; use std::marker::PhantomData; use std::mem::MaybeUninit; use std::ptr; pub struct Vec> { buf: Raw, len: usize, phantom: PhantomData, } impl> Vec { pub fn len(&self) -> usize { self.len } pub fn is_empty(&self) -> bool { self.len == 0 } pub fn reserve(&mut self, additional: usize) { let len = self.len; self.buf .grow(self.len + additional, self.len * 2, |old_cell, new_cell| { let old = old_cell.as_slice_of_cells(); let new = new_cell.as_slice_of_cells(); debug_assert!(old.len() >= len); debug_assert!(new.len() >= len + additional); if old.is_empty() { return; } let old_ptr = old_cell.as_ptr(): *mut [MaybeUninit] as *mut T as *const T; let new_ptr = new_cell.as_ptr(): *mut [MaybeUninit] as *mut T; // SAFETY: // - Both pointers were derived from references. // - The length copied is at most the length of the slice. // - `old` will no longer be used after this closure returns. unsafe { ptr::copy(old_ptr, new_ptr, len); } }); } } } ```

See also

sollyucko commented 4 years ago

Playground demo: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=cb2929506d2dc6024b73a7973c026899

sollyucko commented 4 years ago

I was able to implement this API for a typestate-ish-based safe upwards bump allocator:

```rust pub struct UpAlloc<'a, T>(&'a mut [MaybeUninit]); impl<'a, T> UpAlloc<'a, T> { pub fn new(buf: &'a mut [MaybeUninit]) -> Self { UpAlloc(buf) } pub fn alloc_uninit(&'a mut self) -> Option<(UpBox<'a, MaybeUninit>, UpAlloc<'a, T>)> { let (start, remainder) = self.0.split_first_mut()?; Some((UpBox(start), UpAlloc(remainder))) } pub fn alloc_init(&'a mut self, value: T) -> Option<(UpBox<'a, T>, UpAlloc<'a, T>)> { let (start, remainder) = self.0.split_first_mut()?; let start = start.write(value); Some((UpBox(start), UpAlloc(remainder))) } pub fn alloc_slice_uninit( &'a mut self, min_len: usize, target_len: usize, ) -> Option<(UpBox<'a, [MaybeUninit]>, UpAlloc<'a, T>)> { if min_len > self.0.len() { None } else if target_len > self.0.len() { Some((UpBox(self.0), UpAlloc(&mut []))) } else { let (start, remainder) = self.0.split_at_mut(target_len); Some((UpBox(start), UpAlloc(remainder))) } } } pub struct UpBox<'a, T: ?Sized>(&'a mut T); impl<'a, T: ?Sized> Deref for UpBox<'a, T> { type Target = T; fn deref(&self) -> &T { self.0 } } impl<'a, T: ?Sized> DerefMut for UpBox<'a, T> { fn deref_mut(&mut self) -> &mut T { self.0 } } // UNSAFE: // - `self.0` is a reference // - `self.0` is never modified unsafe impl<'a, T> RawBoxImpl for UpBox<'a, T> {} unsafe impl<'a, T> RawBoxSizedImpl for UpBox<'a, T> { type OfMaybeUninit = UpBox<'a, MaybeUninit>; } unsafe impl<'a, T> RawBoxMaybeUninitImpl for UpBox<'a, MaybeUninit> { type Init = UpBox<'a, T>; fn initialize(self, value: T) -> UpBox<'a, T> { UpBox(self.0.write(value)) } } unsafe impl<'a, T> Freeze for UpBox<'a, T> { type Frozen = &'a T; fn freeze(self) -> &'a T { self.0 } unsafe fn unfreeze(frozen: &'a T) -> Self { Self(&mut *(frozen as *const T as *mut T)) } } #[cfg(test)] mod tests { #[test] fn test_rc() { use super::{UpAlloc, UpBox}; use crate::rc::Rc; use std::mem::MaybeUninit; let mut buf: [_; 1024] = MaybeUninit::uninit_array(); let mut alloc = UpAlloc(&mut buf); let (x, _alloc) = alloc.alloc_uninit().unwrap(); Rc::>::new_init_from_raw(x, 123); } } ```

Playground: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=28dcb72cf6b64421b7302b1ed0183448

Wodann commented 3 years ago

I've been waiting for some other people in the WG to respond to this issue, because I didn't fully comprehend the need for a solution to the described problem.

As no one has responded yet, I'll give my two cents. As far as I understand, the goal of the AllocRef API is to provide a low-level allocator API that operates on bytes. External crates are free to implement type-based APIs, but it is unnecessary for it to be part of std.