Implement `AllocRef` on `Box`, `Rc`, and `Arc`

TimDiekmann commented 4 years ago

As requested in #53 and implemented in rust-lang/rust#71442, AllocRef is implemented on mutable references. Does anything stop us from implement it on Box, Rc, or Arc?

Amanieu commented 4 years ago

I feel that this is something that we can always add later if needed. I don't think there is a solid use case for this at the moment.

TimDiekmann commented 3 years ago

While the use cases for Box are rather limited, Rc and Arc makes allocators sharable without cloning. Before, this wasn't possible because an allocator needed to be mutable, but as this has changed, this proposal should be implemented at least for Rc and Arc. I'd also implement it for Box for convenience.

RustyYato commented 3 years ago

Implementing AllocRef on Box will also allow for owned polymorphic allocators, as described in #83, without the cost of reference counting

cosmicexplorer commented 2 months ago

@Amanieu:

I don't think there is a solid use case for this at the moment.

I am currently using core::alloc::Allocator to implement a regex engine with a C ABI interface that funnels all dynamic allocations through a CallbackAllocator so that the client application can easily track and distinguish allocations performed in separate parts of the regex engine (e.g. parsing the pattern string vs executing an nfa/lazy dfa). I am hoping to propose this for use in emacs (see emacs-devel discussion).

My CallbackAllocator that C clients can pass to my methods looks like this, and takes up 24 bytes:

#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]
#[repr(C)]
pub struct CallbackAllocator {
  ctx: Option<NonNull<c_void>>,
  alloc: Option<unsafe extern "C" fn(Option<NonNull<c_void>>, usize) -> Option<NonNull<c_void>>>,
  free: Option<unsafe extern "C" fn(Option<NonNull<c_void>>, NonNull<c_void>) -> ()>,
}

While the use cases for Box are rather limited, Rc and Arc makes allocators sharable without cloning.

As per @TimDiekmann's comment here, I believe there is a use case for Rc/Arc. Currently, when I parse a regexp pattern string into an AST, my top-level Expr enum allocates a Box for each level of recursion (I believe this is relatively idiomatic):

  pub enum Expr<L, A>
  where
    L: LiteralEncoding,
    A: Allocator,
  {
    /// `a`
    SingleLiteral(SingleLiteral<L::Single>),
    /// `\\` or `\+`
    EscapedLiteral(Escaped<L::Single>),
    /// `\<1-9>`
    Backref(Backref),
    /// `^` or `$`
    Anchor(Anchor),
    /// `[a-z]` or `\w` or `\s-`
    CharSelector(SingleCharSelector<L::Single, A>),
    /// `<expr><op>`
    Postfix {
      inner: Box<Expr<L, A>, A>,
      op: PostfixOp,
    },
    /// `(<expr>)`
    Group {
      kind: GroupKind,
      inner: Box<Expr<L, A>, A>,
    },
    /// `<expr>\|<expr>`
    Alternation { cases: Vec<Box<Expr<L, A>, A>, A> },
    /// `<expr><expr>`
    Concatenation {
      components: Vec<Box<Expr<L, A>, A>, A>,
    },
  }

However, I believe this also means I need to clone my 24-byte CallbackAllocator for each level of recursion, and if I expand the CallbackAllocator C ABI interface later to support more complex functionality with additional fields, I believe each level of my parse tree is going to take up additional bytes, even though my parse tree is only logically sharing a single allocator.

To solve this problem right now, I'm going to try creating an intermediate pub struct SharedAllocator(pub Arc<CallbackAllocator, CallbackAllocator>); and manually implement core::alloc::Allocator for it. If that works (and allows me to share an allocator reference that doesn't scale with the size of my allocator struct), then I think there could be an argument for exposing a core::alloc::Allocator impl for Rc/Arc that forwards to the underlying allocator reference.

Please feel free to let me know if I'm doing anything incorrect. Relying on a reference via Allocator::by_ref() actually compiles and runs, but produces a reference to a stack variable that doesn't remain valid across C ABI method calls. I believe Rc/Arc is probably the right way to address this issue to get an shareable allocator reference in heap memory.

cosmicexplorer commented 2 months ago

Hmmm, it looks like I might have misjudged, and I think Arc<A: Allocator> implementing Allocator would not help for my use case, since it looks like each ::alloc::sync::Arc instance also retains a whole copy of the allocator (keeping the extra 24 bytes I mentioned above):

pub struct Arc<
    T: ?Sized,
    #[unstable(feature = "allocator_api", issue = "32838")] A: Allocator = Global,
> {
    ptr: NonNull<ArcInner<T>>,
    phantom: PhantomData<ArcInner<T>>,
    alloc: A,
}

For my problem, I've simply made an unsafe allocator struct that keeps track of a raw pointer (see https://github.com/cosmicexplorer/emacs-regexp/commit/22aabb92ca08139e04c24e52cb2bc4467d442a25):

#[derive(Debug, Copy, Clone)]
#[repr(transparent)]
pub struct SharedAllocator(pub *const CallbackAllocator);
static_assertions::assert_eq_size!(usize, SharedAllocator);

impl SharedAllocator {
  pub fn from_alloc(
    alloc: CallbackAllocator,
  ) -> (Self, NonNull<CallbackAllocator>, CallbackAllocator) {
    let boxed_alloc: Box<CallbackAllocator, CallbackAllocator> = Box::new_in(alloc, alloc);
    let (boxed_alloc, alloc) = Box::into_raw_with_allocator(boxed_alloc);
    let boxed_alloc: NonNull<CallbackAllocator> = unsafe { NonNull::new_unchecked(boxed_alloc) };
    (
      Self(unsafe { mem::transmute(boxed_alloc) }),
      boxed_alloc,
      alloc,
    )
  }
}

I'm sorry for the noise, but I think my problem is not suited for this issue. Thanks!

CAD97 commented 2 months ago

The "most optimal" design would probably be to only have a single copy of the allocator stored at the AST root, and pass it down as necessary. Unfortunately, this is fundamentally unsafe and isn't super compatible with dtor cleanup, as now cleanup requires access to external state.

In this case, if you have the ability to break ABI, I'd have the C client create the vtable structure and give you (void *ctx, struct alloc_vtbl *vtbl), or have the client tag on any of their own state after a shared initial prefix for the vtable. Both resolve the size concern at the source instead of papering over it with another allocation and ime are reasonably common C idioms.

But, generally, when &impl ?Sized + Trait gets an impl, I personally do think it's usually a good idea to also provide implementations for the other valid receiver types as well.

CAD97 commented 2 months ago

Maybe don't trust this as it's entirely untested but this scratched an itch: an Arc which is allocated in the allocator which it owns:

pub struct AllocArc<A: Allocator> {
    inner: NonNull<MaybeUninit<A>>,
}

unsafe impl<A: Allocator> Send for AllocArc<A> where for<'a> Arc<A, &'a A>: Send {}
unsafe impl<A: Allocator> Sync for AllocArc<A> where for<'a> Arc<A, &'a A>: Sync {}

impl<A: Allocator> Drop for AllocArc<A> {
    fn drop(&mut self) {
        // SAFETY: questionable
        unsafe {
            let alloc: &A = self;
            let this = ManuallyDrop::new(Arc::from_raw_in(self.inner.as_ptr(), alloc));
            if Arc::strong_count(&this) == 1 {
                assert_eq!(Arc::weak_count(&this), 1);
                // move alloc to stack to drop it *after* freeing the Arc
                let ref alloc: A = self.inner.as_ref().assume_init_read();
                Arc::decrement_strong_count_in(self.inner.as_ptr(), alloc);
            } else {
                Arc::decrement_strong_count_in(self.inner.as_ptr(), alloc);
            }
        }
    }
}

impl<A: Allocator> Clone for AllocArc<A> {
    fn clone(&self) -> Self {
        // SAFETY: questionable
        unsafe {
            let alloc: &A = self;
            Arc::increment_strong_count_in(self.inner.as_ptr(), alloc);
        }
        Self { ..*self }
    }
}

impl<A: Allocator> AllocArc<A> {
    pub fn new(alloc: A) -> Self {
        let this = Arc::into_raw(Arc::new_in(MaybeUninit::uninit(), &alloc));
        let mut inner = NonNull::new(this.cast_mut()).unwrap();
        unsafe { inner.as_mut().write(alloc) };
        Self { inner }
    }
}

impl<A: Allocator> Deref for AllocArc<A> {
    type Target = A;
    fn deref(&self) -> &A {
        unsafe {
            self.inner.as_ref().assume_init_ref()
        }
    }
}

unsafe impl<A: Allocator> Allocator for AllocArc<A> {
    fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        (**self).allocate(layout)
    }

    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        (**self).deallocate(ptr, layout)
    }

    fn allocate_zeroed(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        (**self).allocate_zeroed(layout)
    }

    unsafe fn grow(
        &self,
        ptr: NonNull<u8>,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<NonNull<[u8]>, AllocError> {
        (**self).grow(ptr, old_layout, new_layout)
    }

    unsafe fn grow_zeroed(
        &self,
        ptr: NonNull<u8>,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<NonNull<[u8]>, AllocError> {
        (**self).grow_zeroed(ptr, old_layout, new_layout)
    }

    unsafe fn shrink(
        &self,
        ptr: NonNull<u8>,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<NonNull<[u8]>, AllocError> {
        (**self).shrink(ptr, old_layout, new_layout)
    }
}

Amanieu commented 2 months ago

This could probably work if you allow the Arc to just use the global allocator. If open a PR in rust-lang/rust to add impl AllocRef for Arc<A1, A2> (same for Rc) where both A1 and A2 are allocators then I would be happy to review and merge them.

rust-lang / wg-allocators

Implement `AllocRef` on `Box`, `Rc`, and `Arc` #54