TimDiekmann opened this issue 4 years ago (status: open)
I feel that this is something that we can always add later if needed. I don't think there is a solid use case for this at the moment.
While the use cases for `Box` are rather limited, `Rc` and `Arc` make allocators shareable without cloning. Previously this wasn't possible because an allocator needed to be mutable, but now that this has changed, this proposal should be implemented at least for `Rc` and `Arc`. I'd also implement it for `Box` for convenience.
Implementing `AllocRef` on `Box` will also allow for owned polymorphic allocators, as described in #83, without the cost of reference counting.
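To make the proposal concrete, here is a rough sketch of the forwarding impls in question. Since `core::alloc::Allocator` is still nightly-only, this uses a simplified stand-in trait (`MiniAlloc`, hypothetical, as is the `Heap` allocator); the real impls would delegate in exactly the same way:

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::rc::Rc;
use std::sync::Arc;

// Simplified stand-in for the nightly `core::alloc::Allocator` trait.
trait MiniAlloc {
    fn allocate(&self, layout: Layout) -> *mut u8;
    unsafe fn deallocate(&self, ptr: *mut u8, layout: Layout);
}

// A trivial allocator backed by the global heap.
struct Heap;

impl MiniAlloc for Heap {
    fn allocate(&self, layout: Layout) -> *mut u8 {
        unsafe { alloc(layout) }
    }
    unsafe fn deallocate(&self, ptr: *mut u8, layout: Layout) {
        dealloc(ptr, layout)
    }
}

// The forwarding impls this issue asks for: the smart pointer
// merely delegates to the allocator it points at.
impl<A: MiniAlloc + ?Sized> MiniAlloc for Rc<A> {
    fn allocate(&self, layout: Layout) -> *mut u8 {
        (**self).allocate(layout)
    }
    unsafe fn deallocate(&self, ptr: *mut u8, layout: Layout) {
        (**self).deallocate(ptr, layout)
    }
}

impl<A: MiniAlloc + ?Sized> MiniAlloc for Arc<A> {
    fn allocate(&self, layout: Layout) -> *mut u8 {
        (**self).allocate(layout)
    }
    unsafe fn deallocate(&self, ptr: *mut u8, layout: Layout) {
        (**self).deallocate(ptr, layout)
    }
}

fn main() {
    // Two handles share one allocator without cloning its state.
    let shared: Rc<Heap> = Rc::new(Heap);
    let handle = Rc::clone(&shared);
    let layout = Layout::new::<u64>();
    let p = handle.allocate(layout);
    assert!(!p.is_null());
    unsafe { handle.deallocate(p, layout) };
    println!("ok");
}
```

Because the impls only require `&self`, they became possible once `AllocRef` switched to shared-reference receivers.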
@Amanieu:

> I don't think there is a solid use case for this at the moment.

I am currently using `core::alloc::Allocator` to implement a regex engine with a C ABI interface that funnels all dynamic allocations through a `CallbackAllocator`, so that the client application can easily track and distinguish allocations performed in separate parts of the regex engine (e.g. parsing the pattern string vs executing an NFA/lazy DFA). I am hoping to propose this for use in emacs (see the emacs-devel discussion).
My `CallbackAllocator` that C clients can pass to my methods looks like this, and takes up 24 bytes:
```rust
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]
#[repr(C)]
pub struct CallbackAllocator {
    ctx: Option<NonNull<c_void>>,
    alloc: Option<unsafe extern "C" fn(Option<NonNull<c_void>>, usize) -> Option<NonNull<c_void>>>,
    free: Option<unsafe extern "C" fn(Option<NonNull<c_void>>, NonNull<c_void>) -> ()>,
}
```
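As a sanity check on that 24-byte figure: niche optimization makes each `Option`-wrapped non-null pointer a single word, so the struct is three words (24 bytes on 64-bit targets). A runnable mock-up, where `demo_alloc`/`demo_free` are hypothetical stand-ins for the client's callbacks, backed here by the Rust global heap:

```rust
use std::ffi::c_void;
use std::mem::size_of;
use std::ptr::NonNull;

// Mirror of the struct from the comment above: one context pointer
// plus two nullable C function pointers.
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]
#[repr(C)]
pub struct CallbackAllocator {
    ctx: Option<NonNull<c_void>>,
    alloc: Option<unsafe extern "C" fn(Option<NonNull<c_void>>, usize) -> Option<NonNull<c_void>>>,
    free: Option<unsafe extern "C" fn(Option<NonNull<c_void>>, NonNull<c_void>) -> ()>,
}

// Hypothetical client callback: allocate `size` bytes from the global heap.
unsafe extern "C" fn demo_alloc(
    _ctx: Option<NonNull<c_void>>,
    size: usize,
) -> Option<NonNull<c_void>> {
    let layout = std::alloc::Layout::from_size_align(size, 8).ok()?;
    NonNull::new(std::alloc::alloc(layout) as *mut c_void)
}

// Hypothetical client callback: free a 16-byte block. A real client
// would track sizes through `ctx` instead of hard-coding them.
unsafe extern "C" fn demo_free(_ctx: Option<NonNull<c_void>>, ptr: NonNull<c_void>) {
    let layout = std::alloc::Layout::from_size_align(16, 8).unwrap();
    std::alloc::dealloc(ptr.as_ptr() as *mut u8, layout);
}

fn main() {
    // Niche optimization: each Option-wrapped pointer is one word,
    // so the whole struct is three words (24 bytes on 64-bit).
    assert_eq!(size_of::<CallbackAllocator>(), 3 * size_of::<usize>());

    let a = CallbackAllocator { ctx: None, alloc: Some(demo_alloc), free: Some(demo_free) };
    unsafe {
        let p = (a.alloc.unwrap())(a.ctx, 16).expect("allocation failed");
        (a.free.unwrap())(a.ctx, p);
    }
    println!("size = {}", size_of::<CallbackAllocator>());
}
```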
> While the use cases for Box are rather limited, Rc and Arc makes allocators sharable without cloning.
As per @TimDiekmann's comment here, I believe there is a use case for `Rc`/`Arc`. Currently, when I parse a regexp pattern string into an AST, my top-level `Expr` enum allocates a `Box` for each level of recursion (I believe this is relatively idiomatic):
```rust
pub enum Expr<L, A>
where
    L: LiteralEncoding,
    A: Allocator,
{
    /// `a`
    SingleLiteral(SingleLiteral<L::Single>),
    /// `\\` or `\+`
    EscapedLiteral(Escaped<L::Single>),
    /// `\<1-9>`
    Backref(Backref),
    /// `^` or `$`
    Anchor(Anchor),
    /// `[a-z]` or `\w` or `\s-`
    CharSelector(SingleCharSelector<L::Single, A>),
    /// `<expr><op>`
    Postfix {
        inner: Box<Expr<L, A>, A>,
        op: PostfixOp,
    },
    /// `(<expr>)`
    Group {
        kind: GroupKind,
        inner: Box<Expr<L, A>, A>,
    },
    /// `<expr>\|<expr>`
    Alternation { cases: Vec<Box<Expr<L, A>, A>, A> },
    /// `<expr><expr>`
    Concatenation {
        components: Vec<Box<Expr<L, A>, A>, A>,
    },
}
```
However, I believe this also means I need to clone my 24-byte `CallbackAllocator` for each level of recursion, and if I expand the `CallbackAllocator` C ABI interface later to support more complex functionality with additional fields, each level of my parse tree is going to take up additional bytes, even though my parse tree is only logically sharing a single allocator.
To solve this problem right now, I'm going to try creating an intermediate `pub struct SharedAllocator(pub Arc<CallbackAllocator, CallbackAllocator>);` and manually implementing `core::alloc::Allocator` for it. If that works (and allows me to share an allocator reference that doesn't scale with the size of my allocator struct), then I think there could be an argument for exposing a `core::alloc::Allocator` impl for `Rc`/`Arc` that forwards to the underlying allocator reference.
Please feel free to let me know if I'm doing anything incorrectly. Relying on a reference via `Allocator::by_ref()` actually compiles and runs, but produces a reference to a stack variable that doesn't remain valid across C ABI method calls. I believe `Rc`/`Arc` is probably the right way to address this and get a shareable allocator reference in heap memory.
Hmmm, it looks like I might have misjudged: I think `Arc<A: Allocator>` implementing `Allocator` would not help for my use case, since each `::alloc::sync::Arc` instance also retains a whole copy of the allocator (keeping the extra 24 bytes I mentioned above):
```rust
pub struct Arc<
    T: ?Sized,
    #[unstable(feature = "allocator_api", issue = "32838")] A: Allocator = Global,
> {
    ptr: NonNull<ArcInner<T>>,
    phantom: PhantomData<ArcInner<T>>,
    alloc: A,
}
```
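The per-handle overhead is easy to demonstrate with a toy mock-up of that layout — plain structs standing in for `Arc`, since `allocator_api` is nightly-only, and all type names here are hypothetical:

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::ptr::NonNull;

// A three-word allocator, like the 24-byte `CallbackAllocator` above.
struct FatAlloc {
    _ctx: *const (),
    _alloc: *const (),
    _free: *const (),
}

// Mock of today's `Arc<T, A>` layout: the allocator is stored inline,
// so every handle carries a full copy of it.
struct InlineHandle<T, A> {
    _ptr: NonNull<T>,
    _phantom: PhantomData<T>,
    _alloc: A,
}

// Alternative: the handle stores only a pointer to a shared allocator.
struct SharedHandle<T, A> {
    _ptr: NonNull<T>,
    _alloc: *const A,
}

fn main() {
    // Every inline handle pays for the whole allocator...
    assert_eq!(
        size_of::<InlineHandle<u8, FatAlloc>>(),
        size_of::<NonNull<u8>>() + size_of::<FatAlloc>()
    );
    // ...while a shared handle pays one pointer, regardless of allocator size.
    assert_eq!(size_of::<SharedHandle<u8, FatAlloc>>(), 2 * size_of::<usize>());
    println!(
        "inline = {}, shared = {}",
        size_of::<InlineHandle<u8, FatAlloc>>(),
        size_of::<SharedHandle<u8, FatAlloc>>()
    );
}
```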
For my problem, I've simply made an unsafe allocator struct that keeps track of a raw pointer (see https://github.com/cosmicexplorer/emacs-regexp/commit/22aabb92ca08139e04c24e52cb2bc4467d442a25):
```rust
#[derive(Debug, Copy, Clone)]
#[repr(transparent)]
pub struct SharedAllocator(pub *const CallbackAllocator);
static_assertions::assert_eq_size!(usize, SharedAllocator);

impl SharedAllocator {
    pub fn from_alloc(
        alloc: CallbackAllocator,
    ) -> (Self, NonNull<CallbackAllocator>, CallbackAllocator) {
        let boxed_alloc: Box<CallbackAllocator, CallbackAllocator> = Box::new_in(alloc, alloc);
        let (boxed_alloc, alloc) = Box::into_raw_with_allocator(boxed_alloc);
        let boxed_alloc: NonNull<CallbackAllocator> = unsafe { NonNull::new_unchecked(boxed_alloc) };
        (Self(boxed_alloc.as_ptr()), boxed_alloc, alloc)
    }
}
```
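The core of this pattern — a `Copy`, pointer-sized handle that forwards allocation calls to a heap-pinned allocator — can be sketched on stable Rust with inherent methods instead of the nightly `Allocator` trait (all names here are hypothetical stand-ins):

```rust
use std::alloc::Layout;
use std::mem::size_of;

// Toy allocator with inline state, standing in for the 24-byte
// `CallbackAllocator`; here it just routes to the global heap.
struct CountingAlloc {
    tag: u32,
}

impl CountingAlloc {
    fn allocate(&self, layout: Layout) -> *mut u8 {
        unsafe { std::alloc::alloc(layout) }
    }
    unsafe fn deallocate(&self, ptr: *mut u8, layout: Layout) {
        std::alloc::dealloc(ptr, layout)
    }
}

// One-word handle that forwards to a heap-pinned allocator, mirroring
// `SharedAllocator` above. SAFETY contract: the pointee must stay alive
// as long as any handle (or memory allocated through one) exists.
#[derive(Clone, Copy)]
#[repr(transparent)]
struct SharedAllocator(*const CountingAlloc);

impl SharedAllocator {
    fn allocate(&self, layout: Layout) -> *mut u8 {
        unsafe { (*self.0).allocate(layout) }
    }
    unsafe fn deallocate(&self, ptr: *mut u8, layout: Layout) {
        (*self.0).deallocate(ptr, layout)
    }
}

fn main() {
    // The handle stays pointer-sized no matter how large the allocator grows.
    assert_eq!(size_of::<SharedAllocator>(), size_of::<usize>());

    // Pin the allocator on the heap so handles outlive the creating stack
    // frame (the problem `Allocator::by_ref` runs into).
    let pinned: *mut CountingAlloc = Box::into_raw(Box::new(CountingAlloc { tag: 7 }));
    let handle = SharedAllocator(pinned);
    let layout = Layout::new::<[u8; 32]>();
    let p = handle.allocate(layout);
    assert!(!p.is_null());
    unsafe {
        assert_eq!((*pinned).tag, 7);
        handle.deallocate(p, layout);
        drop(Box::from_raw(pinned));
    }
    println!("ok");
}
```

The cost is that lifetime management becomes entirely manual, which is why the commit linked above marks the struct unsafe.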
I'm sorry for the noise, but I think my problem is not suited for this issue. Thanks!
The "most optimal" design would probably be to only have a single copy of the allocator stored at the AST root, and pass it down as necessary. Unfortunately, this is fundamentally unsafe and isn't super compatible with dtor cleanup, as now cleanup requires access to external state.
In this case, if you have the ability to break ABI, I'd have the C client create the vtable structure and give you `(void *ctx, struct alloc_vtbl *vtbl)`, or have the client tag on any of their own state after a shared initial prefix for the vtable. Both resolve the size concern at the source instead of papering over it with another allocation, and in my experience both are reasonably common C idioms.
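That ABI split can be sketched in Rust's C representation (all names hypothetical; `demo_alloc`/`demo_free` stand in for the client's callbacks): the client defines one long-lived vtable, and the engine stores just two pointers per handle, no matter how many entries the vtable later grows.

```rust
use std::ffi::c_void;
use std::mem::size_of;

// Hypothetical vtable the C client defines once, statically.
// Growing it later adds no per-handle cost on the engine side.
#[repr(C)]
struct AllocVtbl {
    alloc: unsafe extern "C" fn(*mut c_void, usize) -> *mut c_void,
    free: unsafe extern "C" fn(*mut c_void, *mut c_void),
    // future entries (realloc, stats hooks, ...) go here
}

// What the engine stores per handle: (void *ctx, struct alloc_vtbl *vtbl).
#[repr(C)]
struct ClientAlloc {
    ctx: *mut c_void,
    vtbl: *const AllocVtbl,
}

// Demo callbacks backed by the global heap; sizes hard-coded for the toy.
unsafe extern "C" fn demo_alloc(_ctx: *mut c_void, size: usize) -> *mut c_void {
    let layout = std::alloc::Layout::from_size_align(size, 8).unwrap();
    std::alloc::alloc(layout) as *mut c_void
}

unsafe extern "C" fn demo_free(_ctx: *mut c_void, ptr: *mut c_void) {
    let layout = std::alloc::Layout::from_size_align(64, 8).unwrap();
    std::alloc::dealloc(ptr as *mut u8, layout);
}

static VTBL: AllocVtbl = AllocVtbl { alloc: demo_alloc, free: demo_free };

fn main() {
    // Two pointers per handle, independent of vtable size.
    assert_eq!(size_of::<ClientAlloc>(), 2 * size_of::<usize>());

    let a = ClientAlloc { ctx: std::ptr::null_mut(), vtbl: &VTBL };
    unsafe {
        let p = ((*a.vtbl).alloc)(a.ctx, 64);
        assert!(!p.is_null());
        ((*a.vtbl).free)(a.ctx, p);
    }
    println!("handle size = {}", size_of::<ClientAlloc>());
}
```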
But, generally, when `&impl ?Sized + Trait` gets an impl, I personally do think it's usually a good idea to also provide implementations for the other valid receiver types as well.
Maybe don't trust this, as it's entirely untested, but this scratched an itch: an `Arc` which is allocated in the allocator which it owns:
```rust
// Requires nightly: #![feature(allocator_api)]
use std::alloc::{AllocError, Allocator, Layout};
use std::mem::{ManuallyDrop, MaybeUninit};
use std::ops::Deref;
use std::ptr::NonNull;
use std::sync::Arc;

pub struct AllocArc<A: Allocator> {
    inner: NonNull<MaybeUninit<A>>,
}

unsafe impl<A: Allocator> Send for AllocArc<A> where for<'a> Arc<A, &'a A>: Send {}
unsafe impl<A: Allocator> Sync for AllocArc<A> where for<'a> Arc<A, &'a A>: Sync {}

impl<A: Allocator> Drop for AllocArc<A> {
    fn drop(&mut self) {
        // SAFETY: questionable
        unsafe {
            let alloc: &A = self;
            let this = ManuallyDrop::new(Arc::from_raw_in(self.inner.as_ptr(), alloc));
            if Arc::strong_count(&this) == 1 {
                assert_eq!(Arc::weak_count(&this), 1);
                // move alloc to stack to drop it *after* freeing the Arc
                let ref alloc: A = self.inner.as_ref().assume_init_read();
                Arc::decrement_strong_count_in(self.inner.as_ptr(), alloc);
            } else {
                Arc::decrement_strong_count_in(self.inner.as_ptr(), alloc);
            }
        }
    }
}

impl<A: Allocator> Clone for AllocArc<A> {
    fn clone(&self) -> Self {
        // SAFETY: questionable
        unsafe {
            let alloc: &A = self;
            Arc::increment_strong_count_in(self.inner.as_ptr(), alloc);
        }
        Self { ..*self }
    }
}

impl<A: Allocator> AllocArc<A> {
    pub fn new(alloc: A) -> Self {
        let this = Arc::into_raw(Arc::new_in(MaybeUninit::uninit(), &alloc));
        let mut inner = NonNull::new(this.cast_mut()).unwrap();
        unsafe { inner.as_mut().write(alloc) };
        Self { inner }
    }
}

impl<A: Allocator> Deref for AllocArc<A> {
    type Target = A;

    fn deref(&self) -> &A {
        unsafe { self.inner.as_ref().assume_init_ref() }
    }
}

unsafe impl<A: Allocator> Allocator for AllocArc<A> {
    fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        (**self).allocate(layout)
    }

    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        (**self).deallocate(ptr, layout)
    }

    fn allocate_zeroed(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        (**self).allocate_zeroed(layout)
    }

    unsafe fn grow(
        &self,
        ptr: NonNull<u8>,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<NonNull<[u8]>, AllocError> {
        (**self).grow(ptr, old_layout, new_layout)
    }

    unsafe fn grow_zeroed(
        &self,
        ptr: NonNull<u8>,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<NonNull<[u8]>, AllocError> {
        (**self).grow_zeroed(ptr, old_layout, new_layout)
    }

    unsafe fn shrink(
        &self,
        ptr: NonNull<u8>,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<NonNull<[u8]>, AllocError> {
        (**self).shrink(ptr, old_layout, new_layout)
    }
}
```
This could probably work if you allow the `Arc` to just use the global allocator.

If you open a PR in rust-lang/rust to add `impl AllocRef for Arc<A1, A2>` (same for `Rc`) where both `A1` and `A2` are allocators, then I would be happy to review and merge them.
As requested in #53 and implemented in rust-lang/rust#71442, `AllocRef` is implemented on mutable references. Does anything stop us from implementing it on `Box`, `Rc`, or `Arc`?