I find the first point in particular very important. I see the second point as a side effect, because data structures that accept `&mut self` are not affected by it, but together with point 1 it is still very interesting! I'm sure we could get `#[global_allocator]` to work somehow with `AllocRef`, but implementing it this way is by far the most intuitive and easiest solution.
How does the last point play together with #43? `System` and `GlobalAlloc` are stable, so `impl<A: GlobalAlloc> AllocRef for A` would prohibit `impl AllocRef for System`. Is specialization an option here?
- We remove the existing `impl GlobalAlloc for System`.
- We add `impl AllocRef for System`.
- The blanket impl will take care of providing `GlobalAlloc` for `System` via the `AllocRef` impl.
> The blanket impl will take care of providing `GlobalAlloc` for `System` via the `AllocRef` impl.
I thought we'd `impl<A: GlobalAlloc> AllocRef for A`, not `impl<A: AllocRef> GlobalAlloc for A`?
Ah, you're right, I got mixed up. I think we can make this work with specialization, but I'll need to double-check.
```rust
#![feature(specialization)]

trait GlobalAlloc {
    fn global_alloc(&self) {}
}

trait AllocRef {
    fn alloc_ref(&self) {}
}

impl<A: GlobalAlloc> AllocRef for A {}

struct System;

impl GlobalAlloc for System {}
impl AllocRef for System {}

fn main() {
    System.global_alloc();
    System.alloc_ref();
}
```
This works even without the `default` keyword, but I'm not that deep into specialization. However, it fails to compile without `#![feature(specialization)]`.

EDIT: Actually this is very nice, as we can "specialize" the `System` implementation if there is a performance gain; otherwise we can rely on the blanket implementation.

This works because there is a default implementation in the trait; it doesn't forward the call from `AllocRef` to `GlobalAlloc`. However, the forwarding version also works with the sound subset of specialization (`min_specialization`):
```rust
#![feature(min_specialization)]

trait GlobalAlloc {
    fn global_alloc(&self); // no default implementation in the trait
}

trait AllocRef {
    fn alloc_ref(&self); // no default implementation in the trait
}

impl<A: GlobalAlloc> AllocRef for A {
    default fn alloc_ref(&self) { A::global_alloc(self) }
}

struct System;

impl GlobalAlloc for System {
    fn global_alloc(&self) {}
}

impl AllocRef for System {}

fn main() {
    System.global_alloc();
    System.alloc_ref();
}
```
> We can define a blanket implementation of `AllocRef` for all types implementing `GlobalAlloc`. This would allow us to change `#[global_allocator]` to require `AllocRef` instead of `GlobalAlloc`. This is not possible if `AllocRef` methods take `&mut self`.
I think you mixed this up as well, as we can call `&self` methods from `&mut self`:
```rust
unsafe impl<A: GlobalAlloc> AllocRef for A {
    fn alloc(&mut self, layout: Layout, init: AllocInit) -> Result<MemoryBlock, AllocErr> {
        // ZST checks etc.
        let raw_ptr = match init {
            AllocInit::Uninitialized => GlobalAlloc::alloc(self, layout), // takes &self
            AllocInit::Zeroed => GlobalAlloc::alloc_zeroed(self, layout), // takes &self
        };
        // error handling
    }
    // ...
}
```
> However, this works even with the sound subset of specialization (`min_specialization`).
Good to know it's working somehow :slightly_smiling_face:
> `AllocRef` can now be used by concurrent (lock-free) data structures which only have `&self` operations. Currently the allocator would need to be wrapped in a mutex even if the allocator is thread-safe.
This is not the most elegant solution, but it works:
```rust
// Uses the nightly allocator API as it existed at the time of this discussion.
use std::alloc::{AllocErr, AllocInit, AllocRef, Layout, MemoryBlock};
use std::ptr::NonNull;

struct MyAlloc;

unsafe impl AllocRef for &MyAlloc {
    fn alloc(&mut self, _: Layout, _: AllocInit) -> Result<MemoryBlock, AllocErr> {
        println!("alloc");
        Err(AllocErr)
    }

    unsafe fn dealloc(&mut self, _: NonNull<u8>, _: Layout) {}
}

struct DataStructure<A> {
    alloc: A,
}

impl<A> DataStructure<A>
where
    for<'a> &'a A: AllocRef,
{
    fn use_alloc(&self) {
        // `&self.alloc` creates a fresh `&A` temporary, which can serve as
        // the `&mut self` receiver of `AllocRef::alloc`.
        (&self.alloc).alloc(Layout::new::<u32>(), AllocInit::Zeroed);
    }
}

fn main() {
    let s = DataStructure { alloc: MyAlloc };
    s.use_alloc();
}
```
I remembered a prior discussion about moving away from `&mut self` that resulted in the conclusion that `&mut self` is required.

> I remembered a prior discussion about moving away from `&mut self` that resulted in the conclusion that `&mut self` is required

At least it is required to stay at the reference level; it's not necessarily required to keep the `mut`.
I'd like to summarize the (dis)advantages of `&self` and `&mut self` to have a full list in one place (no specific order, non-exhaustive):

| | `&self` | `&mut self` |
|---|---|---|
| Changing the internal state | requires some sort of interior mutability (either `Sync` or `!Sync`); atomic ordering may be tricky | trivial |
| Multithreading | depends on the kind of interior mutability | implicitly possible |
| Usage in data structures taking `&self` | trivial | requires `AllocRef` to be implemented on `&MyAlloc` and a `for<'a> &'a A` bound |
| Sharing (e.g. via `Rc` and `Arc`) | implicitly and explicitly (`by_ref`) possible | requires `AllocRef` to be implemented on `&MyAlloc` |
| Performance of unshared allocators used repeatedly in a linear fashion | may introduce overhead due to interior mutability | no impact for unshared allocators |
I think both variants are more or less equal. For `&self` one has to choose the kind of interior mutability in any case, which might be tricky; this probably results in a version using `Cell` for `!Sync` and a version using atomics for `Sync`. For sharing `&mut self` allocators this also applies, but for unshared allocators the `&mut self` variant is straightforward to implement. However, the ergonomics of `&MyAlloc` might be unintuitive.

To conclude: we need interior mutability, we only have to decide when it is required:

- `&self`: required when internal state is modified
- `&mut self`: required when sharing
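To make the `&self` row concrete, here is a minimal sketch of a bump allocator using `Cell` for interior mutability. This is a `!Sync` design; a `Sync` variant would replace `Cell<usize>` with `AtomicUsize` and a compare-exchange loop. `CellBump` and its methods are hypothetical names, and alignment handling is omitted:

```rust
use std::cell::Cell;

// Hypothetical sketch: `Cell` provides the interior mutability needed for
// `&self` allocation, at the cost of making the allocator `!Sync`.
struct CellBump {
    buf: Box<[u8]>,       // backing storage
    offset: Cell<usize>,  // current bump position
}

impl CellBump {
    fn with_capacity(cap: usize) -> CellBump {
        CellBump {
            buf: vec![0; cap].into_boxed_slice(),
            offset: Cell::new(0),
        }
    }

    // Allocation goes through `&self`; the state change happens via `Cell`.
    fn alloc_bytes(&self, size: usize) -> Option<*const u8> {
        let start = self.offset.get();
        let end = start.checked_add(size)?;
        if end > self.buf.len() {
            return None; // out of space
        }
        self.offset.set(end);
        Some(self.buf[start..end].as_ptr())
    }

    // Resetting still takes `&mut self`, so it can't overlap with anything
    // still holding a shared borrow of the arena.
    fn reset(&mut self) {
        self.offset.set(0);
    }
}

fn main() {
    let bump = CellBump::with_capacity(64);
    let a = bump.alloc_bytes(16);
    let b = bump.alloc_bytes(16); // a second allocation through `&self`
    assert!(a.is_some() && b.is_some());
}
```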
> `AllocRef` can now be used by concurrent (lock-free) data structures which only have `&self` operations.

Isn't this already possible through `impl AllocRef for &MyLockFreeAllocator`, `impl AllocRef for Arc<MyLockFreeAllocator>`, etc.?
I thought that was the point of naming the trait `AllocRef` rather than `Alloc`.
Yeah, if it's a reference anyhow, won't we always have access to an owned instance of the reference, as confusing as that sounds?
I still think that overall ergonomics are better if we change the methods to take `&self`. Consider this code, which needs to allocate memory with `&self` (e.g. a concurrent data structure):
```rust
struct Foo<'a, A>
where
    &'a A: AllocRef,
{
    alloc: &'a A,
}

impl<'a, A> Foo<'a, A>
where
    &'a A: AllocRef,
{
    fn alloc_without_mut(&self) {
        // Can't mutate the reference; we need to make a mutable copy of it.
        let mut a = self.alloc;
        a.alloc(Layout::new::<i32>());
    }
}
```
The version with `AllocRef` methods taking `&self` is much cleaner:
```rust
struct Bar<A: AllocRef> {
    alloc: A,
}

impl<A: AllocRef> Bar<A> {
    fn alloc_without_mut(&self) {
        self.alloc.alloc(Layout::new::<i32>());
    }
}
```
I don't think that the need for interior mutability in allocator implementations is a big issue: in practice every serious allocator will need to use it (even with `&mut self`) to support cloning `AllocRef` or to implement `AllocRef` for `&'a MyAlloc`.
I don't have any objections against this. @TimDiekmann did you previously experiment with a similar API?
That's the `GlobalAlloc` approach, and obviously `&mut self` is easier to implement if you don't need a `&self` version.
However, I'd like to come back to an old idea that came up again in #71: change `AllocRef` to take `&self`, add `AllocRefMut`, which takes `&mut self`, and provide a blanket implementation of `AllocRefMut where A: AllocRef`. This way we would have the best of both worlds.
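As a rough sketch of that split, with deliberately simplified method signatures (the real `AllocRef` has more methods and richer error types, so this is only illustrative):

```rust
use std::alloc::Layout;
use std::ptr::NonNull;

pub struct AllocErr;

// The shared-reference trait: allocation works through `&self`.
pub unsafe trait AllocRef {
    fn alloc(&self, layout: Layout) -> Result<NonNull<u8>, AllocErr>;
}

// The exclusive-reference trait: allocation requires `&mut self`.
pub unsafe trait AllocRefMut {
    fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr>;
}

// Blanket impl: every `&self` allocator is trivially usable where a
// `&mut self` allocator is expected (`&mut A` coerces to `&A`).
unsafe impl<A: AllocRef> AllocRefMut for A {
    fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        AllocRef::alloc(self, layout)
    }
}
```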
`AllocRefMut` isn't quite an idiomatic name; it should probably be `AllocRef` (for `&self`) and `AllocMut` (for `&mut self`). But since the mutable one will probably be much more common, I'd suggest just `AllocRef` and `SyncAllocRef` (since the point is to be able to make an allocator that is `Sync + Send`), or `Alloc` and `SyncAlloc`.
The allocator trait should not require `Send` or `Sync`, and thus should not be named after them.
Fair enough, a `&self` allocator that isn't thread-safe could be pretty common too. With that in mind, `AllocRef` and `AllocMut` would be the idiomatic names.
In fact, Bumpalo is an example of a `&self` allocator that isn't thread-safe.
> but since the mutable one will probably be much more common,
I disagree: I expect almost all allocators to support `&self`, because you need that to support sharing an allocator between multiple data structures.
Okay, perfect then. Most allocators would `impl AllocRef for Allocator`, and a few would `impl AllocMut for Allocator`.
I guess we could then have blanket `impl<A: AllocRef> AllocMut for A`, `impl<A: AllocRef> AllocRef for &mut A`, `impl<A: AllocRef> AllocRef for Rc<A>`, `impl<A: AllocRef> AllocRef for Arc<A>`, and probably a few others; a sketch follows below.
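A sketch of those forwarding impls, again using an illustrative, simplified `&self`-based `AllocRef` rather than the real signatures (the `AllocMut` blanket impl would look like the one sketched earlier):

```rust
use std::alloc::Layout;
use std::ptr::NonNull;
use std::rc::Rc;
use std::sync::Arc;

pub struct AllocErr;

pub unsafe trait AllocRef {
    fn alloc(&self, layout: Layout) -> Result<NonNull<u8>, AllocErr>;
}

// Handles to a `&self` allocator are themselves allocators:
unsafe impl<A: AllocRef + ?Sized> AllocRef for &mut A {
    fn alloc(&self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        (**self).alloc(layout) // shared reborrow through the `&mut`
    }
}

unsafe impl<A: AllocRef + ?Sized> AllocRef for Rc<A> {
    fn alloc(&self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        (**self).alloc(layout)
    }
}

unsafe impl<A: AllocRef + ?Sized> AllocRef for Arc<A> {
    fn alloc(&self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        (**self).alloc(layout)
    }
}
```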
And most collections would take an `AllocMut`. If they needed to be concurrent, they'd take `AllocRef + Send + Sync`.
I personally see little value in `AllocMut`, considering that all allocators are going to support `AllocRef` anyway. Is there any allocator that absolutely can't be implemented using just `AllocRef`?

I am particularly worried that we are doubling the API complexity for this.
Yeah, hmm. I can think of allocators that would be easier to implement with `AllocMut`, but nothing that's impossible.
I think we should ask whether there are allocators which may be slower using `&self`. Any logic could be wrapped in a mutex or similar.
I would think that allocator implementers could use `Cell`. It's a pain to use sometimes, but it should work for pretty much everything, except if they're wrapping a type that requires mutability. In that case, they can use `UnsafeCell`.
I would expect any allocator that doesn't natively support multithreading (via thread-local caches) to be `!Sync` and use some form of `Cell` internally. Such an allocator can still be shared across threads if the user wraps it in a `Mutex` or `ThreadLocal`, but that is obviously suboptimal.
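A small sketch of that wrapping idea: `Mutex<A>` is `Sync` whenever `A: Send`, so even a `Cell`-based, `!Sync` allocator can be shared at the cost of locking on every call. The names here are illustrative:

```rust
use std::cell::Cell;
use std::sync::Mutex;

// `!Sync` because of the `Cell`, but still `Send`.
struct LocalAlloc {
    bump: Cell<usize>,
}

fn assert_sync<T: Sync>() {}

fn main() {
    // assert_sync::<LocalAlloc>(); // ERROR: `Cell<usize>` is not `Sync`
    assert_sync::<Mutex<LocalAlloc>>(); // OK: the lock restores `Sync`
}
```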
As @Amanieu pointed out in https://github.com/rust-lang/wg-allocators/issues/55#issuecomment-683315523, I think it's obvious that we don't want `AllocRef` to take `&mut self`, as it's hard to use. In fact, I believe the pattern where `&A: Trait` is required is rarely used, and we should take care that using the API is as simple as possible, if necessary at the expense of implementing an allocator, since we can assume that the API is used more often than it is implemented. As a user of the API I expect to just pull in a crate, throw in an allocator, and have it work just like any other API.

As far as I can see, the only unresolved question is whether `&mut self` is needed for an allocator to perform well. Just like `BuildAllocRef` (#12), `AllocRefMut` (without having a strong opinion on the name) could be added in a backwards-compatible way. I think we should proceed and change the signature to `&self` for the sake of usability.
Hi there, I recently came across this issue when I was trying to understand why Bumpalo uses `&self` instead of `&mut self` for allocation. I wrote a blog article about my experiences trying out Bumpalo: https://blog.reverberate.org/2021/12/19/arenas-and-rust.html
For my use case, a noticeable downside of `&self` combined with a `!Sync` allocator like Bumpalo is that it makes allocator-aware structures `!Send`, even though Bumpalo itself is `Send`, and even when I am packing it into the same self-referential struct as the references.
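The auto-trait interaction described here can be reproduced in a few lines; `Arena` and `Holder` are illustrative stand-ins for Bumpalo's `Bump` and an allocator-aware structure:

```rust
use std::cell::Cell;

// `Cell` makes `Arena: !Sync`, but `Arena` is still `Send`.
struct Arena {
    offset: Cell<usize>,
}

// A structure that stashes a shared reference to the arena.
struct Holder<'a> {
    arena: &'a Arena,
}

fn assert_send<T: Send>() {}

fn main() {
    assert_send::<Arena>(); // OK: `Arena` is `Send`
    // assert_send::<Holder<'static>>();
    // ERROR: `&Arena: Send` requires `Arena: Sync`, which `Cell` forbids.
}
```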
One thing I didn't quite understand when reading this issue:

> requires `AllocRef` to be implemented on `&MyAlloc` and a `for<'a> &'a A` bound

What does this look like? What are the benefits/limitations of this approach? I'm still somewhat new to Rust, so I don't quite follow what it means to implement a trait on a reference, or what this bound means.
@haberman I read through your blog post. There is another significant issue with the `SyncBump` interface besides the inability to stash away multiple references to the allocator. It's a bit subtle, since it's hidden by lifetime inference, but it becomes more obvious if you expand it:

```rust
pub fn alloc<'a, T>(&'a mut self, val: T) -> &'a mut T { ... }
```

Because the returned lifetime is tied to a mutable borrow of `self`, this will continue to mutably borrow the allocator as long as the returned reference exists. Effectively, this will prevent you from having multiple references to allocations in the arena at the same time. Only allowing a single allocation at a time makes an arena a lot less useful.
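A sketch of that borrow conflict, with a hypothetical arena exposing exactly this signature (the method body is a stand-in so the example compiles; only the signature matters):

```rust
struct SyncBump;

impl SyncBump {
    pub fn alloc<'a, T>(&'a mut self, val: T) -> &'a mut T {
        // Stand-in body: leak a box so we can return `&'a mut T`.
        Box::leak(Box::new(val))
    }
}

fn main() {
    let mut arena = SyncBump;
    let a = arena.alloc(1u32);
    // let b = arena.alloc(2u32);
    // ERROR: cannot borrow `arena` as mutable a second time
    // while `a` is still alive.
    *a += 1;
}
```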
@Amanieu Thank you for reading and for the correction! I received this correction on Twitter also but didn't have a chance to update the article yet.
I'm having trouble understanding how `AllocRef` gets around this problem. Matt explained this to me here but I'm not quite following.
`std::alloc::Allocator::allocate` currently returns `Result<NonNull<[u8]>, AllocError>`. That return type has no lifetime parameter, not even an inferred one, unlike `SyncBump::alloc(&'a mut self) -> &'a mut T`, where the lifetime `'a` makes the return type borrow from `self`.

`std::alloc::Allocator::allocate` requires allocated memory to stay allocated as long as the `std::alloc::Allocator` implementation is alive, or until `deallocate` is called.
To fulfill this requirement, `std::alloc::Allocator` must be implemented for a reference to the arena (let's call it `Arena`), which guarantees that allocations do not get reused. This can only work with `&Arena`, with `Arena::reset` taking `&mut self`.
This has several advantages:

- `AllocRef` can now be used by concurrent (lock-free) data structures which only have `&self` operations. Currently the allocator would need to be wrapped in a mutex even if the allocator is thread-safe.
- Allocator implementations can decide whether or not to be `Sync`. This bound is inherited by data structures using an allocator: a lock-free queue would only be `Sync` if its allocator is `Sync`.
- We can define a blanket implementation of `AllocRef` for all types implementing `GlobalAlloc`. This would allow us to change `#[global_allocator]` to require `AllocRef` instead of `GlobalAlloc`. This is not possible if `AllocRef` methods take `&mut self`.