Separate `dealloc` from `Alloc` into other trait

TimDiekmann commented 5 years ago

Most (all?) of the structs mentioned in #7 only need the dealloc method for Drop. It may be useful to split Alloc into two traits. However, we haven't yet settled on the exact layout of, and relationship between, those two traits. So far, these possibilities have come up:

* Make it a supertrait: trait Alloc: Dealloc { ... } (https://github.com/rust-lang/wg-allocators/issues/9#issuecomment-490628944)
* ~Associate the Alloc trait: trait Dealloc { type Alloc: Alloc; ... }~
* Introduce a trait: trait GetDealloc { unsafe fn get_dealloc() -> ???; }
* Same as above, but with get_alloc as a method in Alloc instead of an extra trait (https://github.com/rust-lang/wg-allocators/issues/9#issuecomment-490630584)
* Also split realloc into Realloc and associate Dealloc with Alloc and Realloc (https://github.com/rust-lang/wg-allocators/issues/9#issuecomment-538592891)
  - Also associate Alloc with Realloc (https://github.com/rust-lang/wg-allocators/issues/9#issuecomment-490650773)
* Even more complicated hierarchies (https://github.com/rust-lang/wg-allocators/issues/9#issuecomment-490633169)

Edits

2019/Oct/05: Reflect the thread's current solution proposals.

glandium commented 5 years ago

Other possibility:

trait Dealloc { ... } impl<T: Alloc> Dealloc for T { ... }, and change the relevant bounds to Dealloc.

SimonSapin commented 5 years ago

> only needs the dealloc method for dropping

This is not quite true. Other APIs like Box::clone or Rc::make_mut may need to allocate.

I think it’s important for this issue to provide some context and motivation.

Current proposals revolve around adding an A: Alloc type parameter to types such as Box<T>, Vec<T>, etc., and storing a value of type A inline in those structs. For "traditional" allocators like jemalloc that are process-global / singleton, A can be a zero-sized type. However, for allocators that might have multiple "instances", A needs to be a handle like &_ or Arc<_> in order to associate each collection value with its corresponding allocator instance. This means e.g. doubling size_of for Box, which has a non-trivial cost.

This issue is about reducing this cost in a narrow set of circumstances:

* The allocator has multiple instances, so allocating requires a non-zero-size handle
* Deallocation is a no-op (for example in a simple bump allocator) or otherwise doesn’t require a handle that points to the allocator instance.
* And the user is willing to give up on APIs like Box::clone, such that the box only ever needs to know how to deallocate, never allocate.

In that case we could in theory have zero-size deallocation-only handles to keep in Box<T, A>, in order to keep it small.
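To put rough numbers on the size argument, here is a minimal sketch. `Arena`, `DeallocOnly` and `MyBox` are hypothetical stand-ins invented for illustration, not part of any proposal, and the sizes noted are what current compilers produce for this layout rather than a language guarantee.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Hypothetical allocator with per-instance state, so allocating needs a
/// handle that points at one particular arena.
#[allow(dead_code)]
struct Arena {
    storage: Vec<u8>,
}

/// Hypothetical zero-sized handle that only knows how to deallocate.
#[allow(dead_code)]
struct DeallocOnly;

/// Simplified stand-in for `Box<T, A>`: a pointer plus an inline handle.
#[allow(dead_code)]
struct MyBox<T, A> {
    ptr: NonNull<T>,
    handle: A,
}

/// Size of a box that stores a full `&Arena` handle (two pointers wide).
fn size_with_full_handle() -> usize {
    size_of::<MyBox<u64, &Arena>>()
}

/// Size of a box that stores only the zero-sized handle (one pointer wide).
fn size_with_dealloc_only_handle() -> usize {
    size_of::<MyBox<u64, DeallocOnly>>()
}
```

Comparing the two functions shows the doubling described above: the reference handle adds one pointer to every box, while the zero-sized handle adds nothing.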

SimonSapin commented 5 years ago

So far I haven’t seen a complete proposal of what an API supporting this use case might look like. It’s not just the trait:

* Giving up on Clone and friends needs to be an opt-in choice, so there need to be dedicated APIs on collections in any case. Does that mean e.g. Box::new_dealloc_only_in in addition to Box::new_in? What’s the signature?
* Before we can get a deallocation-only handle that is appropriate for some allocation, that allocation needs to have been allocated at some point. Presumably with a “full” handle. Does that mean that a “full” handle knows how to downgrade itself to deallocation-only? What’s the API for that?

Before we accept it as a goal to support this use case, I’d like someone who wants it to come up with a more comprehensive API proposal. That should be the starting point of the discussion.

But if this adds significant complexity to the type signature even for users who do not use this feature, I’m not sure we should accept such a narrow use case.

TimDiekmann commented 5 years ago

> This is not quite true. Other APIs like Box::clone or Rc::make_mut may need to allocate.

I'm sorry, I think I expressed myself unclearly. The struct itself only needs Dealloc as a bound, since Drop only needs dealloc. Things like Box::clone could bind A: Alloc + Dealloc.

scottjmaddox commented 5 years ago

Yeah, the impl<T> for Box (and other collections) would just need to be split into impl<T, A: Dealloc> and impl<T, A: Alloc+Dealloc>. You wouldn't need a separate new_dealloc_only_in.

SimonSapin commented 5 years ago

I don’t understand. If new_dealloc_only_in is not needed, please provide the full signatures you would expect for the Box type, the constructor, and the destructor. In particular, how is the allocation owned by Box<T, A: Dealloc> created?

TimDiekmann commented 5 years ago

Not a signature, but with from_raw_in it would be possible. It's rather low-level, but for complex data structures this might make sense.

scottjmaddox commented 5 years ago

You're right, my previous suggestion was incorrect. However, it can be done like this (unless I'm missing something):

struct Box<T: ?Sized, D>(Unique<T>, D);

impl<T: ?Sized, D: Dealloc> Drop for Box<T, D> {
    fn drop(&mut self);
}

impl<T: ?Sized, D: Dealloc, A: Alloc<Dealloc=D>> Box<T, D> {
    fn new_in(x: T, a: A) -> Box<T, D>;
}

SimonSapin commented 5 years ago

@TimDiekmann So the only way to use this feature would require unsafe code?

@scottjmaddox So we’d have an Alloc::downgrade(self) -> Self::Dealloc method, and the choice of giving up on Box::clone or not (A = D) would be based on using a different allocator type?

scottjmaddox commented 5 years ago

Yes, you would need something like Alloc::downgrade(self) -> Self::Dealloc; perhaps just Alloc::get_dealloc(&self) -> Self::Dealloc. And as you say, some allocators would provide a type that implements Alloc + Dealloc instead of just Dealloc, and the former would impl Box::clone.

Ideally, there would be additional methods like Box::clone_in that accept an Alloc argument.
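A minimal runnable sketch of this shape, under stated assumptions: `MyBox`, `CountingAlloc` and `GlobalDealloc` are hypothetical names, the trait signatures are simplified guesses rather than the proposed API, and zero-sized layouts are rejected to keep it short. Here `A` is a parameter of the methods rather than of the impl block, since as far as I can tell an impl-level parameter that appears only in a bound is rejected as unconstrained.

```rust
use std::alloc::{self, Layout};
use std::cell::Cell;
use std::ptr::{self, NonNull};

trait Dealloc {
    /// Safety: `ptr` must come from the matching allocator with `layout`.
    unsafe fn dealloc(&self, ptr: NonNull<u8>, layout: Layout);
}

trait Alloc {
    type Dealloc: Dealloc;
    fn alloc(&self, layout: Layout) -> NonNull<u8>;
    fn get_dealloc(&self) -> Self::Dealloc;
}

/// Zero-sized deallocation handle: frees through the global allocator.
struct GlobalDealloc;

impl Dealloc for GlobalDealloc {
    unsafe fn dealloc(&self, ptr: NonNull<u8>, layout: Layout) {
        // SAFETY: forwarded from the caller's contract.
        unsafe { alloc::dealloc(ptr.as_ptr(), layout) }
    }
}

/// Pointer-sized allocation handle: it needs access to per-instance state
/// (here just a counter), while freeing needs no state at all.
struct CountingAlloc<'a> {
    count: &'a Cell<usize>,
}

impl Alloc for CountingAlloc<'_> {
    type Dealloc = GlobalDealloc;

    fn alloc(&self, layout: Layout) -> NonNull<u8> {
        assert!(layout.size() != 0, "zero-sized layouts are out of scope here");
        self.count.set(self.count.get() + 1);
        // SAFETY: the layout has a non-zero size.
        let raw = unsafe { alloc::alloc(layout) };
        NonNull::new(raw).unwrap_or_else(|| alloc::handle_alloc_error(layout))
    }

    fn get_dealloc(&self) -> GlobalDealloc {
        GlobalDealloc
    }
}

struct MyBox<T, D: Dealloc> {
    ptr: NonNull<T>,
    dealloc: D,
}

impl<T, D: Dealloc> MyBox<T, D> {
    /// Allocates with `a` but only keeps the downgraded handle.
    fn new_in<A: Alloc<Dealloc = D>>(value: T, a: &A) -> Self {
        let ptr = a.alloc(Layout::new::<T>()).cast::<T>();
        // SAFETY: `ptr` is freshly allocated, non-null and aligned for `T`.
        unsafe { ptr.as_ptr().write(value) };
        MyBox { ptr, dealloc: a.get_dealloc() }
    }

    /// Cloning needs an allocator again, so it is passed in explicitly.
    fn clone_in<A: Alloc<Dealloc = D>>(&self, a: &A) -> Self
    where
        T: Clone,
    {
        Self::new_in(self.get().clone(), a)
    }

    fn get(&self) -> &T {
        // SAFETY: `ptr` points to a live `T` owned by this box.
        unsafe { self.ptr.as_ref() }
    }
}

impl<T, D: Dealloc> Drop for MyBox<T, D> {
    fn drop(&mut self) {
        // SAFETY: the value is live and was allocated with this layout.
        unsafe {
            ptr::drop_in_place(self.ptr.as_ptr());
            self.dealloc.dealloc(self.ptr.cast(), Layout::new::<T>());
        }
    }
}
```

With this, `MyBox::new_in(41u32, &a)` for some `a: CountingAlloc` yields a `MyBox<u32, GlobalDealloc>` that is one pointer wide, and `clone_in` plays the role of `Box::clone` for boxes that kept only the deallocation handle.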

gnzlbg commented 5 years ago

> So the only way to use this feature would require unsafe code?

Yes. I don't know if the analogy helps, but have you used C++'s std::unique_ptr? It only needs a "custom deleter" to free itself on destruction. The std::unique_ptr itself is "move only", and cannot be implicitly cloned (it has no copy constructor/assignment).

IIUC what's being proposed here is the same. Box<T, A: Dealloc> is the bound on the type. This is useful, e.g., because you don't necessarily need to construct a Box via Box::new, you can also construct a Box from a raw pointer, e.g., coming from FFI (e.g. from a C++ unique_ptr).

Most of the Box functionality would just be impl<T, A: Dealloc> for Box<T, A> { ... }. As you mention, some of the functionality, like Box::new, would be in an impl<T, A: Alloc + Dealloc> for Box<T, A> { ... } and some of it, like Box::clone, in an impl<T: Clone, A: Alloc + Dealloc> for Box<T, A> { ... }.

scottjmaddox commented 5 years ago

@gnzlbg Is there any reason my suggestion for a safe new_in function would not work? There should certainly be from_raw_in, too. And Box::clone could still be in an impl<T: Clone, A: Alloc + Dealloc> for Box<T, A> { ... } block so that it's available if the allocator handle is Alloc+Dealloc.

gnzlbg commented 5 years ago

@scottjmaddox

One of the main use cases for Box<T, D: Dealloc> is FFI wrappers, where e.g. a C library gives you ownership of some value, and provides you with a function to free it. There is no way to clone that Box, or use the Box type to allocate anything else with it.

Ideally, you'd just implement Dealloc for a MyCResourceDeallocator ZST, and use Box<MyCResource, MyCResourceDeallocator> directly in C FFI.

I'm not sure how you would be able to achieve that with new_in, but I think this is a use case worth supporting.
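A rough sketch of that FFI pattern follows. Everything in it is hypothetical: the "C library" is simulated with plain Rust functions (a real binding would declare them in an `extern "C"` block), `OwnedPtr` stands in for `Box<T, D>`, and the `Dealloc` signature is a simplified guess that omits any layout parameter.

```rust
use std::alloc::Layout;
use std::ptr::NonNull;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for a value owned by a C library.
struct CResource {
    id: u32,
}

/// Counts calls to the simulated free function.
static FREED: AtomicUsize = AtomicUsize::new(0);

/// Simulated C constructor: hands out ownership of a heap value.
fn c_resource_new(id: u32) -> *mut CResource {
    Box::into_raw(Box::new(CResource { id }))
}

/// Simulated C free function: releases the memory, nothing else.
unsafe fn c_resource_free(ptr: *mut CResource) {
    FREED.fetch_add(1, Ordering::SeqCst);
    // SAFETY: `Box` allocated this from the global allocator with this layout.
    unsafe { std::alloc::dealloc(ptr.cast::<u8>(), Layout::new::<CResource>()) }
}

trait Dealloc {
    /// Safety: `ptr` must be owned and releasable through this handle.
    unsafe fn dealloc(&self, ptr: NonNull<u8>);
}

/// Zero-sized deallocator: it can free a `CResource`, but cannot allocate.
struct CResourceDeallocator;

impl Dealloc for CResourceDeallocator {
    unsafe fn dealloc(&self, ptr: NonNull<u8>) {
        // SAFETY: forwarded from the caller's contract.
        unsafe { c_resource_free(ptr.as_ptr().cast::<CResource>()) }
    }
}

struct OwnedPtr<T, D: Dealloc> {
    ptr: NonNull<T>,
    dealloc: D,
}

impl<T, D: Dealloc> OwnedPtr<T, D> {
    /// Safety: `ptr` must be non-null, point to a live `T`, and be
    /// releasable through `dealloc`.
    unsafe fn from_raw_in(ptr: *mut T, dealloc: D) -> Self {
        // SAFETY: non-null per the contract above.
        OwnedPtr { ptr: unsafe { NonNull::new_unchecked(ptr) }, dealloc }
    }

    fn get(&self) -> &T {
        // SAFETY: `ptr` points to a live `T` owned by this wrapper.
        unsafe { self.ptr.as_ref() }
    }
}

impl<T, D: Dealloc> Drop for OwnedPtr<T, D> {
    fn drop(&mut self) {
        // SAFETY: the value is live and owned; it is released exactly once.
        unsafe {
            std::ptr::drop_in_place(self.ptr.as_ptr());
            self.dealloc.dealloc(self.ptr.cast());
        }
    }
}

/// Safe wrapper around the one unsafe step.
fn open_resource(id: u32) -> OwnedPtr<CResource, CResourceDeallocator> {
    // SAFETY: the pointer comes straight from the matching constructor.
    unsafe { OwnedPtr::from_raw_in(c_resource_new(id), CResourceDeallocator) }
}
```

The deallocator never implements any allocation trait, and the resulting `OwnedPtr` stays one pointer wide because the handle is zero-sized.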

scott-maddox commented 5 years ago

@gnzlbg I totally agree that that's a use case worth supporting, and that from_raw_in is a great way to support that. I'm just asking if there's any reason you couldn't also have the new_in I suggested, so that there's a way to use this feature without unsafe.

gnzlbg commented 5 years ago

@scott-maddox would that require implementing Alloc for MyCResourceDeallocator?

scottjmaddox commented 5 years ago

@gnzlbg No, it would not. With my suggestion, implementing Alloc would require implementing Dealloc, but implementing Dealloc would not require implementing Alloc.

(Side note: this is the same person as scott-maddox; I meant to use this account.)

glandium commented 5 years ago

I was thinking about this a little, and I think this means there needs to be a different trait for realloc too. Because Alloc + Dealloc doesn't allow doing something specific for realloc instead of doing a dealloc + alloc sequence. So there would need to be a Realloc trait, as well as a default impl<A: Alloc + Dealloc> Realloc for A.

gnzlbg commented 5 years ago

> as well as a default impl<A: Alloc + Dealloc> Realloc for A.

The problem with that is that, without specialization, users cannot override that impl.

glandium commented 5 years ago

Thus "default" in my sentence.

gnzlbg commented 5 years ago

I thought that wasn't intended. If that's by design, then the main downside is still that we would be blocking the stabilization of these APIs on stable specialization. I'm not sure that would make strategic sense.

glandium commented 5 years ago

I'm not sure it would need to block on stable specialization. Implementers should be able to impl Realloc for their type whether specialization is stable or not, shouldn't they?

glandium commented 5 years ago

Without a Realloc trait, Alloc should keep both realloc and dealloc methods, and there should be an impl<A: Alloc> Dealloc for A (which is what I mentioned in https://github.com/rust-lang/wg-allocators/issues/9#issuecomment-489241766 already).

gnzlbg commented 5 years ago

Is there some already-stable magic that allows users to specialize default impls of liballoc without specialization?

If not, your blanket impl<A: Alloc> Dealloc for A has the same problem. There are two impls for your allocator: the blanket one that you provide (e.g. Dealloc and Realloc), and the one that a user might want to write. Without specialization, those two conflict.

glandium commented 5 years ago

A user wouldn't have to write a Dealloc impl if they write an Alloc impl, because dealloc is already in there.

gnzlbg commented 5 years ago

I’m not sure we are talking about the same trait hierarchy then.

I understood this issue as separating dealloc from the Alloc trait into a different trait, such that ‘trait Alloc: Dealloc { ... no dealloc here ... }’.

glandium commented 5 years ago

And I'm saying you can't detach dealloc entirely from the trait unless you detach realloc in yet another trait. Although with trait Alloc: Dealloc, that might work... but that was not the most discussed option from the topmost comment.

SimonSapin commented 5 years ago

> Implementers should be able to impl Realloc for their type whether specialization is stable or not, shouldn't they?

As far as I understand, no. Such an impl would conflict with impl<A: Alloc + Dealloc> Realloc for A.

And yes, it does sound like trait Alloc: Dealloc {…} would be required so that realloc can be a default method of the Alloc trait (with a default behavior based on alloc + copy + dealloc). Is there a downside to that?

glandium commented 5 years ago

There probably isn't a downside. All I'm saying at this point is that not using specialization limits the options we have in how this can be approached to trait Alloc: Dealloc and trait Dealloc { ... } impl<T: Alloc> Dealloc for T { ... } (with the dealloc function still being in Alloc), while we've only discussed the other options so far.

glandium commented 5 years ago

> As far as I understand, no. Such an impl would conflict with impl<A: Alloc + Dealloc> Realloc for A.

Tested, and that's unfortunately true. Specialization can't come soon enough :(
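For reference, a stripped-down reproduction of the conflict. The traits are reduced to toy versions (the `realloc_strategy` method and `MyAllocator` type are made up for illustration); only the overlap between the blanket impl and a hand-written one matters here.

```rust
trait Alloc {}
trait Dealloc {}

trait Realloc {
    /// Toy method: reports which implementation is in use.
    fn realloc_strategy(&self) -> &'static str;
}

/// Blanket fallback: every `Alloc + Dealloc` type gets a naive `Realloc`.
impl<A: Alloc + Dealloc> Realloc for A {
    fn realloc_strategy(&self) -> &'static str {
        "alloc + copy + dealloc"
    }
}

struct MyAllocator;
impl Alloc for MyAllocator {}
impl Dealloc for MyAllocator {}

// Uncommenting this impl is rejected on stable Rust with error E0119
// (conflicting implementations of trait `Realloc` for type `MyAllocator`),
// because it overlaps with the blanket impl above:
//
// impl Realloc for MyAllocator {
//     fn realloc_strategy(&self) -> &'static str {
//         "grow in place"
//     }
// }

/// Resolves `Realloc` for any allocator type.
fn strategy_of<A: Realloc>(a: &A) -> &'static str {
    a.realloc_strategy()
}
```

So with the blanket impl in place, `MyAllocator` is stuck with the naive strategy unless specialization lets the blanket impl be marked as overridable.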

TimDiekmann commented 5 years ago

> Tested, and that's unfortunately true. Specialization can't come soon enough :(

As specialization is on the 2019 roadmap, I think we can rely on it. I don't expect the allocator_api to be stabilized in the next 6 months?

gnzlbg commented 5 years ago

> As specialization is on the 2019 roadmap, I think we can rely on it.

I don't share your optimism, but I do think that we should try to keep this issue on topic.

We are mixing two issues here: whether it is worth separating dealloc from Alloc "somehow", and, if we had a hierarchy or set of allocator traits (Alloc, Dealloc, Realloc, ...), how we would design it. Maybe we should open a new issue about this other point to discuss the different ways to design that.

glandium commented 5 years ago

My point is that there are four ways to go about separating dealloc from Alloc that have been proposed in this issue. Two of them have received most of the discussion, and neither of those two appears to work out without a separate Realloc.

gnzlbg commented 5 years ago

AFAICT this would work:

trait Dealloc { fn dealloc(...); }
trait Alloc: Dealloc {
    fn alloc(...) -> ...;
    fn realloc(...) -> ... { /*can call both alloc and dealloc here*/ }
}
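Filling in the elided signatures gives a runnable version of this supertrait layout. The concrete signatures (`Layout` arguments, `Option` returns), the `Global` wrapper and `grow_demo` are assumptions made for this sketch, not the actual proposed API; zero-sized requests are simply refused.

```rust
use std::alloc::{self, Layout};
use std::ptr::{self, NonNull};

trait Dealloc {
    /// Safety: `ptr` must come from the matching allocator with `layout`.
    unsafe fn dealloc(&self, ptr: NonNull<u8>, layout: Layout);
}

trait Alloc: Dealloc {
    fn alloc(&self, layout: Layout) -> Option<NonNull<u8>>;

    /// Default: allocate a new block, copy, free the old one.
    /// Implementors may override this with something smarter.
    ///
    /// Safety: `ptr` must be a live allocation of this allocator with `old`.
    unsafe fn realloc(&self, ptr: NonNull<u8>, old: Layout, new_size: usize) -> Option<NonNull<u8>> {
        let new = Layout::from_size_align(new_size, old.align()).ok()?;
        let new_ptr = self.alloc(new)?;
        // SAFETY: both blocks are valid for the copied length and distinct.
        unsafe {
            ptr::copy_nonoverlapping(ptr.as_ptr(), new_ptr.as_ptr(), old.size().min(new_size));
            self.dealloc(ptr, old);
        }
        Some(new_ptr)
    }
}

/// Thin wrapper over the global allocator.
struct Global;

impl Dealloc for Global {
    unsafe fn dealloc(&self, ptr: NonNull<u8>, layout: Layout) {
        // SAFETY: forwarded from the caller's contract.
        unsafe { alloc::dealloc(ptr.as_ptr(), layout) }
    }
}

impl Alloc for Global {
    fn alloc(&self, layout: Layout) -> Option<NonNull<u8>> {
        if layout.size() == 0 {
            return None; // zero-sized requests are out of scope for this sketch
        }
        // SAFETY: the layout has a non-zero size.
        NonNull::new(unsafe { alloc::alloc(layout) })
    }
}

/// Grows a 4-byte buffer to 8 bytes through the default `realloc`,
/// zero-fills the new tail, and returns the final contents.
fn grow_demo() -> [u8; 8] {
    let a = Global;
    let old = Layout::from_size_align(4, 1).unwrap();
    let p = a.alloc(old).expect("allocation failed");
    let mut out = [0u8; 8];
    // SAFETY: every access stays within the blocks allocated above.
    unsafe {
        ptr::copy_nonoverlapping([1u8, 2, 3, 4].as_ptr(), p.as_ptr(), 4);
        let q = a.realloc(p, old, 8).expect("reallocation failed");
        q.as_ptr().add(4).write_bytes(0, 4);
        ptr::copy_nonoverlapping(q.as_ptr(), out.as_mut_ptr(), 8);
        a.dealloc(q, Layout::from_size_align(8, 1).unwrap());
    }
    out
}
```

Since `realloc` is a provided method of `Alloc`, an allocator can override it in its own impl without running into overlapping blanket impls.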

gnzlbg commented 5 years ago

This would also work (no super trait):

trait Dealloc { fn dealloc(...); }
trait Alloc {
    type Dealloc: Dealloc;
    fn alloc(...) -> ...;
    fn get_dealloc(&self) -> &Self::Dealloc;
    fn realloc(...) -> ... { 
        /* can call both self.alloc(...) and self.get_dealloc().dealloc(...) */ 
    }
}

gnzlbg commented 5 years ago

I don't see the other approaches discussed in the issue much, but the OP mentions:

Associate the Alloc trait: trait Dealloc { type Alloc: Alloc; ... }

This does not work for the FFI use case. It would mean that to implement Dealloc for a type, you would need another type with a meaningful Alloc implementation, which for that use case does not exist (The C API gives you ownership of some memory, and a way to free it, but no way to allocate anything).

I have nothing against exploring more complicated hierarchies:

trait Alloc { /*only:*/ fn alloc(...) -> ...; }
trait Dealloc { fn dealloc(...); }
trait Realloc: Alloc + Dealloc { fn realloc(...) -> ... { /* default using alloc and dealloc */ } }
trait CollectionAllocator: Realloc + .... { ... }
struct Vec<T, A: CollectionAllocator> { ... }

or other implementation approaches (e.g., blanket impls, specialization), or how we would extend those hierarchies in a backwards-compatible way if we discover later on that we need a new trait in the middle of the hierarchy, etc. But that looks like an overarching design question that can be worked out in parallel to this discussion.

petertodd commented 5 years ago

Yup, as long as the allocation doesn't need to be "in-place", realloc is an optimization over alloc if you already have a handle to the allocator available; if you don't, realloc can succeed where alloc can't, as some allocators could use the pointer to the allocation to get a pointer to the allocator. For example, Vec could be resized without a handle to the allocator.

But that design conflicts with the current one where creating zero-sized structures is a no-op, so probably not worth discussing further.

gnzlbg commented 5 years ago

> For example, Vec could be resized without a handle to the allocator.

@petertodd I think we could do this by using the API proposed in #12 on all collections (not only Box<T>).

SimonSapin commented 5 years ago

> The C API gives you ownership of some memory, and a way to free it, but no way to allocate anything

This sounds like this API is simply not an allocator. It has a destructor function that you are responsible for calling (because C), which is a job for the Drop trait and a wrapper trait more than for a Dealloc trait.

glandium commented 5 years ago

> This would also work (no super trait):
>
> trait Dealloc { fn dealloc(...); }
> trait Alloc {
>     type Dealloc: Dealloc;
>     fn alloc(...) -> ...;
>     fn get_dealloc(&self) -> &Self::Dealloc;
>     fn realloc(...) -> ... {
>         /* can call both self.alloc(...) and self.get_dealloc().dealloc(...) */
>     }
> }

For Box<T, Dealloc> to be possible, we'd need the opposite. But I think we don't actually need the whole get_something approach.

trait Dealloc { fn dealloc(...); }
trait Alloc: Dealloc {
    fn alloc(...) -> ...;
    fn realloc(...) -> ... { /*can call both alloc and dealloc here*/ }
}

struct Box<T, A: Dealloc>(...);

impl<T, A: Dealloc> Drop for Box<T, A> { ... }

impl<T: Clone, A: Alloc> Clone for Box<T, A> { ... }

is what we'd want, presumably.
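A compilable sketch of how those bounds could play out, with several assumptions: `MyBox` and `Global` are hypothetical names, the trait signatures are simplified guesses, zero-sized layouts are rejected, and the extra `A: Clone` bound on the `Clone` impl is an addition of this sketch (the copy needs a handle of its own).

```rust
use std::alloc::{self, Layout};
use std::ptr::NonNull;

trait Dealloc {
    /// Safety: `ptr` must come from the matching allocator with `layout`.
    unsafe fn dealloc(&self, ptr: NonNull<u8>, layout: Layout);
}

trait Alloc: Dealloc {
    fn alloc(&self, layout: Layout) -> NonNull<u8>;
}

/// Zero-sized handle that can both allocate and deallocate.
#[derive(Clone)]
struct Global;

impl Dealloc for Global {
    unsafe fn dealloc(&self, ptr: NonNull<u8>, layout: Layout) {
        // SAFETY: forwarded from the caller's contract.
        unsafe { alloc::dealloc(ptr.as_ptr(), layout) }
    }
}

impl Alloc for Global {
    fn alloc(&self, layout: Layout) -> NonNull<u8> {
        assert!(layout.size() != 0, "zero-sized layouts are out of scope here");
        // SAFETY: the layout has a non-zero size.
        let raw = unsafe { alloc::alloc(layout) };
        NonNull::new(raw).unwrap_or_else(|| alloc::handle_alloc_error(layout))
    }
}

struct MyBox<T, A: Dealloc> {
    ptr: NonNull<T>,
    a: A,
}

// Available for every handle, including deallocation-only ones.
impl<T, A: Dealloc> MyBox<T, A> {
    fn get(&self) -> &T {
        // SAFETY: `ptr` points to a live `T` owned by this box.
        unsafe { self.ptr.as_ref() }
    }
}

// Only handles that can allocate get a constructor...
impl<T, A: Alloc> MyBox<T, A> {
    fn new_in(value: T, a: A) -> Self {
        let ptr = a.alloc(Layout::new::<T>()).cast::<T>();
        // SAFETY: `ptr` is freshly allocated, non-null and aligned for `T`.
        unsafe { ptr.as_ptr().write(value) };
        MyBox { ptr, a }
    }
}

// ...and `Clone`; a `MyBox<T, D>` with `D: Dealloc` only has neither.
impl<T: Clone, A: Alloc + Clone> Clone for MyBox<T, A> {
    fn clone(&self) -> Self {
        MyBox::new_in(self.get().clone(), self.a.clone())
    }
}

impl<T, A: Dealloc> Drop for MyBox<T, A> {
    fn drop(&mut self) {
        // SAFETY: the value is live and was allocated with this layout.
        unsafe {
            std::ptr::drop_in_place(self.ptr.as_ptr());
            self.a.dealloc(self.ptr.cast(), Layout::new::<T>());
        }
    }
}
```

Because `Alloc: Dealloc`, the `A: Alloc` bound alone is enough for the constructor and for `Clone`; no `Alloc + Dealloc` spelling is needed.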

SimonSapin commented 5 years ago

I’d like us to take a step back for a moment. As library designers it can be satisfying to make APIs that are as general or flexible as possible, but do we know anyone who actually wants to use this? Or is this all hypothetical? Remember that none of this issue is relevant unless:

* There’s an allocator that requires a non-zero-size handle for allocation
* _And_ that allocator does **not** require a non-zero-size handle for deallocation
* _And_ the user is willing to give up on `clone` and any other API that needs to (re)allocate
* _And_ the cost of unnecessarily storing a full handle is significant

Secondly, if this is indeed a real use case, how important is it to use std::boxed::Box<T, A> for it? Could it just as well be served by a NoOpDeallocBox<T> type on crates.io?

This thread is quickly getting long, which is a sign that supporting this use case is not easy. But maybe it’s too niche to be worth the design complexity.

scottjmaddox commented 5 years ago

Now that you mention it, I think Realloc should be a separate trait, that way the collection methods that need it can be bounded on it precisely. And I don't think it needs to have a default impl. Implementing an allocator is not something to be taken on lightly. Adding one more trait impl is not that big of a deal. If we can somehow reserve the option to later add one once specialization is stable, that would be good, though.

Here's what I have in mind:


trait Alloc {
    type Realloc: Realloc;
    type Dealloc: Dealloc;
    ...
}
trait Realloc {
    type Dealloc: Dealloc;
    ...
}
trait Dealloc { ... }

struct Box<T: ?Sized, D: Dealloc>(Unique<T>, D);

impl<T: ?Sized, D: Dealloc> Drop for Box<T, D> {
    fn drop(&mut self);
}

impl<T: ?Sized, D: Dealloc, A: Alloc<Dealloc=D>> Box<T, D> {
    fn new_in(x: T, a: A) -> Box<T, D>;
}

pub struct RawVec<T, D: Dealloc> {
    ptr: Unique<T>,
    cap: usize,
    a: D,
}

impl<T: ?Sized, D: Dealloc> Drop for RawVec<T, D> {
    fn drop(&mut self);
}

impl<T: ?Sized, R: Realloc<Dealloc=R> + Dealloc> RawVec<T, R> {
    fn double(&mut self) -> RawVec<T, R>;
}

impl<T: ?Sized, R: Realloc<Dealloc=D>, D: Dealloc> RawVec<T, D> {
    fn double_in(&mut self, a: R) -> RawVec<T, D>;
}

If we switch the bounds to BuildAlloc, BuildRealloc, and BuildDealloc (see issue #12), this could potentially enable some really unique and clever allocator designs... Designs that aren't possible in any other language.

Edit: I'm tempted to go ahead and assume we'll be switching to AllocHandle, etc., and update my example, because it makes it significantly more clear...

Edit 2: Add Dealloc bound for Box struct, add RawVec struct definition, fix bounds for double

gnzlbg commented 5 years ago

> This sounds like this API is simply not an allocator. It has a destructor function that you are responsible for calling (because C), which is a job for the Drop trait and a wrapper trait more than for a Dealloc trait.

@SimonSapin I don't think that works. Consider:

let b: Box<CVal, Dealloc> = c_api_call();
let cval: CVal = *b; // moves CVal into the stack, calls `Dealloc::dealloc` to free the memory
let _ = cval; // drops CVal (might do nothing, might do something)

Here, Dealloc::dealloc might call c_free_cval_memory(CVal*), and <CVal as Drop>::drop() might, e.g., do nothing (or call a different c_drop_cval() function).

Without Dealloc, I would somehow need to override the impl of Drop for Box<CVal> to be able to solve this problem with just Drop. I don't think this can be done, even with specialization, since that would need to expose the internals of Box.
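The separation between freeing the memory and dropping the value can be sketched as follows. This is an illustration under assumptions: `MyBox`, `CountingFree`, `counts` and the simulated `c_api_call` are hypothetical, and since a library type cannot be moved out of with `*b`, an `into_inner` method stands in for that move.

```rust
use std::alloc::{self, Layout};
use std::mem::ManuallyDrop;
use std::ptr::{self, NonNull};
use std::sync::atomic::{AtomicUsize, Ordering::SeqCst};

static MEMORY_FREES: AtomicUsize = AtomicUsize::new(0);
static VALUE_DROPS: AtomicUsize = AtomicUsize::new(0);

/// Value handed out by the (simulated) C API; its destructor is counted.
struct CVal(u32);

impl Drop for CVal {
    fn drop(&mut self) {
        VALUE_DROPS.fetch_add(1, SeqCst);
    }
}

trait Dealloc {
    /// Safety: `ptr` must be a live allocation matching `layout`.
    unsafe fn dealloc(&self, ptr: NonNull<u8>, layout: Layout);
}

/// Zero-sized deallocator that counts how often memory is released.
struct CountingFree;

impl Dealloc for CountingFree {
    unsafe fn dealloc(&self, ptr: NonNull<u8>, layout: Layout) {
        MEMORY_FREES.fetch_add(1, SeqCst);
        // SAFETY: forwarded from the caller's contract.
        unsafe { alloc::dealloc(ptr.as_ptr(), layout) }
    }
}

struct MyBox<T, D: Dealloc> {
    ptr: NonNull<T>,
    dealloc: D,
}

impl<T, D: Dealloc> MyBox<T, D> {
    /// Moves the value out and frees the memory without running `T`'s
    /// destructor; the returned value is dropped later by its new owner.
    fn into_inner(self) -> T {
        let this = ManuallyDrop::new(self);
        // SAFETY: `this` is never dropped, so each field is read exactly once.
        unsafe {
            let value = this.ptr.as_ptr().read();
            let dealloc = ptr::read(&this.dealloc);
            dealloc.dealloc(this.ptr.cast(), Layout::new::<T>());
            value
        }
    }
}

impl<T, D: Dealloc> Drop for MyBox<T, D> {
    fn drop(&mut self) {
        // SAFETY: the value is live and the allocation matches this layout.
        unsafe {
            ptr::drop_in_place(self.ptr.as_ptr());
            self.dealloc.dealloc(self.ptr.cast(), Layout::new::<T>());
        }
    }
}

/// Simulated C API call that transfers ownership of a heap value.
fn c_api_call() -> MyBox<CVal, CountingFree> {
    let raw = Box::into_raw(Box::new(CVal(5)));
    // SAFETY: `Box::into_raw` never returns null.
    MyBox { ptr: unsafe { NonNull::new_unchecked(raw) }, dealloc: CountingFree }
}

/// Returns (memory frees, value drops) observed so far.
fn counts() -> (usize, usize) {
    (MEMORY_FREES.load(SeqCst), VALUE_DROPS.load(SeqCst))
}
```

After `c_api_call().into_inner()` the free counter has advanced while the drop counter has not; the destructor of `CVal` only runs once the moved-out value itself goes out of scope.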

scottjmaddox commented 5 years ago

> I tend to prioritize allocator users over allocator implementors (traits are implemented once, but used throughout the ecosystem), and I don't see what value this would add for users.

I support prioritizing simplicity for end users, which requires maximizing power of expression for library authors. I look at this as a way to maximize power of expression for library authors. If the library author doesn't want to distinguish between Alloc and Realloc then they can just impl Alloc+Realloc for their handle type and write all bounds as A: Alloc+Realloc.

> When would it be helpful to not have a bound on Realloc, but to have a bound on both Alloc+Dealloc, which would give you realloc for free?

Firstly, Alloc+Dealloc only gives you realloc for free if you are fine with a naive implementation for realloc. I expect that only the most basic allocators will not implement realloc themselves.

Secondly, I don't know if there's an allocator that would benefit from a separate Realloc trait; this is new territory. It's not clear how one would, but we cannot know for sure that none would after just a few minutes of thinking about it.

> The only thing I can imagine would be to, e.g., error at compile-time if some allocator does not implement Realloc, but nothing guarantees you that a Realloc impl won't just do what the default Alloc+Dealloc impl would do, so I don't see any advantage for users of the trait over just having a realloc method in the Alloc trait.

That is a potentially interesting use case, if the allocator author wanted to make it very clear that realloc is not optimized. It's not a great use case though, since it would be kind of annoying as an end user.

> What value does this add to RawVec, the users of RawVec, like String or Vec, and the users of these types?

Again, nothing that I can think of, but that doesn't mean there never will be a benefit for future allocators and/or collections.

> If you pass Vec an A: Alloc + Dealloc, I expect the vector to be able to grow, but it won't in your case because it doesn't implement Realloc.

If the allocator author provides a handle type that is Alloc + Dealloc but not Alloc + Realloc + Dealloc then that means they don't want to allow Realloc with it for some reason. If they did, then they would just have made it Alloc + Realloc + Dealloc.

scottjmaddox commented 5 years ago

To add to my previous comment, it's possible that having a separate Realloc trait will be important for properly designing allocators that have handles with lifetime bounds, e.g. Box<T, A=ArenaAlloc<'a>>. I've only done a little bit of design along these lines and I didn't consider reallocation, so I don't know if it would end up being important or not. We need to look into this deeper.

scottjmaddox commented 5 years ago

> I don't think that works. Consider:
>
> let b: Box<CVal, Dealloc> = c_api_call();
> let cval: CVal = *b; // moves CVal into the stack, calls `Dealloc::dealloc` to free the memory
> let _ = cval; // drops CVal (might do nothing, might do something)

@gnzlbg I think what @SimonSapin was saying is that you could create a new wrapper type that implements drop rather than using Box. This is how FFI wrapper libraries currently work, AFAIK. Having Box<T, A:Dealloc> might make implementing the FFI wrapper a bit easier, though, since you wouldn't need a wrapper type that implements Drop for every C type.

scottjmaddox commented 5 years ago

> I’d like us to take a step back for a moment. As library designers it can be satisfying to make APIs that are as general or flexible as possible, but do we know anyone who actually wants to use this? Or is this all hypothetical? Remember that none of this issue is relevant unless:

I would like to be able to have a separate Dealloc trait (or more accurately a separate BuildDealloc trait) for implementing zero-cost arena allocators.

> * There’s an allocator that requires a non-zero-size handle for allocation
>
> * _And_ that allocator does **not** require a non-zero-size handle for deallocation

Huh? Neither of these is a requirement. Take the example of an arena bump allocator, and let's assume my BuildAlloc/BuildDealloc suggestion is incorporated. The BuildDealloc type would be zero sized and would be a no-op, because deallocation does nothing. Without splitting out BuildDealloc, we would have to use BuildAlloc to retrieve a pointer to the allocator state, and thus we would have to rely on the compiler optimizing away all of that, ultimately dead, code. Now perhaps it can do that optimization without issue, I don't know. But there might be other cases that I'm not thinking of that it cannot easily optimize away.

> * _And_ the user is willing to give up on `clone` and any other API that needs to (re)allocate

Or perhaps the user just wants to have more control over where the value is cloned to, which could be provided by a new clone_in method.

> * _And_ the cost of unnecessarily storing a full handle is significant

Storing a full handle inside every box is almost always going to be prohibitively expensive.

> Secondly, if this is indeed a real use case, how important is it to use std::boxed::Box<T, A> for it? Could it just as well be served by a NoOpDeallocBox<T> type on crates.io?

By this logic, we shouldn't do Box<T, A> at all. But there's value in having a first-party solution. It provides cohesion for the community.

> This thread is quickly getting long, which is a sign that supporting this use case is not easy. But maybe it’s too niche to be worth the design complexity.

I don't think the length of the thread is a good metric for how easy supporting a use case is. Rather, I think it's an indication that there is interest and many possible approaches that require further discussion.

SimonSapin commented 5 years ago

> Huh? Neither of these is a requirement.

I think we’re in agreement on this. I was saying that if all handles are zero-size (e.g. you have a malloc-and-free-style allocator with global state) then this thread is not relevant. If even deallocation requires a non-zero-size handle (e.g. an allocator with multiple instances/arenas/regions that reuses freed space) then this thread is also not relevant.

> Or perhaps the user just wants to have more control over where the value is cloned to, which could be provided by a new clone_in method.

Yes, using clone_in instead could be a reason the user is willing to give up on clone. But that’s not necessarily all potential users of an arena bump allocator.

> Storing a full handle inside every box is almost always going to be prohibitively expensive.

I think this is an exaggeration. Many people use Vec<T> even though it has a 3× larger size_of than https://crates.io/crates/thin-vec. This extra size has a cost, but maybe that cost is not part of the bottleneck.

> By this logic, we shouldn't do Box<T, A> at all.

Maybe! I’ve actually been considering that if we experiment outside of the rust-lang/rust repository, then we could publish that on crates.io, and people could start relying on that crate. At that point, especially if #1 proves problematic and we’d need separate types regardless, maybe a widely-accepted library on crates.io is not a bad end point?

Not everything must be in the standard library.

> I don't think the length of the thread is a good metric

At least more than a yes or no like #8. And any solution would add complexity in type signatures even for people not relying on this feature. I do think we have a complexity budget to spend carefully.

scottjmaddox commented 5 years ago

> I think we’re in agreement on this. I was saying that if all handles are zero-size (e.g. you have a malloc-and-free-style allocator with global state) then this thread is not relevant. If even deallocation requires a non-zero-size handle (e.g. an allocator with multiple instances/arenas/regions that reuses freed space) then this thread is also not relevant.

But I'm saying that you're missing an important use case, if not more than one. I gave the arena allocator example in my last post. Having a separate Dealloc does potentially matter there.

> Yes, using clone_in instead could be a reason the user is willing to give up on clone. But that’s not necessarily all potential users of an arena bump allocator.

No, it's not. But without a separate Dealloc no one can choose. With a separate Dealloc, everyone can choose precisely what features they need. The use cases served by having a separate Dealloc trait are a strict superset of the use cases served without a separate Dealloc trait.

> > Storing a full handle inside every box is almost always going to be prohibitively expensive.
>
> I think this is an exaggeration. Many people use Vec<T> even though it has a 3× larger size_of than https://crates.io/crates/thin-vec. This extra size has a cost, but maybe that cost is not part of the bottleneck.

Okay, let me be more precise: storing a full handle inside every Box is unlikely to be chosen, assuming my BuildAlloc proposal is accepted. The extra overhead is unnecessary.

> > By this logic, we shouldn't do Box<T, A> at all.
>
> Maybe! I’ve actually been considering that if we experiment outside of the rust-lang/rust repository, then we could publish that on crates.io, and people could start relying on that crate. At that point, especially if #1 proves problematic and we’d need separate types regardless, maybe a widely-accepted library on crates.io is not a bad end point?
>
> Not everything must be in the standard library.

I do think experimenting with all of this in separate crates is a good idea. Iteration can happen much faster outside of the std lib. This would also make it much more feasible to directly compare the performance of having a separate Dealloc for arena allocators, for example.

Are there any limitations that currently prevent a full-featured custom Box type? I had played with something like this for a custom arena allocator a bit over a year ago, but ended up dropping it after a couple days.

> > I don't think the length of the thread is a good metric
>
> At least more than a yes or no like #8. And any solution would add complexity in type signatures even for people not relying on this feature. I do think we have a complexity budget to spend carefully.

The added type signature complexity is definitely a concern. If separate and full-featured BoxIn, etc. types can be created in a crates.io crate, then I do think that's a better place to start.

SimonSapin commented 5 years ago

> Are there any limitations that currently prevent a full-featured custom Box type?

Leaving aside features that “merely” require Nightly (e.g. implementing the CoerceUnsized trait), one feature of std::boxed::Box that is built into the language and cannot (today) be replicated by a library is moving a !Copy value out of a box. There’s some desire to eventually have a DerefMove trait, but it doesn’t exist yet.

scottjmaddox commented 5 years ago

> Leaving aside features that “merely” require Nightly (e.g. implementing the CoerceUnsized trait), one feature of std::boxed::Box that is built into the language and cannot (today) be replicated by a library is moving a !Copy value out of a box. There’s some desire to eventually have a DerefMove trait, but it doesn’t exist yet.

That's a pretty big limitation...
