rust-lang / wg-allocators

Home of the Allocators working group: Paving a path for a standard set of allocator traits to be used in collections!
http://bit.ly/hello-wg-allocators
209 stars 9 forks source link

Separate `dealloc` from `Alloc` into other trait #9

Closed TimDiekmann closed 4 years ago

TimDiekmann commented 5 years ago

Most (all?) of the structs mentioned in #7 only needs the dealloc method for Drop. It'd may be useful to split up Alloc into two traits. We didn't came up however with the exact layout and relationship with those two traits. So far, those possibilities showed up:

Edits

TimDiekmann commented 4 years ago
trait Alloc {
    type Realloc: Realloc;
    type Dealloc: Dealloc;
    ...
}
trait Realloc {
    type Dealloc: Dealloc;
    ...
}

@scottjmaddox I don't really see the point of splitting Alloc and Realloc when Alloc requires a Realloc. Wouldn't the other way makes more sense such Realloc has an associated Alloc? Or even leave out the association at all?

TimDiekmann commented 4 years ago

I updated the OP. If any proposal is missing, please @TimDiekmann me :slightly_smiling_face:

scottjmaddox commented 4 years ago

@scottjmaddox I don't really see the point of splitting Alloc and Realloc when Alloc requires a Realloc. Wouldn't the other way makes more sense such Realloc has an associated Alloc? Or even leave out the association at all?

You might be right. I suppose having the realloc method defined in Alloc is the same thing. And then it can have a default implementation of just alloc new, move, dealloc old. No need for trait specialization.

TimDiekmann commented 4 years ago

I'm currently experimenting with this: https://github.com/TimDiekmann/alloc-wg

So far I'm using three traits together, each associated with an own BuildAlloc. Currently I only tested Box, which don't need realloc at all. I think when we can decide if it is useful, when implemented RawVec.

TimDiekmann commented 4 years ago

I don't think splitting Realloc makes much sense here. For RawVec::reserve_internal the bounds A: Alloc + Realloc + Dealloc would be needed:

Maybe it would make sense to introduce this hierarchy:

trait Dealloc {}
trait Alloc: Dealloc {}
trait Realloc: Alloc {}
TimDiekmann commented 4 years ago

I actually found a use case for splitting Realloc: An allocator without Realloc cannot move memory around until deallocating.

scottjmaddox commented 4 years ago

Interesting point. Since we already have, Pin, though, are there any cases where you might want that? Perhaps for FFI interactions?

TimDiekmann commented 4 years ago

I'll try to summarize this issue.

We propose splitting up dealloc and realloc from AllocRef. Splitting the trait allows users to specify increasing constraints depending on the use case. For example it's possible to use FFI allocated memory in an allocator by only implementing DeallocRef. Leaving out implementing ReallocRef will ensure, that the returned pointer of alloc will be valid for the provided layout until dealloc was called; the memory will never move.

In the alloc-wg crate I'm using this (adopted) design:

trait DeallocRef {
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout);
}

trait AllocRef: DeallocRef {
    fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr>;

    fn alloc_zeroed(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        // fallback to `alloc` and `write_bytes`
    }

    // ... other methods neither listed in `DeallocRef` nor `ReallocRef`
}

trait ReallocRef: AllocRef {
    fn realloc(
        &mut self,
        ptr: NonNull<u8>,
        old_layout: Layout,
        new_size: usize,
    ) -> Result<NonNull<u8>, AllocErr> {
        // fallback to `alloc`, `ptr::copy_nonoverlapping`, and `dealloc`
    }
}

Further I'll discuss the reason, why I have not used associated types on AllocRef and ReallocRef instead.

The implementation for this would probably look like this:

trait DeallocRef {}

trait AllocRef {
    type DeallocRef: DeallocRef;
}

trait ReallocRef {
    type AllocRef: AllocRef;
}

The main advantage on the associated type approach is, that you could use the same DeallocRef for different AllocRefs. Also it's possible to use the same AllocRef for different ReallocRefs.

A typical call to reserve on a collection has at least one branch:

if self.is_empty() {
    alloc()
} else {
    realloc()
}

This means, that the collection needs access to potential two different allocators at once in one function. There are three ways I came up with:

  1. Storing both allocators alongside the deallocator in the struct. This also introduces more generic parameters
  2. Require the two allocator to be the same struct. As this is the same as the first solution, this solves nothing.
  3. Usage of trait BuildDealloc {}, trait BuildAlloc: BuildDealloc {}, and trait BuildRealloc: BuildAlloc {} similar to #12. The collection would store one builder and the generic parameter would be the builders type. While every builder can build a DeallocRef, some builders may also build an AllocRef or ReallocRef. A huge downside of this is type inference: https://github.com/TimDiekmann/alloc-wg/issues/5.

So the only viable solution would be 1., but I don't think it's worth using three parameters for allocating.

Another (IMO) minor downside of using this approach is the lack of a default implementation for realloc.

Hybrid approaches of both worlds are also possible, but the same downsides applies to them as well.

Amanieu commented 4 years ago

I actually found a use case for splitting Realloc: An allocator without Realloc cannot move memory around until deallocating.

I don't understand your point here. Since realloc can simply be implemented as alloc+copy+dealloc, it is irrelevant whether the allocator specifically supports it or not.

I think the AllocRef/DeallocRef split is already introducing quite a lot of complexity, adding a ReallocRef to this feels unnecessary and counterproductive.

TimDiekmann commented 4 years ago

I think you are right, that there a no real use cases for ReallocRef. I think the only minimal advantage would be a possible specialization, in case ReallocRef not implemented, but that doesn't outweigh by far the advantages of a third trait.

scottjmaddox commented 4 years ago

Can someone summarize the value of a separate Dealloc trait? The only advantage I can remember is that it would be easier to optimize away access when Dealloc is a no-op, e.g. in a bump allocator. Are there others? If not, we should probably just see if the compiler has trouble eliminating the dead code in this case. If it doesn't, there's (probably?) no need for this.

TimDiekmann commented 4 years ago

You can use memory, which you don't know how to allocate. For example you may use a pointer from FFI and want to deallocate it in Rust.

scottjmaddox commented 4 years ago

In that case, would it make more sense to just implement Drop on a wrapper type? I think that's standard practice for FFI right now.

TimDiekmann commented 4 years ago

If you get a pointer to an array, you could just use it in a Vec with all it's features.

Amanieu commented 4 years ago

Basically the idea is that Dealloc doesn't need to contain a pointer to the allocator itself, it can "derive" that pointer from the pointer passed in to the dealloc call.

The intended use case is for Box to only require Dealloc for dropping, which would avoid needing to double the size of Box.

scottjmaddox commented 4 years ago

In the case of Vec, am I correct in thinking that functionality would be limited, or at least different for the Dealloc-only types? For example, push would not be allowed, since it would fail when at capacity.

Lokathor commented 4 years ago

wrapping an array pointer as a vec that knows how to dealloc but not alloc sounds like a bad time.

Similarly, boxes that do nothing on drop sounds very niche. Even a frame allocator should be ref counting the allocations made and dropped or you can't safely reset the allocator.

I'm not saying there's no possible case for this, but those two cases don't feel like they hold up

scottjmaddox commented 4 years ago

Similarly, boxes that do nothing on drop sounds very niche.

Bump allocators aren't that niche. They're used quite frequently when allocation is a bottleneck.

Even a frame allocator should be ref counting the allocations made and dropped or you can't safely reset the allocator.

Not if you leverage lifetimes; you can have the compiler prove that the allocator is safe to reset.

TimDiekmann commented 4 years ago

Many FFI APIs returns a pointer to an array. What's wrong with just putting this in a vec-like struct and use it? The collection will handle the deallocation later. The same applies to boxes.

Amanieu commented 4 years ago

Many FFI APIs returns a pointer to an array. What's wrong with just putting this in a vec-like struct and use it? The collection will handle the deallocation later.

I would expect such pointers to go into a boxed slice, not a vec. Basically, I see Box as the only potential use case for a separate Dealloc trait. Any complex collection will want to use the full Alloc trait.

Lokathor commented 4 years ago

A pointer into a foreign-allocated array needs to be a slice on the rust side of things. Or something like a "slice vec" type. Or other abstraction that is designed around being aware it can't realloc the memory. A normal Vec is not that type.

TimDiekmann commented 4 years ago

I propose to split the AllocRef trait like the first proposal in https://github.com/rust-lang/wg-allocators/issues/9#issuecomment-578498282.

DeallocRef is not dependent on AllocRef. With this design, it's possible to have the concept of a deleter like in C++s' std::shared_ptr This can be especially useful in FFI application, where Rust receives an allocated memory block and a dealloc function.

We could stay conservative, and only split dealloc, but as the changes will be nightly only for now, we could also split realloc and see, how people react.

The next step is review by the rest of the tagged wg members:

split dealloc split realloc postpone close
@Amanieu :confused:
@Ericson2314
@glandium
@gnzlbg
@Lokathor :-1:
@scottjmaddox :+1:
@TimDiekmann :+1:
@Wodann :+1: :heart:

Please vote with

JelteF commented 4 years ago

I think based on these comments splitting dealloc indeed makes sense for FFI, since there you can free but not allocate. Example: https://github.com/facebook/rocksdb/blob/master/include/rocksdb/c.h#L1747

Splitting realloc I aggree with @Amanieu, there's no need for it. If you don't have a realloc function in your allocater, the default implementation of malloc + memcpy should work fine.

Regarding:

A pointer into a foreign-allocated array needs to be a slice on the rust side of things. Or something like a "slice vec" type. Or other abstraction that is designed around being aware it can't realloc the memory. A normal Vec is not that type.

You can use a Box<[T]> (boxed slice) for that. Vec requires an append implementation, which I don't see a way of implementing without alloc/realloc.

TimDiekmann commented 4 years ago

If we decide to introduce either of the traits, this would be done in three steps:

  1. Provide an empty trait for DeallocRef (and ReallocRef)
  2. Update miri and the nomicon to import both traits
  3. Actually split AllocRef
Wodann commented 4 years ago

If we think that the ReallocRef change is going to be controversial, it might be a good idea to split it into two separate PRs; one for splitting the DeallocRef trait and one for splitting the ReallocRef trait. That way if one gets rejected/reverted, the rest of the work won't be affected.

TimDiekmann commented 4 years ago

I'm fine with that.

Does anyone want to write the documentation for DeallocRef and adjust AllocRef a bit? Update: @Wodann will do this

Wodann commented 4 years ago

Do you still want to create the actual PR to rustc? It might be good if it is the same person making PRs (for recognisability). In that case someone'd need to push/pr to your fork.

Amanieu commented 4 years ago

I am strongly opposed to having a separate ReallocRef trait. There is no good reason for splitting it away from AllocRef when a perfectly good default implementation needs only alloc and dealloc.

Regarding DeallocRef I am concerned that the use case for it is somewhat niche: it only really helps Box, which isn't really used that much with custom allocators (as far as I know). The use case I am currently looking at is a compiler where I use a bumpalo instance per function, so that I can quickly allocate memory for internal data structures while compiling (mainly Vec and HashMap) and discard them all once I am done compiling a function.

Lokathor commented 4 years ago

Just leak the box and keep the &mut around however long you would have kept the box around

TimDiekmann commented 4 years ago

@Wodann

Do you still want to create the actual PR to rustc? It might be good if it is the same person making PRs (for recognisability).

Yes, I think that makes sense.

In that case someone'd need to push/pr to your fork.

One could also make a PR to alloc-wg, or post the docs here, in zulip, or send it to me via EMail or PM in Discord :slightly_smiling_face:

@Amanieu @Lokathor Could you also give a vote?

Lokathor commented 4 years ago

i vote this idea is just way too niche

scottjmaddox commented 4 years ago

Regarding DeallocRef I am concerned that the use case for it is somewhat niche: it only really helps Box, which isn't really used that much with custom allocators (as far as I know).

That's one of the things this working group is trying to fix, though. It would be much better if Rust std lib collections had first-class support for custom allocators.

TimDiekmann commented 4 years ago

Can I assume that the majority voted for splitting dealloc (not realloc!) and I can go ahead? While @JelteF and @stevenlr are not tagged, I don't want to ignore those votes as we don't have a fixed membership.

stevenlr commented 4 years ago

I originally voted to also split realloc for consistency but @Amanieu's arguments had me change my mind. Updated my vote above, thanks for taking it into account. :)

TimDiekmann commented 4 years ago

Whops, wrong Issue... Sorry for the noise

TimDiekmann commented 4 years ago

Will close this as we probably won't implement this. If we will do, we can still reopen it (again).