rust-lang / wg-allocators

Home of the Allocators working group: Paving a path for a standard set of allocator traits to be used in collections!
http://bit.ly/hello-wg-allocators
203 stars 9 forks source link

Why is allocator type part of container types? #117

Open Ipotrick opened 9 months ago

Ipotrick commented 9 months ago

In rust it seems that types that take custom allocators like box and vec take it as a type parameter. C++ also did this with their custom allocators and this is now seen as a big mistake.

The problem with making the allocator part of the type is that you then can not use custom allocators at all in functions and libraries and existing code that except the default container eith no custom allocator.

Writing code that uses allocators as type parameters in containers is very annoying as it needs to be completely genericified.

Amanieu commented 9 months ago

What other alternative is there? A container needs to know how to deallocate its memory when dropped, which means it needs to know which allocator to return the memory to.

zakarumych commented 9 months ago

The only alternative is polymorphic allocators which is possible by using polymorphic type of allocator for container.

But making it default will cause a huge performance problem.

The issue mostly comes from the fact that Allocator trait is still unstable so most developers ignore that there's parameter to compile on stable. Majority of crates just work that way.

Ipotrick commented 9 months ago

Do polymorphic allocators really pose a huge performance problem? I believe Zig (which also has very high performance standarts from what i gather) exclusively uses polymorphic allocators and i have not yet seen anyone concerned about that. I would guess that most allocation operations far outweigh the cost of the one polymorphic indirection they would incur except for maybe a linear allocator.

I realise that the allocator ptr needs to live somewhere and it will take 8 bytes, i think that would be the greater cost. But this can be fixed by having for example storing the allocator ptr on the heap (and maybe in size if capacity is 0) then mark with one bit in capacity if a custom allocator is present. This way there would be more work done but less memory used.

I guess it would be very interesting to test an implementation that indirects all allocations to a dyn trait to the global allocator for a few projects and test the difference.

In my opinion custom allocators are very second class citizen and wont be used by people that want custom allocators much due to the big inconvenience of the type beeing changed by allocator. The same thing happened in c++.

Ipotrick commented 9 months ago

I guess the question here is if the first class custom allocator support is worth this indirection). In my opinion it very much is from a guess, but i would also like to have performance metrics on this.

Amanieu commented 9 months ago

The easiest way to test this would be to modify Vec in the standard library to use a polymorphic allocator by default. If you create a PR on rust-lang/rust then you can easily run it through the perf benchmarks.

zakarumych commented 9 months ago

Polymorphic allocator removes possibility to optimize out allocations altogether

elichai commented 9 months ago

Polymorphic allocator removes possibility to optimize out allocations altogether

Why? Can't we still mark them as alloc/dealloc calls in LLLVM and then it can optimize them/around them? (does LLVM support marking dynamic calls as alloc/dealloc calls?)

chorman0773 commented 9 months ago

Neither Vec nor Box can store a polymorphic allocator.

Box because Box<T> is promised to be exactly the size of a pointer, and Vec<T> because this round trip is legal:

pub fn launder_vec<T>(x: Vec<T>) -> Vec<T>{
       let mut x = ManuallyDrop::new(x);
       let cap = x.capacity();
       let len = x.len();
       let ptr = x.as_mut_ptr();
       unsafe{Vec::from_raw_parts(ptr,len,cap)}
}
NobodyXu commented 9 months ago

Vec::from_raw_parts is only implemented for Vec<T, Global> and AFAIK the "promise" for Box<T> to have the same size as pointer is only for Box<T, Global>, otherwise it cannot support allocator at all since allocator isn't necessary zero-size.

chorman0773 commented 9 months ago

Yes - hense why those types cannot store a polymorphic allocator by default. Even if "Polymorphic Allocator" was just a specialization for Vec<T,Box<dyn Allocator>>, this can't be the default, as that code is stable for Box<T> and Vec<T>.

Skgland commented 9 months ago

Also, how would that work for Box<T, Box<dyn Allocator,?>>, which Allocator would (de-)allocate the Box containing the Allocator?

burdges commented 9 months ago

I doubt Box<dyn Allocator,?> makes sense, and indeed a boxed Allocator does not satisfy Allocator, but &dyn Allocator makes sense.

We're presumably discussing a type for Allocator ZSTs like

pub struct DynAllocator(usize);

impl DynAllocator {
    pub fn new<A: impl Allocator>() -> DynAllocator {
        assert!(size_of::<A>() == 0, "Use dyn Allocator if you need non ZST dynamic allocators");
        ...
    }
}

impl Deref for DynAllocator {
    type Target = dyn Allocator;
    ...
}

impl Allocator for DynAllocator { ... }

Box<T,DynAllocator> is only 2 usizes, while Box<T,&dyn Allocator> is 3 usizes.

In practice, you maybe want dynamic allocators to have type level ZSTs, so Box<T,A> is 1 usize, into which you register a &'static dyn Allocator once per program run. It'd still break some optimizations though.

NobodyXu commented 9 months ago

Yes - hense why those types cannot store a polymorphic allocator by default. Even if "Polymorphic Allocator" was just a specialization for Vec<T,Box<dyn Allocator>>, this can't be the default, as that code is stable for Box<T> and Vec<T>.

Oh I see, you were talking about removing the Alloc generic parameter, sorry I didn't realise that.

CAD97 commented 9 months ago

Note that (for a per allocation overhead) the global allocator can be made polymorphic. When creating a new allocation, instead overallocate additional space for a pointer (and pad to alignment).

The cost of the extra indirection is likely dominated by the cost of the actual allocation work when using static general purpose allocators, which are typically already behind an optimization barrier. But the performance cliff typically isn't the extra work, it's causing "nicely" sized/dense allocations to jump up a bucket size. Allocators are optimized to serve common allocation sizes better, and code is tuned to use allocations of fast sizes (e.g. fitting cache lines). Artificially inflating allocation sizes does a huge number

Polymorphic allocation imo makes a bit more sense as a default in C++ since many/most allocating (single ownership single object) templates tend to use C++ copy/move semantics to store small objects in-line without a dynamic allocation (thus somewhat mitigating the outsized cost to tiny allocations of pmr techniques). Template metaprogramming also can potentially do wacky magic to share an allocator handle between objects rather than needing to store it individually on every object that might do a dealloc.

lolbinarycat commented 2 months ago

Do polymorphic allocators really pose a huge performance problem? I believe Zig (which also has very high performance standarts from what i gather) exclusively uses polymorphic allocators and i have not yet seen anyone concerned about that. I would guess that most allocation operations far outweigh the cost of the one polymorphic indirection they would incur except for maybe a linear allocator.

zig can do that because it isn't memory safe like rust, and it also doesn't have smart pointers. the programmer is in charge of correctly freeing allocated memory, not the compiler.

yanchith commented 1 month ago

To the OP's point:

Writing code that uses allocators as type parameters in containers is very annoying as it needs to be completely genericified.

I've been using allocators intensively since around 2021, and this has indeed been quite unpleasant. The pain is even larger once you start storing allocator-generic collections in structs, or have your functions take multiple allocator type parameters.

Over time, I managed to find a middle ground that's doesn't seem that bad to me:

In library code, e.g. when writing new collections or something that really needs to be generic over allocators, I use A (and my soul dies a little each time I write A: Allocator + Clone). Fortunately, there is not that much library code and the in-house libraries are on the small side.

In application code, I find I typically use just two types of allocators. When I care about how memory looks, I use a set of arena allocators (backed by auto-growing virtual memory), passed down as a first parameter to each function, without generics. When I don't care about memory, I use Global. Typically, a piece of application-level code either cares or doesn't, but not both. So in application code using arenas, I have this:

fn make_teleport_particles(particles: &mut Particles<&Arena>, a: Vec3, b: Vec3) -> Id;

... instead of this ...

fn make_teleport_particles<A: Allocator + Clone>(particles: &mut Particles<A>, a: Vec3, b: Vec3) -> Id;

Still not great, hurts my eyes less.