Pre-proposal: Type-aware allocators

Avi-D-coder commented 4 years ago

Currently this use case is supported for the Alloc trait, but not GlobalAlloc.

As discussed in #18 generic methods (I.e{alloc,dealloc}_one) cannot be used to support this use case in GlobalAlloc and maybe removed from Alloc. An alternative that is consistent across both methods is needed.

Type information is necessary for ECS allocators, typed Pool/Slab allocators, many GC, STM optimizations, and describable for heap profilers, and exotic heuristic guided allocators as a significant source of free information.

Today there are many slab and generational indexed allocators. They are incompatible with box, std, and each other. They all provide some version of the following: allocation and deallocation of T: Sized, indexing/dereferencing and usually iterating over Ts. By passing unsized allocations to an underlying or global allocator. They could easily implement the Alloc trait to take advantage of standard collections and boxes, but to do this they require knowing more than just Layout, they require TypeId.
I am working on a allocator for use a with both Alloc and GlobalAlloc that requires typed pools of Sized T to support pause free optional GC. Without a method of differentiating allocations and deallocations that share the same Layout, but different types my use case is impossible.

In order to support all the mentioned use cases TypeId needs to always be provided to allocations and deallocations of Sized Types and N Sized Types . Layout looks like the best place to ensure TypeId is always passed.

Layout would have type_id_ added and TypeId implementation would be changed to NonZeroU64. type_id_ would not be considered by Eq, so as not to break compatibility.

pub struct Layout {
    size_: usize,
    align_: NonZeroUsize,
    type_id_: Option<TypeId>
}

Drawbacks

Adding another field to Layout could have a tiny performance cost.

alternatives

Originally I had wanted a separate set of functions for this use case analogous to Alloc::{alloc,dealloc}_array and Alloc::{alloc,dealloc}_one (like below), but this allows for human error (E.g calling the normal unsized set of functions when you have a sized type).

alloc_typed(&mut self, layout: Layout, _: TypeId) -> Result<NonNull<u8>, AllocErr> {
  alloc(self, layout)
}

RustyYato commented 4 years ago

I think it would be better to add a new trait for type aware allocators, such as

trait TypedAlloc<T> { 
   fn alloc_array(n: usize) -> Result<NonNull<[T]>, AllocErr>;
   unsafe fn dealloc_array(ptr: NonNull<[T]>);

   fn alloc_one() -> Result<NonNull<T>, AllocErr>;
   unsafe fn dealloc_one(ptr: NonNull<T>);

    // maybe more ...
}

Avi-D-coder commented 4 years ago

@KrishnaSannasi That's the API we have today in the Alloc trait see https://github.com/rust-lang/wg-allocators/issues/18#issuecomment-530322973 for why generic can't be in GlobalAlloc(tldr linking).

Also making a separate trait without the ability to handle multiple or unsized types mean that standard collections could never use such a trait.

RustyYato commented 4 years ago

@Avi-D-coder

That's the API we have today in the Alloc trait see #18 (comment) for why generic can't be in GlobalAlloc(tldr linking).

Ok, but I don't think we can make any typed allocator a global allocator. So this shouldn't matter, right?

Also making a separate trait without the ability to handle multiple or unsized types mean that standard collections could never use such a trait.

I was proposing a simple solution, we could change it or make it more sophisticated to better handle the standard collections' use case. Now that I've thought about it a bit more, we could change TypedAlloc to

trait TypedAlloc<T: ?Sized> {
    type Meta;

   fn alloc(meta: Self::Meta) -> Result<NonNull<T>, AllocErr>;
   unsafe fn dealloc(ptr: NonNull<T>);
    // maybe more ...
}

Where we could use it like this, playground. I think this should support the use cases for standard library types well enough.

Avi-D-coder commented 4 years ago

@KrishnaSannasi My proposal passes type info to Global Allocators and Alloc by replacing use cases of alloc_one<T> and alloc_array with a field in Layout type_id_: Option<TypeId>.

Even if we were to exclude the GlobalAlloc use case in order to allow generics. Your proposal does not allow a collection to take as a type parameter a single allocator, since a collection may need to allocate multiple types. One of the design requirements on Allocators is they must handle multiple types. In other words there is no possible design where a trait takes a type parameter or an associated type indicating what it allocates.

RustyYato commented 4 years ago

@Avi-D-coder Oh, I misunderstood. Thanks for clarifying.

gnzlbg commented 4 years ago

FWIW, both solutions aren't really incompatible with each other (we could have both). While currently GlobalAlloc cannot use generic methods, a TypedAlloc trait could offer type-generic methods.

That is, a GlobalAlloc would need to have run-time logic for handling the TypeIds, but a TypedAlloc could potentially use static dispatch depending on the type to do different things for different types., e.g., using specialization.

Avi-D-coder commented 4 years ago

@gnzlbg your right they are compatible, but this proposal is specifically about always passing type information from boxing one or N of a T, not type-specific allocators. Once specialization lands TypedAlloc: Alloc would be interesting, but it's orthogonal to my use case.

ObjectAlloc is very similar to TypedAlloc.

comex commented 4 years ago

Note that another recent RFC is proposing to allow transmutes between Vec<A> and Vec<B> as long as A and B have the same size and alignment. This would allow a Vec's contents to be deallocated with a different TypeId than the one it was allocated with.

gnzlbg commented 4 years ago

@comex That would forbid Vec from using the type aware allocator API, but I don't see a reasonable Vec implementation that would use such an API.

gnzlbg commented 4 years ago

Adding another field to Layout could have a tiny performance cost.

Currently Layout is two usizes. Most ABIs allow passing and returning two usizess in registers, which means that returning Layout from a function is as efficient as just returning an usize.

This proposal adds another two fields to Layout (the enum discriminant and the u64 TypeId), changing how Layout is passed on most ABIs, from registers, to requiring that it must be passed via memory.

I'm not sure of what the performance implications of that are, but would rather not do it I think.

Avi-D-coder commented 4 years ago

@gnzlbg The proposal includes changing TypeId from u64 to NonZeroU64. Option<std::num::NonZeroU64> is packed into 8 bytes. So this changes requires 8 bytes more be passed as an argument.

Most ABIs allow passing and returning two usizes in registers, which means that returning Layout from a function is as efficient as just returning an usize.

Do any calling convention besides fastcall make this restriction? I believe fastcall is the only one. More modern conventions seem to mostly allow 4+ integer registers.

I would also suspect LTO to remove the unused argument for allocators that don't use TypeId?

This is all likely moot, since https://github.com/rust-lang/rust/issues/26494 implies Layout and all small structs are passed on the stack anyway. In other words, in practice there will be no measurable performance cost. If your still not convinced I could implement the change and benchmark it?

Avi-D-coder commented 4 years ago

@comex If that RFC goes through, type-aware allocators will just have to account for deallocation not having an accurate TypeId. From my perspective it's not a deal breaker, just an interesting edge case.

gnzlbg commented 4 years ago

I would also suspect LTO to remove the unused argument for allocators that don't use TypeId?

No, this isn't possible. The arguments are always passed according to the calling convention, and the global allocator functions cannot, in general, be inlined. You can, e.g., change the allocator at run-time, using LD_PRELOAD.

Do any calling convention besides fastcall make this restriction?

I thought actually that most ABIs make this restriction for return types (not arguments), and we have a couple of APIs that return Layouts IIRC.

gnzlbg commented 4 years ago

This is all likely moot, since rust-lang/rust#26494 implies Layout and all small structs are passed on the stack anyway.

We pass scalars and scalar pairs in registers, and have two separate ABI classes for those in the Rust ABI (Abi::Scalar and Abi::ScalarPair)

Avi-D-coder commented 4 years ago

@gnzlbg That could be a problem. Is there a reason the rust ABI could not have aScalarTriple? Does a calling convention agnostic optimization exist to pass the first two fields of a struct in registers and not the third?

I'll have to benchmark this. Is there a possible way to provide type information to a global allocator, that does not hit this restriction? If not, it's trade off between avoiding passing a struct on the stack or ever having type-aware global allocators and all the optimizations and concurrency advantages they can provide.

gnzlbg commented 4 years ago

I think you are focusing too much on how to implement something, and not on making a case for this feature. To me at least it is very unclear:

What is precisely the problem that this API should solve? (you mention many cases, but do not explain any in detail)
What are the constraints on solutions to that problem? (you mention that this feature enables many optimizations in those cases, but do not explain any of those optimizations in detail)
What would the semantics of potential solutions be? (you only focus on syntax, e.g., how do we pass the TypeId)

I see many fundamental problems with the approach. First:

In the rust abstract machine memory is untyped, so one can allocate memory for an Option<bool>, and transmute that in place to a bool, and free the bool. A lot of correct code does this (which is why @comex probably mentioned the case of transmuting Vec<A> to Vec<B> above).
TypeIds are not unique (e.g. see https://github.com/rust-lang/rust/issues/10389), so you can't rely on them for differentiating allocations of different types. (this also means that the actual layout of TypeId is unspecified and can change).

This means that the use cases that rely on accurate type information cannot really work, and that TypeId can at most be a hint.

So if this is a hint, would such a hint be useful?

No major allocator supports this (malloc, jemalloc, mimalloc, tcmalloc, weealloc, hurd, ...).

Some of the std::collections might be able to use it as a hint, e.g., Vec<T>/Deque<T>, but most other collections use the Layout builder methods to concatenate Layouts. It is unclear to me what would the result of concatenating two layouts with two different Option<TypeId>s be ? Probably None ? Some of the use-cases you mention, e.g., ECS, already use their own data-structures right ?

I can imagine that one can build an allocator that exploits TypeId, and that one could ship collections that are specifically written to use a TypeId-aware allocator for profit, and that one might want a different TypedAlloc trait for those, but making the standard collections exploit this would require changing their implementation, potentially significantly for the node based ones. And even then, most users using widely-used allocators won't benefit from it.

Avi-D-coder commented 4 years ago

I should have real code to publish, in the next couple of weeks. I'll publish a detailed write up with it including API requirements, trade-offs, benchmarks, etc... For my use case transmuting is not an issue. Good to know about TypeId being a hash, it's not ideal but pretty simple to guard against in practice.

glandium commented 4 years ago

What's unclear to me is why that should be provided via GlobalAlloc. That's a very specific use case, and it could be fulfilled by opting in to using Alloc-aware types.

Avi-D-coder commented 4 years ago

I no longer believe passing TypeId is the correct solution. Specialization on generics is too useful a property. I'm going to need to do some more experimenting to find a better solution.

rust-lang / wg-allocators

Pre-proposal: Type-aware allocators #27

Drawbacks

alternatives