rust-lang / wg-allocators

Home of the Allocators working group: Paving a path for a standard set of allocator traits to be used in collections!
http://bit.ly/hello-wg-allocators

Add optional typed variants of allocation and deallocation routines #89

Open matklad opened 3 years ago

matklad commented 3 years ago

In rust-analyzer, one of the biggest issues for us is understanding where the memory goes: which types are responsible for which fraction of the heap at any given moment. My understanding is that this is an unsolvable problem at the moment. Heap parsing requires pervasive use of something like servo's HeapSizeOf, and allocator instrumentation answers "where do allocations happen" rather than "which objects are heavy".

I think the situation can be markedly improved if during allocation we supplied type information about what object will be the owner of the memory. So, something like this:

pub unsafe trait GlobalAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8;

    #[inline(always)]
    unsafe fn alloc_ty<T>(&self, layout: Layout) -> *mut u8 { self.alloc(layout) }
}

The implementation of something like Vec<u64> would then use alloc_ty::<Vec<u64>> rather than a plain alloc.

That would allow application writers to implement tracking allocators:

impl<A: GlobalAlloc> GlobalAlloc for TrackingAllocator<A> {
    unsafe fn alloc_ty<T>(&self, layout: Layout) -> *mut u8 { 
        let key = std::any::type_name::<T>();
        self.record_allocation(key, layout.size());
        self.a.alloc(layout) 
    }
}
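As a self-contained illustration of the idea, here is a minimal sketch that compiles on stable Rust. It does not touch the real `GlobalAlloc`; instead it assumes a hypothetical local trait `TypedAlloc` with the proposed `alloc_ty` default method, plus a hypothetical `Sys` wrapper and a `bytes_for` helper that are not part of the proposal. The counter is cumulative (bytes ever requested per type), not live bytes.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::any::type_name;
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for the proposed extension; not part of std.
pub unsafe trait TypedAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8;
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout);

    /// Default: ignore the type parameter and forward to the untyped routine.
    #[inline(always)]
    unsafe fn alloc_ty<T>(&self, layout: Layout) -> *mut u8 {
        unsafe { self.alloc(layout) }
    }
}

/// Thin wrapper over the system allocator that keeps the default `alloc_ty`.
pub struct Sys;

unsafe impl TypedAlloc for Sys {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        unsafe { System.alloc(layout) }
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

/// Records cumulative requested bytes per owning type name.
pub struct TrackingAllocator<A> {
    inner: A,
    bytes_by_type: Mutex<HashMap<&'static str, usize>>,
}

impl<A> TrackingAllocator<A> {
    pub fn new(inner: A) -> Self {
        TrackingAllocator { inner, bytes_by_type: Mutex::new(HashMap::new()) }
    }

    /// Bytes requested so far on behalf of `T` (0 if none were recorded).
    pub fn bytes_for<T>(&self) -> usize {
        let map = self.bytes_by_type.lock().unwrap();
        map.get(type_name::<T>()).copied().unwrap_or(0)
    }
}

unsafe impl<A: TypedAlloc> TypedAlloc for TrackingAllocator<A> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        unsafe { self.inner.alloc(layout) }
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { self.inner.dealloc(ptr, layout) }
    }
    /// Override: attribute the request to `T`, then delegate.
    unsafe fn alloc_ty<T>(&self, layout: Layout) -> *mut u8 {
        *self
            .bytes_by_type
            .lock()
            .unwrap()
            .entry(type_name::<T>())
            .or_insert(0) += layout.size();
        unsafe { self.inner.alloc(layout) }
    }
}
```

A container would call `alloc_ty::<Self>(layout)` where it calls `alloc(layout)` today. Because `alloc_ty` is generic, every distinct `T` produces its own copy of the method, which is where the code-size concern comes from. The bookkeeping map itself allocates, which is harmless here only because this sketch is not installed as the global allocator.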

I am not sure if the solution I am proposing here works at all or whether it's worth it (the naive implementation would lead to monomorphization bloat).

I do think that the problem is worth solving though!

Amanieu commented 3 years ago

This doesn't work for the global allocator since the function calls are resolved at link time: we can't monomorphize alloc_ty in an upstream crate when #[global_allocator] is defined in a downstream crate.

Something you could do instead is use a wrapper Allocator with a container:

struct TrackedAlloc<A: Allocator> {
    key: TypeId,
    alloc: A,
}

unsafe impl<A: Allocator> Allocator for TrackedAlloc<A> {
    fn alloc(&self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        // Record the request under this container's key, then delegate.
        record_allocation(self.key, layout);
        self.alloc.alloc(layout)
    }
}

matklad commented 3 years ago

Hm, that indeed doesn't seem to work for the global allocator in the current setup... That's a shame: adding a custom allocator to every type in a large project is a huge undertaking, more or less comparable to tagging everything with #[derive(HeapSizeOf)]. That's not really comparable to just setting the global allocator in one place and getting a useful histogram after a program run.
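For reference, "setting the global allocator in one place" looks roughly like the sketch below on stable Rust today. `CountingAlloc`, `TOTAL_BYTES`, and `GLOBAL` are hypothetical names chosen for this example. The point is that the hook only ever sees a `Layout` (size and alignment), so it can count bytes but cannot say which type owns them.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical counting wrapper around the system allocator.
pub struct CountingAlloc;

/// Total bytes ever requested. Monotonic: never decremented on dealloc.
pub static TOTAL_BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Only size and alignment are visible here; the owning type is not.
        TOTAL_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// The single place where the allocator is installed for the whole program.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;
```

Reading `TOTAL_BYTES` before and after a piece of code gives the bytes requested in between, which is the "where allocations happen" view rather than the "which objects are heavy" view.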

I wonder if there's some kind of simple key/tag we can use here, like TypeId, but which works for every type.

My current understanding is that we sort-of can design something like this, but then the design space would be too big to practically put in std...

matklad commented 3 years ago

Figured out a workaround -- the current TypeId can be tracked by the user at runtime: https://github.com/rust-analyzer/rust-analyzer/issues/9309

Roughly, the API could look like this:

let xs: Vec<Item> = {
    let _a = AllocTag(typed_id::<Vec<Item>>).push();
    iter().map(...).filter(...)
        .collect();
}; // all allocations which are live at this point will be tagged with `Vec<Item>` type id.
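One possible shape for the tag bookkeeping behind such an API is a per-thread stack with an RAII guard. This is only a sketch under assumptions: `AllocTag::of`, `AllocTagGuard`, and `current_tag` are hypothetical names, and the part where a tracking allocator reads the current tag is left out.

```rust
use std::any::TypeId;
use std::cell::RefCell;
use std::marker::PhantomData;

thread_local! {
    /// Per-thread stack of active tags; the innermost tag is last.
    /// Note: this `Vec` allocates, so an allocator that consults it would
    /// need a non-allocating stack to avoid re-entrancy. Ignored here.
    static TAGS: RefCell<Vec<TypeId>> = RefCell::new(Vec::new());
}

/// Hypothetical tag naming the type that allocations are attributed to.
pub struct AllocTag(pub TypeId);

/// Guard returned by `push`; the tag is popped when the guard is dropped.
/// The raw-pointer marker keeps the guard on the thread that created it.
pub struct AllocTagGuard(PhantomData<*const ()>);

impl AllocTag {
    /// `TypeId::of` requires `T: 'static`, so borrowed types cannot be tagged.
    pub fn of<T: 'static>() -> Self {
        AllocTag(TypeId::of::<T>())
    }

    /// Make this tag the current one until the returned guard is dropped.
    pub fn push(self) -> AllocTagGuard {
        TAGS.with(|tags| tags.borrow_mut().push(self.0));
        AllocTagGuard(PhantomData)
    }
}

impl Drop for AllocTagGuard {
    fn drop(&mut self) {
        TAGS.with(|tags| {
            tags.borrow_mut().pop();
        });
    }
}

/// The innermost active tag on this thread, if any.
pub fn current_tag() -> Option<TypeId> {
    TAGS.with(|tags| tags.borrow().last().copied())
}
```

Guards nest, so an inner scope can temporarily attribute allocations to a different type and the outer tag is restored when the inner guard goes out of scope. The `'static` bound on `TypeId::of` is the limitation mentioned earlier in the thread: this kind of key does not work for every type.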