rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.32k stars 12.72k forks source link

Alloc: Add owns method #44302

Closed joshlf closed 5 years ago

joshlf commented 7 years ago

As Andrei Alexandrescu discusses in this great talk, in order to make it possible to construct new allocators by composing existing ones, each allocator must support an ownership check.

Consider, for example, a "fallback allocator" - an allocator which operates by wrapping two other allocators, serving allocation requests by trying to allocate from the first allocator and falling back to the second one if the first request fails. In Rust:

fn alloc(&mut self, layout: Layout) -> Result<*mut u8, AllocErr> {
    if let Ok(ptr) = self.first.alloc(layout.clone()) {
        Ok(ptr)
    } else {
        self.second.alloc(layout)
    }
}

In order to service deallocation requests to this allocator, however, you need to be able to detect which allocator a given pointer came from. If we assume an owns method, this looks like:

fn dealloc(&mut self, ptr: *mut u8, layout: Layout) {
    if self.first.owns(ptr, layout.clone()) {
        self.first.dealloc(ptr, layout);
    } else {
        self.second.dealloc(ptr, layout);
    }
}

As Andrei discusses in that talk, modern allocators are nothing but composition - size classes, different allocation strategies, falling back to mmap for certain allocations, etc. Thus, I think that it's important that we take composability seriously, and ensure that our Alloc trait is composable. This requires adding an owns method. Concretely, I'm suggesting the following signature be added to the Alloc trait:

fn owns(&self, ptr: *mut u8, layout: Layout) -> bool;

cc @Ericson2314 @pnkfelix @alexcrichton

joshlf commented 7 years ago

@ezrosent has pointed out an issue with this: my own mmap-alloc allocator wouldn't be able to implement owns since there's no way to detect who mapped a given page of memory.

Thus, maybe it makes more sense for this to be a separate trait? E.g.:

pub trait OwningAlloc: Alloc {
    fn owns(&self, ptr: *mut u8, layout: Layout) -> bool;
}
sfackler commented 7 years ago

Can this be implemented by jemalloc or libc malloc? It seems like a trait that would be defined by the fallback allocator crate.

joshlf commented 7 years ago

I'm not sure. It can be implemented by elfmalloc, and elfmalloc uses a similar design to jemalloc, so my guess is that the answer is yes, but I'm not sure. I'm also not sure about libc malloc.

joshlf commented 7 years ago

@davidtgoldblatt (forgive me for making you my go-to jemalloc person :) ) do you know if jemalloc would be able to answer the question "was this particular object allocated by jemalloc"?

davidtgoldblatt commented 7 years ago

Haha, NP. Yes, we this is straightforward for us to do. I believe we used to expose this functionality publicly, although not anymore. IIRC it's an API requirement of the OS X zone functionality.

I can't speak for sure about glibc malloc, but they keep allocation metadata inline which I imagine would make doing this much harder for them.

davidtgoldblatt commented 7 years ago

FWIW, I don't think I agree with the conclusion of the talk. Metadata lookup is one of the largest costs for allocators that choose to keep it out-of-line (which has a lot of benefits, though I suppose isn't universal). By splitting that across N allocators, you're potentially multiplying that by a factor of N, whereas a single allocator with N of the features you'd want to wrap up (stats collection, profiling, whatever) can usually still only do that once.

davidtgoldblatt commented 7 years ago

And, for this kind of composition strategy to work in the first place, you more or less have to keep it out of line, unless each allocator wraps the one below it instead of merely being composed with it at some higher level of abstraction. And in that case, you still need the allocators to coordinate on their metadata representation, because otherwise your memory layouts will look like

[metadata for allocator 1] [metadata for allocator 2] ... [metadata for allocator N] [padding for alignment] [object],

turning an O(N) lookup issue into an O(N) extra metadata issue.

joshlf commented 7 years ago

I tend to agree with you that the talk is unrealistic on how expensive the ownership lookup is, and the idea that we need it. On the one hand, I agree with the premise that modern allocators are very much constructed as ensembles at a high level, but on the other hand, in practice, that's not a clean abstraction - the layers are broken all the time for performance reasons (this is the case with elfmalloc, from what I can tell it's the case with jemalloc, and it's been the case with every allocator I've read an academic article about).

Also, the ownership concept alone isn't useful unless a given allocator is completely owned by another allocator. For example, I have an implementation of a slab allocator that, by default, uses the heap to allocate slabs. Even if the heap provided an ownership check, I wouldn't be able to use it to implement an ownership check for the slab allocator itself because "the heap owns this pointer" doesn't necessarily imply "the heap owns this pointer and I was the one who allocated it." For that, you need to be the only one allowed to allocate from the heap (or whatever allocator you use under the hood).

steveklabnik commented 5 years ago

Triage: we have now implemented an allocator API, and this was considered and rejected in #49668. As such, I'm gonna give it a close. Thanks!