Allocator traits and std::heap

nikomatsakis commented 8 years ago

📢 This feature has a dedicated working group, please direct comments and concerns to the working group's repo.

The remainder of this post is no longer an accurate summary of the current state; see that dedicated working group instead.

Old content

Original Post: ----- FCP proposal: https://github.com/rust-lang/rust/issues/32838#issuecomment-336957415 FCP checkboxes: https://github.com/rust-lang/rust/issues/32838#issuecomment-336980230 --- Tracking issue for rust-lang/rfcs#1398 and the `std::heap` module. - [x] land `struct Layout`, `trait Allocator`, and default implementations in `alloc` crate (https://github.com/rust-lang/rust/pull/42313) - [x] decide where parts should live (e.g. default impls has dependency on `alloc` crate, but `Layout`/`Allocator` _could_ be in `libcore`...) (https://github.com/rust-lang/rust/pull/42313) - [ ] fixme from source code: audit default implementations (in `Layout` for overflow errors, (potentially switching to overflowing_add and overflowing_mul as necessary). - [x] decide if `realloc_in_place` should be replaced with `grow_in_place` and `shrink_in_place` ([comment](https://github.com/rust-lang/rust/issues/32838#issuecomment-208141759)) (https://github.com/rust-lang/rust/pull/42313) - [ ] review arguments for/against associated error type (see subthread [here](https://github.com/rust-lang/rfcs/pull/1398#issuecomment-204561446)) - [ ] determine what the requirements are on the alignment provided to `fn dealloc`. (See discussion on [allocator rfc](https://github.com/rust-lang/rfcs/pull/1398#issuecomment-198584430) and [global allocator rfc](https://github.com/rust-lang/rfcs/pull/1974#issuecomment-302789872) and [trait `Alloc` PR](https://github.com/rust-lang/rust/pull/42313#issuecomment-306202489).) * Is it required to deallocate with the exact `align` that you allocate with? [Concerns have been raised](https://github.com/rust-lang/rfcs/pull/1974#issuecomment-302789872) that allocators like jemalloc don't require this, and it's difficult to envision an allocator that does require this. ([more discussion](https://github.com/rust-lang/rfcs/pull/1398#issuecomment-198584430)). @ruuda and @rkruppe look like they've got the most thoughts so far on this. - [ ] should `AllocErr` be `Error` instead? ([comment](https://github.com/rust-lang/rust/pull/42313#discussion_r122580471)) - [x] Is it required to deallocate with the *exact* size that you allocate with? With the `usable_size` business we may wish to allow, for example, that you if you allocate with `(size, align)` you must deallocate with a size somewhere in the range of `size...usable_size(size, align)`. It appears that jemalloc is totally ok with this (doesn't require you to deallocate with a *precise* `size` you allocate with) and this would also allow `Vec` to naturally take advantage of the excess capacity jemalloc gives it when it does an allocation. (although actually doing this is also somewhat orthogonal to this decision, we're just empowering `Vec`). So far @Gankro has most of the thoughts on this. (@alexcrichton believes this was settled in https://github.com/rust-lang/rust/pull/42313 due to the definition of "fits") - [ ] similar to previous question: Is it required to deallocate with the *exact* alignment that you allocated with? (See comment from [5 June 2017](https://github.com/rust-lang/rust/pull/42313#issuecomment-306202489)) - [x] OSX/`alloc_system` is buggy on *huge* alignments (e.g. an align of `1 << 32`) https://github.com/rust-lang/rust/issues/30170 #43217 - [ ] should `Layout` provide a `fn stride(&self)` method? (See also https://github.com/rust-lang/rfcs/issues/1397, https://github.com/rust-lang/rust/issues/17027 ) - [x] `Allocator::owns` as a method? https://github.com/rust-lang/rust/issues/44302 State of `std::heap` after https://github.com/rust-lang/rust/pull/42313: ```rust pub struct Layout { /* ... */ } impl Layout { pub fn new() -> Self; pub fn for_value(t: &T) -> Self; pub fn array(n: usize) -> Option; pub fn from_size_align(size: usize, align: usize) -> Option; pub unsafe fn from_size_align_unchecked(size: usize, align: usize) -> Layout; pub fn size(&self) -> usize; pub fn align(&self) -> usize; pub fn align_to(&self, align: usize) -> Self; pub fn padding_needed_for(&self, align: usize) -> usize; pub fn repeat(&self, n: usize) -> Option<(Self, usize)>; pub fn extend(&self, next: Self) -> Option<(Self, usize)>; pub fn repeat_packed(&self, n: usize) -> Option; pub fn extend_packed(&self, next: Self) -> Option<(Self, usize)>; } pub enum AllocErr { Exhausted { request: Layout }, Unsupported { details: &'static str }, } impl AllocErr { pub fn invalid_input(details: &'static str) -> Self; pub fn is_memory_exhausted(&self) -> bool; pub fn is_request_unsupported(&self) -> bool; pub fn description(&self) -> &str; } pub struct CannotReallocInPlace; pub struct Excess(pub *mut u8, pub usize); pub unsafe trait Alloc { // required unsafe fn alloc(&mut self, layout: Layout) -> Result<*mut u8, AllocErr>; unsafe fn dealloc(&mut self, ptr: *mut u8, layout: Layout); // provided fn oom(&mut self, _: AllocErr) -> !; fn usable_size(&self, layout: &Layout) -> (usize, usize); unsafe fn realloc(&mut self, ptr: *mut u8, layout: Layout, new_layout: Layout) -> Result<*mut u8, AllocErr>; unsafe fn alloc_zeroed(&mut self, layout: Layout) -> Result<*mut u8, AllocErr>; unsafe fn alloc_excess(&mut self, layout: Layout) -> Result; unsafe fn realloc_excess(&mut self, ptr: *mut u8, layout: Layout, new_layout: Layout) -> Result; unsafe fn grow_in_place(&mut self, ptr: *mut u8, layout: Layout, new_layout: Layout) -> Result<(), CannotReallocInPlace>; unsafe fn shrink_in_place(&mut self, ptr: *mut u8, layout: Layout, new_layout: Layout) -> Result<(), CannotReallocInPlace>; // convenience fn alloc_one(&mut self) -> Result, AllocErr> where Self: Sized; unsafe fn dealloc_one(&mut self, ptr: Unique) where Self: Sized; fn alloc_array(&mut self, n: usize) -> Result, AllocErr> where Self: Sized; unsafe fn realloc_array(&mut self, ptr: Unique, n_old: usize, n_new: usize) -> Result, AllocErr> where Self: Sized; unsafe fn dealloc_array(&mut self, ptr: Unique, n: usize) -> Result<(), AllocErr> where Self: Sized; } /// The global default allocator pub struct Heap; impl Alloc for Heap { // ... } impl<'a> Alloc for &'a Heap { // ... } /// The "system" allocator pub struct System; impl Alloc for System { // ... } impl<'a> Alloc for &'a System { // ... } ```

Ericson2314 commented 7 years ago

Is there a tracking issue for converting collections? Now that the PRs are merged, I'd like to

Discuss the associated error type
Discuss converting collections to use any local allocator, (especially leveraging associated error type)

alexcrichton commented 7 years ago

I've opened https://github.com/rust-lang/rust/issues/42774 to track integration of Alloc into std collections. With historical discussion in the libs team that's likely to be on a separate track of stabilization than an initial pass of the std::heap module.

alexcrichton commented 7 years ago

While reviewing allocator-related issues I also came across https://github.com/rust-lang/rust/issues/30170 which @pnkfelix awhile back. It looks like the OSX system allocator is buggy with high alignments and when running that program with jemalloc it's segfaulting during deallocation on Linux at least. Worth considering during stabilization!

pnkfelix commented 7 years ago

I've opened #42794 as a place to discuss the specific question of whether zero-sized allocations need to match their requested alignment.

pnkfelix commented 7 years ago

(oh wait, zero-sized allocations are illegal in user allocators!)

SimonSapin commented 7 years ago

Since the alloc::heap::allocate function and friends are now gone in Nightly, I’ve updated Servo to use this new API. This is part of the diff:

-use alloc::heap;
+use alloc::allocator::{Alloc, Layout};
+use alloc::heap::Heap;

-        let ptr = heap::allocate(req_size as usize, FT_ALIGNMENT) as *mut c_void;
+        let layout = Layout::from_size_align(req_size as usize, FT_ALIGNMENT).unwrap();
+        let ptr = Heap.alloc(layout).unwrap() as *mut c_void;

I feel the ergonomics are not great. We went from importing one item to importing three from two different modules.

Would it make sense to have a convenience method for allocator.alloc(Layout::from_size_align(…))?
Would it make sense to make <Heap as Alloc>::_ methods available as free functions or inherent methods? (To have one fewer item to import, the Alloc trait.)

SimonSapin commented 7 years ago

Alternatively, could the Alloc trait be in the prelude or is it too niche of a use case?

eddyb commented 7 years ago

@SimonSapin IMO there isn't much of a point in optimizing the ergonomics of such a low-level API.

joshlf commented 7 years ago

@SimonSapin

I feel the ergonomics are not great. We went from importing one item to importing three from two different modules.

I had the exact same feeling with my codebase - it's pretty clunky now.

Would it make sense to have a convenience method for allocator.alloc(Layout::from_size_align(…))?

Do you mean in the Alloc trait, or just for Heap? One thing to consider here is that there's now a third error condition: Layout::from_size_align returns an Option, so it could return None in addition to the normal errors you can get when allocating.

Alternatively, could the Alloc trait be in the prelude or is it too niche of a use case?

IMO there isn't much of a point in optimizing the ergonomics of such a low-level API.

I agree that it's probably too low-level to put in the prelude, but I still think that there's value in optimizing the ergonomics (selfishly, at least - that was a really annoying refactor 😝 ).

alexcrichton commented 7 years ago

@SimonSapin did you not handle OOM before? Also in std all three types are available in the std::heap module (they're supposed to be in one module). Also did you not handle the case of overflowing sizes before? Or zero-sized types?

SimonSapin commented 7 years ago

did you not handle OOM before?

When it existed the alloc::heap::allocate function returned a pointer without a Result and did not leave a choice in OOM handling. I think it aborted the process. Now I’ve added .unwrap() to panic the thread.

they're supposed to be in one module

I see now that heap.rs contains pub use allocator::*;. But when I clicked Alloc in the impl listed on the rustdoc page for Heap I was sent to alloc::allocator::Alloc.

As to the rest, I haven’t looked into it. I’m porting to a new compiler a big pile of code that was written years ago. I think these are callbacks for FreeType, a C library.

retep998 commented 7 years ago

When it existed the alloc::heap::allocate function returned a pointer without a Result and did not leave a choice in OOM handling.

It did give you a choice. The pointer it returned could have been a null pointer which would indicate the heap allocator failed to allocate. This is why I'm so glad it switched to Result so people don't forget to handle that case.

SimonSapin commented 7 years ago

Oh well, maybe the FreeType ended up doing a null check, I don’t know. Anyway, yes, returning a Result is good.

pnkfelix commented 7 years ago

Given #30170 and #43097, I am tempted to resolve the OS X issue with ridiculously big alignments by simply specifying that users cannot request alignments >= 1 << 32.

One very easy way to enforce this: Change the Layout interface so that align is denoted by a u32 instead of a usize.

@alexcrichton do you have thoughts on this? Should I just make a PR that does this?

SimonSapin commented 7 years ago

@pnkfelix Layout::from_size_align would still take usize and return an error on u32 overflow, right?

pnkfelix commented 7 years ago

@SimonSapin what reason is there to have it continue taking usize align, if a static precondition is that it is unsafe to pass a value >= 1 << 32 ?

pnkfelix commented 7 years ago

and if the answer is "well some allocators might support an alignment >= 1 << 32", then we're back to the status quo and you can disregard my suggestion. The point of my suggestion is basically a "+1" to comments like this one

SimonSapin commented 7 years ago

Because std::mem::align_of returns usize

pnkfelix commented 7 years ago

@SimonSapin ah, good old stable API's... sigh.

alexcrichton commented 7 years ago

@pnkfelix limiting to 1 << 32 seems reasonable to me!

alexcrichton commented 6 years ago

@rfcbot fcp merge

Ok this trait and its types have bake for awhile now and also been the underlying implementation of the standard collections since its inception. I would propose starting out with a particularly conservative initial offering, namely only stabilizing the following interface:

pub struct Layout { /* ... */ }

extern {
    pub type void;
}

impl Layout {
    pub fn from_size_align(size: usize, align: usize) -> Option<Layout>;
    pub unsafe fn from_size_align_unchecked(size: usize, align: usize) -> Layout;

    pub fn size(&self) -> usize;
    pub fn align(&self) -> usize;
}

pub unsafe trait Alloc {
    unsafe fn alloc(&mut self, layout: Layout) -> *mut void;
    unsafe fn alloc_zeroed(&mut self, layout: Layout) -> *mut void;
    unsafe fn dealloc(&mut self, ptr: *mut void, layout: Layout);

    // all other methods are default and unstable
}

/// The global default allocator
pub struct Heap;

impl Alloc for Heap {
    // ...
}

impl<'a> Alloc for &'a Heap {
    // ...
}

/// The "system" allocator
pub struct System;

impl Alloc for System {
    // ...
}

impl<'a> Alloc for &'a System {
    // ...
}

Original proposal

```rust pub struct Layout { /* ... */ } impl Layout { pub fn from_size_align(size: usize, align: usize) -> Option; pub unsafe fn from_size_align_unchecked(size: usize, align: usize) -> Layout; pub fn size(&self) -> usize; pub fn align(&self) -> usize; } // renamed from AllocErr today pub struct Error { // ... } impl Error { pub fn oom() -> Self; } pub unsafe trait Alloc { unsafe fn alloc(&mut self, layout: Layout) -> Result<*mut u8, Error>; unsafe fn dealloc(&mut self, ptr: *mut u8, layout: Layout); // all other methods are default and unstable } /// The global default allocator pub struct Heap; impl Alloc for Heap { // ... } impl<'a> Alloc for &'a Heap { // ... } /// The "system" allocator pub struct System; impl Alloc for System { // ... } impl<'a> Alloc for &'a System { // ... } ```

Notably:

Only stabilizing alloc, alloc_zeroed, and dealloc methods on the Alloc trait for now. I think this solves the most pressing issue we have today, defining a custom global allocator.
Remove the Error type in favor of just using raw pointers.
Change the u8 type in the interface to void
A stripped down version of the Layout type.

There are still open questions such as what to do with dealloc and alignment (precise alignment? fits? unsure?), but I'm hoping we can resolve them during FCP as it likely won't be an API-breaking change.

joshlf commented 6 years ago

+1 to getting something stabilized!

Renames AllocErr to Error and moves the interface to be a bit more conservative.

Does this eliminate the option for allocators to specify Unsupported? At the risk of harping more on something I've been harping on a lot, I think that #44557 is still an issue.

Layout

It looks like you've removed some of the methods from Layout. Did you mean to have the ones you left out actually removed, or just left as unstable?

SimonSapin commented 6 years ago

impl Error {
    pub fn oom() -> Self;
}

Is this a constructor for what’s today AllocErr::Exhausted? If so, shouldn’t it have a Layout parameter?

rfcbot commented 6 years ago

Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged teams:

[x] @BurntSushi
[x] @Kimundi
[x] @alexcrichton
[x] @aturon
[x] @cramertj
[x] @dtolnay
[x] @eddyb
[x] @nikomatsakis
[x] @nrc
[x] @pnkfelix
[x] @sfackler
[x] @withoutboats

Concerns:

*mut u8 resolved by https://github.com/rust-lang/rust/issues/32838#issuecomment-342622672

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

cramertj commented 6 years ago

I'm really excited about getting to stabilize some of this work!

One question: in the above thread, @joshlf and @Ericson2314 raised an interesting point about the possibility of separating the Alloc and Dealloc traits in order to optimize for cases in which alloc requires some data, but dealloc requires no extra information, so the Dealloc type can be zero-sized.

Was this question ever resolved? What are the disadvantages of separating the two traits?

alexcrichton commented 6 years ago

@joshlf

Does this eliminate the option for allocators to specify Unsupported?

Yes and no, it would mean that such an operation is not supported in stable rust immediately, but we could continue to support it in unstable Rust.

Did you mean to have the ones you left out actually removed, or just left as unstable?

Indeed! Again though I'm just proposing a stable API surface area, we can leave all the other methods as unstable. Over time we can continue to stabilize more of the functionality. I think it's best to start as conservative as we can.

@SimonSapin

Is this a constructor for what’s today AllocErr::Exhausted? If so, shouldn’t it have a Layout parameter?

Aha good point! I sort of wanted to leave the possibility though to make Error a zero-sized type if we really needed it, but we can of course keep the layout-taking methods in unstable and stabilize them if necessary. Or do you think that the layout-preserving Error should be stabilized in the first pass?

@cramertj

I hadn't personally seen such a question/concern yet (I think I missed it!), but I wouldn't personally see it as being worth it. Two traits is twice the boilerplate in general, as now everyone would have to type Alloc + Dealloc in collections for example. I would expect that such a specialized use would not want to inform the interface all other users end up using, personally.

joshlf commented 6 years ago

@cramertj @alexcrichton

I hadn't personally seen such a question/concern yet (I think I missed it!), but I wouldn't personally see it as being worth it.

In general I agree that it's not worth it with one glaring exception: Box. Box<T, A: Alloc> would, given the current definition of Alloc, have to be at least two words large (the pointer that it already has and a reference to an Alloc at the very least) except in the case of global singletons (which can be implemented as ZSTs). A 2x (or more) blowup in the space required to store such a common and fundamental type is concerning to me.

cramertj commented 6 years ago

@alexcrichton

as now everyone would have to type Alloc + Dealloc in collections for example

We could add something like this:

trait Allocator: Alloc + Dealloc {}
impl<T> Allocator for T where T: Alloc + Dealloc {}

SimonSapin commented 6 years ago

A 2x (or more) blowup in the space required to store such a common and fundamental type

Only when you use a custom allocator that is not process-global. std::heap::Heap (the default) is zero-size.

SimonSapin commented 6 years ago

Or do you think that the layout-preserving Error should be stabilized in the first pass?

@alexcrichton I don’t really understand why this proposed first pass is at it is at all. There’s barely more than could already be done by abusing Vec, and not enough for example to use https://crates.io/crates/jemallocator.

What still needs to be resolved to stabilize the whole thing?

joshlf commented 6 years ago

Only when you use a custom allocator that is not process-global. std::heap::Heap (the default) is zero-size.

That seems like the primary use case of having parametric allocators, no? Imagine the following simple definition of a tree:

struct Node<T, A: Alloc> {
    t: T,
    left: Option<Box<Node<T, A>>>,
    right: Option<Box<Node<T, A>>>,
}

A tree constructed from those with a 1-word Alloc would have a ~1.7x blowup in size for the whole data structure compared to a ZST Alloc. That seems pretty bad to me, and these kinds of applications are sort of the whole point of having Alloc be a trait.

glaebhoerl commented 6 years ago

@cramertj

We could add something like this:

We're also going to have actual trait aliases :) https://github.com/rust-lang/rust/issues/41517

cramertj commented 6 years ago

@glaebhoerl Yeah, but stabilization still seems a ways off since there isn't an implementation yet. If we disable manual impls of Allocator I think we can switch to trait-aliases backwards-compatibly when they arrive ;)

alexcrichton commented 6 years ago

@joshlf

A 2x (or more) blowup in the space required to store such a common and fundamental type is concerning to me.

I'd imagine all implementations today are just a zero-size type or a pointer large, right? Isn't the possible optimization that some pointer-size-types can be zero sized? (or something like that?)

@cramertj

We could add something like this:

Indeed! We've then taken one trait to three though. In the past we've never had a great experience with such traits. For example Box<Both> does not cast to Box<OnlyOneTrait>. I'm sure we could wait for language features to smooth all this out, but it seems like those are a long way off, at best.

@SimonSapin

What still needs to be resolved to stabilize the whole thing?

I don't know. I wanted to start with the absolute smallest thing so there would be less debate.

joshlf commented 6 years ago

I'd imagine all implementations today are just a zero-size type or a pointer large, right? Isn't the possible optimization that some pointer-size-types can be zero sized? (or something like that?)

Yeah, the idea is that, given a pointer to an object allocated from your your type of allocator, you can figure out which instance it came from (e.g., using inline metadata). Thus, the only information you need to deallocate is type information, not runtime information.

ruuda commented 6 years ago

To come back to alignment on deallocate, I see two ways forward:

Stabilize as proposed (with alignment on deallocate). Giving away ownership of manually allocated memory would be impossible unless the Layout is included. In particular, it is impossible to build a Vec or Box or String or other std container with a stricter alignment than required (for instance because you don't want the boxed element to straddle a cache line), without deconstructing and deallocating it manually later (which is not always an option). Another example of something that would be impossible, is filling a Vec using simd operations, and then giving it away.
Do not require alignment on deallocate, and remove the small-allocation optimization from Windows’ HeapAlloc-based alloc_system. Always store the alignment. @alexcrichton, as you committed that code, do you remember why it was put there in the first place? Do we have any evidence that it saves a significant amount of memory for real-world applications? (With microbenchmarks it is possible to make the results come out either way depending the allocation size — unless HeapAlloc was rounding up sizes anyway.)

In any case, this is a very difficult trade off to make; the memory and performance impact will depend highly on the type of application, and which one to optimize for is application-specific as well.

joshlf commented 6 years ago

I think we may actually be Just Fine (TM). Quoting the Alloc docs:

Some of the methods require that a layout fit a memory block. What it means for a layout to "fit" a memory block means (or equivalently, for a memory block to "fit" a layout) is that the following two conditions must hold:

The block's starting address must be aligned to layout.align().

The block's size must fall in the range [use_min, use_max], where:

use_min is self.usable_size(layout).0, and

use_max is the capacity that was (or would have been) returned when (if) the block was allocated via a call to alloc_excess or realloc_excess.

Note that:

the size of the layout most recently used to allocate the block is guaranteed to be in the range [use_min, use_max], and

a lower-bound on use_max can be safely approximated by a call to usable_size.

if a layout k fits a memory block (denoted by ptr) currently allocated via an allocator a, then it is legal to use that layout to deallocate it, i.e. a.dealloc(ptr, k);.

Note that last bullet. If I allocate with a layout with alignment a, then it should be legal for me to deallocate with alignment b < a because an object which is aligned to a is also aligned to b, and thus a layout with alignment b fits an object allocated with a layout with alignment a (and with the same size).

What this means is that you should be able to allocate with an alignment which is greater than the minimum alignment required for a particular type and then allow some other code to deallocate with the minimum alignment, and it should work.

glaebhoerl commented 6 years ago

Isn't the possible optimization that some pointer-size-types can be zero sized? (or something like that?)

There was an RFC for this recently and it seems very unlikely that it could be done due to compatibility concerns: https://github.com/rust-lang/rfcs/pull/2040

For example Box<Both> does not cast to Box<OnlyOneTrait>. I'm sure we could wait for language features to smooth all this out, but it seems like those are a long way off, at best.

Trait object upcasting on the other hand seems uncontroversially desirable, and mostly a question of effort / bandwidth / willpower to get it implemented. There was a thread recently: https://internals.rust-lang.org/t/trait-upcasting/5970

retep998 commented 6 years ago

@ruuda I was the one who wrote that alloc_system implementation originally. alexcrichton merely moved it around during the great allocator refactors of <time period>.

The current implementation requires that you deallocate with the same alignment specified that you allocated a given memory block with. Regardless of what the documentation may claim, this is the current reality that everyone must abide by until alloc_system on Windows is changed.

Allocations on Windows always use a multiple of MEMORY_ALLOCATION_ALIGNMENT (although they remember the size you allocated them with to the byte). MEMORY_ALLOCATION_ALIGNMENT is 8 on 32bit and 16 on 64bit. For overaligned types, because the alignment is greater than MEMORY_ALLOCATION_ALIGNMENT, the overhead caused by alloc_system is consistently the amount of alignment specified, so a 64byte aligned allocation would have 64 bytes of overhead.

If we decided to extend that overaligned trick to all allocations (which would get rid of the requirement to deallocate with the same alignment that you specified when allocating), then more allocations would have overhead. Allocations whose alignments are identical to MEMORY_ALLOCATION_ALIGNMENT will suffer a constant overhead of MEMORY_ALLOCATION_ALIGNMENT bytes. Allocations whose alignments are less than MEMORY_ALLOCATION_ALIGNMENT will suffer an overhead of MEMORY_ALLOCATION_ALIGNMENT bytes approximately half of the time. If the size of the allocation rounded up to MEMORY_ALLOCATION_ALIGNMENT is greater than or equal to the size of the allocation plus the size of a pointer, then there is no overhead, otherwise there is. Considering 99.99% of allocations will not be overaligned, do you really want to incur that sort of overhead on all those allocations?

alexcrichton commented 6 years ago

@ruuda

I personally feel that the implementation of alloc_system today on Windows is a bigger benefit than having the ability to relinquish ownership of an allocation to another container like Vec. AFAIK though there's no data to measure the impact of always padding with the alignment and not requiring an alignment on deallocation.

@joshlf

I think that comment is wrong, as pointed out alloc_system on Windows relies on the same alignment being passed to deallocation as was passed on allocation.

ruuda commented 6 years ago

Considering 99.99% of allocations will not be overaligned, do you really want to incur that sort of overhead on all those allocations?

It depends on the application whether the overhead is significant, and whether to optimize for memory or performance. My suspicion is that for most applications either is fine, but a small minority cares deeply about memory, and they really cannot afford those extra bytes. And another small minority needs control over alignment, and they really need it.

joshlf commented 6 years ago

@alexcrichton

I think that comment is wrong, as pointed out alloc_system on Windows relies on the same alignment being passed to deallocation as was passed on allocation.

Doesn't that imply that alloc_system on Windows doesn't actually properly implement the Alloc trait (and thus maybe we ought to change the requirements of the Alloc trait)?

@retep998

If I'm reading your comment correctly, isn't that alignment overhead present for all allocations regardless of whether we need to be able to deallocate with a different alignment? That is, if I allocate 64 bytes with 64 byte alignment and I also deallocate with 64 byte alignment, the overhead you described is still present. Thus, it's not a feature of being able to deallocate with different alignments so much as it is a feature of requesting larger-than-normal alignments.

retep998 commented 6 years ago

@joshlf The overhead caused by alloc_system currently is due to requesting larger-than-normal alignments. If your alignment is less than or equal to MEMORY_ALLOCATION_ALIGNMENT, then there is no overhead caused by alloc_system.

However, if we changed the implementation to allow deallocating with different alignments, then the overhead would apply to nearly all allocations, regardless of alignment.

joshlf commented 6 years ago

Ah I see; makes sense.

dtolnay commented 6 years ago

What is the meaning of implementing Alloc for both Heap and &Heap? In what cases would the user use one of those impls vs the other?

Is this the first standard library API in which *mut u8 would mean "pointer to whatever"? There is String::from_raw_parts but that one really does mean pointer to bytes. I am not a fan of *mut u8 meaning "pointer to whatever" -- even C does better. What are some other options? Maybe a pointer to opaque type would be more meaningful.

@rfcbot concern *mut u8

alexcrichton commented 6 years ago

@dtolnay Alloc for Heap is sort of "standard" and Alloc for &Heap is like Write for &T where the trait requires &mut self but the implementation does not. Notably that means that types like Heap and System are threadsafe and do not need to be synchronized when allocating.

More importantly, though, usage of #[global_allocator] requires that the static it's attached to , which has type T, to have Alloc for &T. (aka all global allocators must be threadsafe)

For *mut u8 I think that *mut () may be interesting, but I don't personally feel too compelled to "get this right" here per se.

Amanieu commented 6 years ago

The main advantage of *mut u8 is that it is very convenient to use .offset with byte offsets.

joshlf commented 6 years ago

For *mut u8 I think that *mut () may be interesting, but I don't personally feel too compelled to "get this right" here per se.

If we go with *mut u8 in a stable interface, aren't we locking ourselves in? In other words, once we stabilize this, we won't have a chance to "get this right" in the future.

Also, *mut () seems a bit dangerous to me in case we ever make an optimization like RFC 2040 in the future.

joshlf commented 6 years ago

The main advantage of *mut u8 is that it is very convenient to use .offset with byte offsets.

True, but you could easily do let ptr = (foo as *mut u8) and then go about your merry way. That doesn't seem like enough of a motivation to stick with *mut u8 in the API if there are compelling alternatives (which, to be fair, I'm not sure there are).

cramertj commented 6 years ago

Also, *mut () seems a bit dangerous to me in case we ever make an optimization like RFC 2040 in the future.

That optimization will already probably never happen-- it'd break too much existing code. Even if it did, it would be applied to &() and &mut (), not *mut ().

rust-lang / rust

Allocator traits and std::heap #32838