microsoft / mimalloc

mimalloc is a compact general purpose allocator with excellent performance.
MIT License

Return usable allocation size on allocation #2

Open gnzlbg opened 5 years ago

gnzlbg commented 5 years ago

jemalloc provides smallocx (e.g. see implementation here: https://github.com/jemalloc/jemalloc/blob/dev/src/jemalloc.c#L3015), which returns not only a pointer to the allocation but also the size of its size class (that is, the usable allocation size).

It would be nice if for all allocating and resizing operations a similar API could be provided, so that users that might want to use all usable space don't have to query the size class.

daanx commented 5 years ago

Thanks for the suggestion! Currently, mimalloc provides two functions for this, mi_usable_size and mi_good_size, see: https://github.com/microsoft/mimalloc/blob/master/include/mimalloc.h#L107

I think it is better to have these as separate functions to avoid duplicating all allocation API functions.

Soon, I'll have the full doxygen docs online, which should give an easier overview of the whole API. Thanks!

gnzlbg commented 5 years ago

Yes, I'm currently using mi_good_size followed by an allocation function for that.

This means that mimalloc allocation functions are only called with "good" sizes. If mimalloc ever supports reporting statistics, e.g., to allow users to tune size classes, the information reported for my applications would be useless, because they would all appear to be using the size classes perfectly.

Also, with the mi_good_size + allocation API approach, the size class of the allocation is computed twice, once in mi_good_size, and another time inside the allocation API. This isn't significant, but is unnecessary.
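To make the double computation concrete, here is a minimal sketch using a toy allocator. Everything in it is a hypothetical stand-in rather than mimalloc code: `toy_good_size` plays the role of mi_good_size, `toy_malloc` plays the role of an allocation function, and the power-of-two size classes and the lookup counter exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins, NOT mimalloc's implementation: a toy allocator
   whose size classes are powers of two, plus a counter that records how
   often a size class is computed. */
static unsigned toy_class_lookups = 0;

static size_t toy_size_class(size_t size) {
    assert(size <= (SIZE_MAX >> 1));  /* keeps the doubling below from overflowing */
    toy_class_lookups++;
    size_t cls = 8;                   /* smallest toy size class */
    while (cls < size) cls *= 2;      /* round up to the next power of two */
    return cls;
}

/* Plays the role of mi_good_size: report the size that would be allocated. */
static size_t toy_good_size(size_t size) { return toy_size_class(size); }

/* Plays the role of an allocation function: it has to find the class again. */
static void *toy_malloc(size_t size) { return malloc(toy_size_class(size)); }

/* The two-step pattern described above. */
static void *alloc_all_usable(size_t wanted, size_t *usable) {
    *usable = toy_good_size(wanted);  /* first size-class computation */
    return toy_malloc(*usable);       /* second one, on an already "good" size */
}
```

In this toy model, a call such as `alloc_all_usable(100, &n)` sets `n` to 128 and bumps the counter twice. The allocator itself only ever sees the request for 128 bytes, which is the statistics problem mentioned earlier.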

One way to solve this is to not expose a mi_good_size function, but instead, have all allocation APIs return both the pointer to the allocation and the allocation size. Most ABIs provide two return registers, so if the allocation function computes the size class, returning it alongside the pointer to the allocation does not really incur a cost. If for whatever reason an execution path within the allocation API does not need to compute the size class, one can always conservatively return the requested size.
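A rough sketch of what such an API could look like, under stated assumptions: the type `sized_ptr` and the function `toy_malloc_sized` are hypothetical names, not part of mimalloc or jemalloc, and the power-of-two size classes are a toy model. As far as I know, on x86-64 System V and AArch64 a struct of one pointer and one `size_t` is returned in two registers, but this is worth checking for each target ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result type: the pointer plus the usable size (two words). */
typedef struct sized_ptr {
    void  *ptr;
    size_t usable;
} sized_ptr;

/* Toy size classes (powers of two, minimum 8); not mimalloc's real classes. */
static size_t toy_size_class(size_t size) {
    assert(size <= (SIZE_MAX >> 1));  /* keeps the doubling below from overflowing */
    size_t cls = 8;
    while (cls < size) cls *= 2;
    return cls;
}

/* Allocate and report the usable size in a single call. The size class is
   computed once, since the allocator needs it anyway. */
static sized_ptr toy_malloc_sized(size_t size) {
    size_t cls = toy_size_class(size);
    sized_ptr r = { malloc(cls), cls };
    if (r.ptr == NULL) r.usable = 0;  /* nothing is usable on failure */
    return r;
}
```

A caller that wants all the usable space reads `r.usable`; a caller that does not care simply uses `r.ptr` and ignores the second field.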

gnzlbg commented 5 years ago

P0901 provides another view on this issue. IMO, it makes a strong case against providing either mi_usable_size or mi_good_size.

daanx commented 5 years ago

Nice reference, and a good argument. I am worried about how many extra versions of each allocation function we get though; mimalloc is supposed to stay "lean and mean" so I am hesitant on this issue and need to consider it further.

gnzlbg commented 5 years ago

Completely understand, take your time and gather feedback from others before deciding on this.

One way to keep the API lean is to not provide two versions of each allocation function in the "Extended" API (one returning void*, and one returning void*,size_t), but instead to provide only functions returning void*,size_t. This should be OK as long as the API returning void*,size_t does not impose a cost over the one that only returns void*. Users who are not interested in the size_t can simply discard it.

I don't know how hard it is to make sure that returning two words does not incur a performance impact over returning just one word. AFAICT, as long as the API has to compute the "good size" internally anyway, it shouldn't have an impact, at least on all ABIs that allow for two return registers. On ABIs that do not support this, returning the two words might require passing the struct via memory. I don't know of any ABI that works like this, so I can't tell you how much of a difference this makes.

gnzlbg commented 4 years ago

Any update on this?