Open ronawho opened 7 years ago
https://github.com/chapel-lang/chapel/pull/16110 converted over to the extended API, so this should be pretty straightforward to add to the runtime interface now.
To avoid correctness issues from getting the size wrong, I think we should get https://github.com/chapel-lang/chapel/issues/13661 done first.
Our memory interface exposes the standard C malloc/calloc/realloc/aligned_alloc/free routines with the addition of a good_alloc_size routine.
We should explore supporting sized deallocation as well, since it can provide non-trivial performance benefits as noted in the C++ proposal N3778.
I also think this SO post is a decent intro for why sized deallocation can improve performance.
Supporting sized deallocation would require switching over to jemalloc's extended API instead of the standard one. The main disadvantage of the extended API is that it has undefined behavior for size 0 allocations/reallocations and for NULL pointer reallocations/frees, so we would have to check for those conditions ourselves.
Note that for 0 sized allocations, most malloc implementations will give you a minimum size allocation. jemalloc does something like
and the libc malloc has a comment that says
Doing something similar in our shim would simplify the changes required.
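To make the checks concrete, here is a rough sketch of what such a shim could look like. The `ext_*` functions are hypothetical stand-ins built on libc so the example compiles on its own; in the runtime they would presumably map to jemalloc's `mallocx`/`rallocx`/`sdallocx`. The `shim_*` names are made up for illustration and are not the existing Chapel memory interface. Bumping size 0 to 1 is an assumption about what "minimum size allocation" could mean here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for an extended allocator API. Like the extended
 * API described above, they treat size 0 and NULL pointers as caller
 * errors (modeled here with asserts instead of undefined behavior). */
static void* ext_alloc(size_t size) {
  assert(size != 0);
  return malloc(size);
}

static void* ext_realloc(void* ptr, size_t size) {
  assert(ptr != NULL && size != 0);
  return realloc(ptr, size);
}

static void ext_sized_free(void* ptr, size_t size) {
  assert(ptr != NULL && size != 0);
  (void)size; /* a real sized free would use this to skip a size lookup */
  free(ptr);
}

/* Round a 0 sized request up to a minimum size allocation. Alloc and
 * sized free must agree on this so the size passed to free matches. */
static size_t shim_fix_size(size_t size) {
  return size == 0 ? 1 : size;
}

void* shim_malloc(size_t size) {
  return ext_alloc(shim_fix_size(size));
}

void* shim_realloc(void* ptr, size_t size) {
  if (ptr == NULL) {
    /* realloc(NULL, n) behaves like malloc(n) */
    return shim_malloc(size);
  }
  return ext_realloc(ptr, shim_fix_size(size));
}

void shim_sized_free(void* ptr, size_t size) {
  if (ptr == NULL) {
    /* free(NULL) is a no-op */
    return;
  }
  ext_sized_free(ptr, shim_fix_size(size));
}
```

Treating size 0 as a minimum size request in one helper keeps the allocation and sized deallocation paths consistent, which is where getting the size wrong would otherwise bite.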
And beyond changing our mem interface, we'd need the compiler to emit calls to the sized deallocation routines when it can determine the size at compilation time. Before digging into this, we should probably measure how much of a performance benefit we'd actually get. We could hack binary trees to estimate the potential gain.