chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

Implement and use sized deallocation #5273

Open ronawho opened 7 years ago

ronawho commented 7 years ago

Our memory interface exposes the standard C malloc/calloc/realloc/aligned_alloc/free routines with the addition of a good_alloc_size routine.

We should explore supporting sized deallocation as well, since it can provide non-trivial performance benefits as noted in the C++ proposal N3778.

I also think this SO post is a decent intro to why sized deallocation can improve performance.

Supporting sized deallocation would require switching over to using jemalloc's extended API instead of the standard one. The main disadvantage of the extended API is that it has undefined behavior for size 0 allocations/reallocations and NULL pointer reallocations/frees, so we would have to check for those conditions ourselves.

Note that for 0 sized allocations, most malloc implementations will give you a minimum size allocation. jemalloc does something like:

if (size == 0) {
  size = 1;
}

and the libc malloc has a comment that says

// Even a request for zero bytes (i.e., malloc(0)) returns a
// pointer to something of the minimum allocatable size.

Doing something similar in our shim would simplify the changes required.
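As a rough illustration of what such a shim could look like, here is a minimal sketch. The names `shim_clamp_size`, `shim_malloc`, and `shim_sized_free` are hypothetical and are not part of our actual mem interface, and the standard `malloc`/`free` stand in for the allocator's extended API so the example compiles without jemalloc.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical shim names, not the actual Chapel runtime interface.
 * Standard malloc/free stand in for the extended allocator API. */

/* Bump zero-byte requests to 1 so the backend never sees size 0. */
size_t shim_clamp_size(size_t size) {
  return size == 0 ? 1 : size;
}

void* shim_malloc(size_t size) {
  return malloc(shim_clamp_size(size));
}

/* Sized free: the caller passes the size it originally requested. */
void shim_sized_free(void* ptr, size_t size) {
  if (ptr == NULL) {
    return; /* the extended API would not tolerate NULL, so filter it here */
  }
  /* Apply the same clamping as shim_malloc so the sizes agree. */
  size = shim_clamp_size(size);
  (void)size; /* a sized backend would consume this; free() ignores it */
  free(ptr);
}
```

Clamping in both the allocation and deallocation paths keeps the size passed at free time consistent with the size actually requested from the backend, which is the property a sized deallocation routine relies on.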

Beyond changing our mem interface, we'd need the compiler to emit calls to the sized deallocation routines when it can determine the size at compilation time. Before digging into this, we should probably measure how much of a performance benefit we'd actually get. We could hack the binary trees benchmark to call a sized deallocation routine by hand and see the potential gain.
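To make the idea concrete, here is a small sketch of what a hand-hacked binary-trees-style test might do: every node has a statically known size, so each free can pass `sizeof(tree_node)`. The names `demo_sized_free`, `tree_node`, `tree_build`, and `tree_destroy` are made up for illustration, and `free` stands in for a real sized deallocation routine.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sized-free entry point; free() stands in for the backend. */
void demo_sized_free(void* ptr, size_t size) {
  (void)size; /* a sized backend would use this to skip the size lookup */
  if (ptr != NULL) {
    free(ptr);
  }
}

typedef struct tree_node {
  struct tree_node* left;
  struct tree_node* right;
} tree_node;

/* Build a complete binary tree with the given depth (depth 0 = one node). */
tree_node* tree_build(int depth) {
  tree_node* n = malloc(sizeof(tree_node));
  if (n == NULL) {
    return NULL;
  }
  n->left = depth > 0 ? tree_build(depth - 1) : NULL;
  n->right = depth > 0 ? tree_build(depth - 1) : NULL;
  return n;
}

/* Free the tree, passing the statically known node size to each free.
 * Returns the number of nodes released. */
size_t tree_destroy(tree_node* n) {
  if (n == NULL) {
    return 0;
  }
  size_t freed = tree_destroy(n->left) + tree_destroy(n->right);
  demo_sized_free(n, sizeof(tree_node));
  return freed + 1;
}
```

Timing a loop of `tree_build`/`tree_destroy` against the same loop using plain `free` would give a first estimate, assuming the backend actually takes advantage of the size.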

ronawho commented 3 years ago

https://github.com/chapel-lang/chapel/pull/16110 converted over to the extended API, so this should be pretty straightforward to add to the runtime interface now.

To avoid correctness issues from getting the size wrong, I think we should get https://github.com/chapel-lang/chapel/issues/13661 done first.