Open jhorstmann opened 1 year ago
As @tustvold, @waynexia and I are discovering on https://github.com/apache/arrow-rs/pull/6336, adding the APIs to Buffer and MutableBuffer is just the start -- to really achieve the use case I think we would need to preserve the allocation information through all the various kernels / transformations that arrow-rs provides.
Also, @haohuaijin offers another potential use case for this feature in https://github.com/apache/arrow-rs/issues/6439: "accurately track total memory used by multiple arrow arrays that may share the same underlying Buffers (e.g. that were sliced, etc)".
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
In #3920 and #3917 support was added to create buffers from standard Rust vectors. The currently unstable allocator_api feature extends Vec to support custom allocators, using functions such as new_in. In our product we are using such custom allocators to track the memory usage of individual queries. I'd like to add a similarly named feature to arrow-rs which would generalize the Buffer::from_vec and MutableBuffer::from_vec functions.
Describe the solution you'd like
- Add an allocator-api feature to arrow-buffer
- When the feature is enabled, a variant of from_vec would be enabled (via cfg attribute), which has a generic parameter for the allocator
- Deallocation::Standard would additionally store the allocator (note the default Global allocator is zero-sized)
Describe alternatives you've considered
Something similar can be achieved using Buffer::from_custom_allocation, but that requires unsafe code and dealing with raw pointers.
Additional context
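To make the idea concrete, here is a minimal sketch of a buffer that stores its allocator next to the layout so that a per-query allocator can account for the memory. Since the real Allocator trait is nightly-only, the sketch uses a hypothetical stable stand-in; SketchAllocator, SketchGlobal, TrackingAllocator and SketchBuffer are all made-up names for illustration and are not part of arrow-rs or std.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::mem::size_of;
use std::ptr::NonNull;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the unstable `std::alloc::Allocator` trait,
/// reduced to the two calls this sketch needs.
pub trait SketchAllocator {
    fn allocate(&self, layout: Layout) -> NonNull<u8>;
    /// Safety: `ptr` must come from `allocate` on this allocator with the same `layout`.
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout);
}

/// Zero-sized allocator playing the role of `Global` (assumption: forwards to the system allocator).
#[derive(Clone, Copy, Default)]
pub struct SketchGlobal;

impl SketchAllocator for SketchGlobal {
    fn allocate(&self, layout: Layout) -> NonNull<u8> {
        assert!(layout.size() > 0, "zero-sized allocations are not handled in this sketch");
        // SAFETY: the layout has a non-zero size, as checked above.
        let ptr = unsafe { alloc(layout) };
        NonNull::new(ptr).expect("allocation failed")
    }
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        // SAFETY: the caller guarantees `ptr` was allocated with `layout`.
        unsafe { dealloc(ptr.as_ptr(), layout) }
    }
}

/// Allocator handle that counts live bytes, e.g. one handle per query.
#[derive(Clone, Default)]
pub struct TrackingAllocator {
    bytes_in_use: Arc<AtomicUsize>,
}

impl TrackingAllocator {
    pub fn bytes_in_use(&self) -> usize {
        self.bytes_in_use.load(Ordering::Relaxed)
    }
}

impl SketchAllocator for TrackingAllocator {
    fn allocate(&self, layout: Layout) -> NonNull<u8> {
        let ptr = SketchGlobal.allocate(layout);
        self.bytes_in_use.fetch_add(layout.size(), Ordering::Relaxed);
        ptr
    }
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        // SAFETY: forwarded under the same contract as the trait method.
        unsafe { SketchGlobal.deallocate(ptr, layout) };
        self.bytes_in_use.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

/// Buffer that keeps the allocator alongside the layout, so dropping it
/// returns the memory to the allocator that produced it.
pub struct SketchBuffer<A: SketchAllocator> {
    ptr: NonNull<u8>,
    layout: Layout,
    allocator: A,
}

impl<A: SketchAllocator> SketchBuffer<A> {
    pub fn with_capacity_in(capacity: usize, allocator: A) -> Self {
        let layout = Layout::array::<u8>(capacity).expect("capacity overflow");
        let ptr = allocator.allocate(layout);
        Self { ptr, layout, allocator }
    }
    pub fn capacity(&self) -> usize {
        self.layout.size()
    }
}

impl<A: SketchAllocator> Drop for SketchBuffer<A> {
    fn drop(&mut self) {
        // SAFETY: `ptr` was returned by `self.allocator.allocate(self.layout)`.
        unsafe { self.allocator.deallocate(self.ptr, self.layout) }
    }
}

/// Returns the tracked bytes while a 1024-byte buffer is alive and after it is dropped.
pub fn tracked_bytes_demo() -> (usize, usize) {
    let query_allocator = TrackingAllocator::default();
    let buffer = SketchBuffer::with_capacity_in(1024, query_allocator.clone());
    let live = query_allocator.bytes_in_use();
    drop(buffer);
    (live, query_allocator.bytes_in_use())
}

/// Size of the zero-sized allocator marker; storing it adds no bytes in practice.
pub fn global_marker_size() -> usize {
    size_of::<SketchGlobal>()
}
```

With a zero-sized allocator such as SketchGlobal, the extra field should cost nothing in practice, which mirrors the note above about Global. Only buffers created with a stateful allocator such as TrackingAllocator would pay for the extra handle.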