Open mrboojum opened 5 months ago
Well, a slightly kludgy solution is to invoke all those functions one by one :-) (they are extremely cheap)
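For illustration, here is a minimal, self-contained sketch of that warm-up idea. It is not Arrow code: `type_registry()` and `default_name()` are hypothetical stand-ins for library accessors backed by function-local statics, and `HeapId`, `g_current_heap` and `warm_up_statics()` are made-up names. The replaced `operator new` only counts allocations per heap tag and falls back to `malloc`, where a real application would delegate to its NUMA-aware heaps. It also assumes single-threaded start-up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <vector>

// Hypothetical heap tags; a real application would map these to NUMA-local arenas.
enum HeapId { kDefaultHeap = 0, kStaticsHeap = 1, kNumHeaps = 2 };

// Heap that the replaced operator new currently charges allocations to.
static HeapId g_current_heap = kDefaultHeap;
static std::size_t g_alloc_count[kNumHeaps] = {0, 0};

// Replaced global operator new: counts per heap tag, then falls back to malloc.
void* operator new(std::size_t size) {
  ++g_alloc_count[g_current_heap];
  if (void* p = std::malloc(size ? size : 1)) return p;
  throw std::bad_alloc();
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Stand-ins for library accessors backed by function-local statics.
const std::vector<int>& type_registry() {
  static const std::vector<int> registry(64, 0);  // allocates on first call only
  return registry;
}
const std::string& default_name() {
  static const std::string name(100, 'x');  // too long for small-string storage
  return name;
}

// Call every accessor once while the statics heap is selected, so that the
// first-use allocations are charged to that heap.
void warm_up_statics() {
  const HeapId previous = g_current_heap;
  g_current_heap = kStaticsHeap;
  type_registry();
  default_name();
  g_current_heap = previous;
  // Sanity check (debug builds): the warm-up did trigger allocations.
  assert(g_alloc_count[kStaticsHeap] > 0);
}
```

Calling `warm_up_statics()` once at start-up charges both first-use allocations to `kStaticsHeap`; later calls to the accessors no longer allocate. The weakness is the one this thread is about: the list of accessors has to be maintained by hand.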
But what I don't understand is why you would want to statically initialize all these, if your memory allocator is NUMA-aware anyway.
Thanks for the quick response. Do I understand correctly that Arrow doesn't have a function/method for initializing the static variables?
Doing this on the application side is indeed not desirable due to not being able to check completeness of all allocations for static variables and maintainability. Regarding completeness, we now have the list below; can you indicate whether it's complete (we no longer detect any issues)?
Regarding memory allocation in general, I have some more questions: Is it possible/common practice to provide your own subclass of MemoryPool to pass to the API methods (assuming it's passed on)?
> Doing this on the application side is indeed not desirable due to not being able to check completeness of all allocations for static variables and maintainability.
What do you mean with "check completeness" exactly?
> Is it possible/common practice to provide your own subclass of MemoryPool to pass to the API methods (assuming it's passed on)?
Not very common, but metadata allocations (such as data types) go directly to the standard C++ allocator anyway.
What do you mean with "check completeness" exactly? -> With completeness I mean that we don't know whether the above list of function calls triggers all allocations for static variables in the Arrow library. If Arrow functionality somewhere in the application later triggers an allocation for a different static variable (one not yet in the list above), the result is heap leaks or cross-NUMA memory access.
"but metadata allocations (such as data types) go directly to the standard C++ allocator anyway." Standard C++ allocators will invoke operator new, which is overloaded by the app. The reason I ask about providing an implementation of MemoryPool is twofold:
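One way to probe for statics missing from such a list is to measure whether a call that should be allocation-neutral leaves memory behind. The sketch below is self-contained and does not use Arrow: `warmed_registry()` and `forgotten_registry()` are hypothetical accessors, and `g_live`, `retained_allocations()` and `expect_fully_warmed()` are made-up helper names. It assumes a single thread and replaces the global `operator new`/`operator delete` only to keep a live-allocation counter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Number of allocations currently live (allocations minus deallocations).
static long g_live = 0;

void* operator new(std::size_t size) {
  void* p = std::malloc(size ? size : 1);
  if (!p) throw std::bad_alloc();
  ++g_live;
  return p;
}
void operator delete(void* p) noexcept {
  if (p) { --g_live; std::free(p); }
}
void operator delete(void* p, std::size_t) noexcept {
  if (p) { --g_live; std::free(p); }
}

// Hypothetical accessors: one that the warm-up list covers, one that it misses.
const std::vector<int>& warmed_registry() {
  static const std::vector<int> registry(64, 0);
  return registry;
}
const std::vector<int>& forgotten_registry() {
  static const std::vector<int> registry(64, 0);
  return registry;
}

// Runs fn and returns how many allocations it left live afterwards.
// A nonzero result for a call that returns no owned memory suggests that
// something was retained, for example a lazily initialized static.
template <typename Fn>
long retained_allocations(Fn&& fn) {
  const long before = g_live;
  fn();
  return g_live - before;
}

// Debug-build guard: aborts if fn still triggers a retained allocation.
template <typename Fn>
void expect_fully_warmed(Fn&& fn) {
  const long retained = retained_allocations(fn);
  assert(retained == 0);
  (void)retained;
}
```

After `warmed_registry()` has been called once, `retained_allocations([] { warmed_registry(); })` reports nothing retained, while the first call through `forgotten_registry()` reports one retained allocation. This only flags code paths that a test actually exercises, so it narrows the completeness problem rather than solving it.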
TLDR: How can I trigger static initialization of the Arrow C++ library?
Context: We use Apache Arrow in a C++ application running on large multi-NUMA machines. The application has its own heap framework (delegating actual allocation to je_malloc/OS heaps) to prevent or minimize cross-NUMA access. We would like to be able to trigger all allocations needed for static variables on a specific heap.
Example: Is there a general way to trigger all allocations needed for static variables like the ones below?
Component(s)
C++