facebookincubator / velox

A composable and fully extensible C++ execution engine library for data management systems.
https://velox-lib.io/
Apache License 2.0
3.51k stars 1.15k forks

Large memory allocations which are not tracked #11099

Open FelixYBW opened 1 month ago

FelixYBW commented 1 month ago

Description

In Gluten, one common issue is "killed by YARN". The root cause is that some memory allocations, usually std::vector, bypass memory pool tracking. Some of these vectors can be very large, such as the commonly used per-row std::vector<char*>. With 1G rows, that vector can be as large as 8 GB (one 8-byte pointer per row).
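The size estimate above is plain arithmetic: one pointer per row. A minimal sketch follows; the helper name `perRowVectorBytes` is made up for illustration and is not part of Velox or Gluten. It counts only the element storage and ignores any extra capacity the vector may reserve.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes of element storage for a std::vector<char*>
// that holds one pointer per row. Capacity over-allocation is ignored.
constexpr std::uint64_t perRowVectorBytes(std::uint64_t numRows) {
  return numRows * sizeof(char*);
}
```

On a typical 64-bit build, `perRowVectorBytes(1ULL << 30)` is 8 GiB, none of which the memory pool sees if the vector uses the default allocator.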

The ideal solution is to avoid per-row vectors as much as possible. If we have to use one, it's better to track it in the memory pool by using std::vector<char, memory::StlAllocator<char>> and std::allocate_shared.
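To illustrate the idea of routing a vector's storage through a tracked allocator, here is a self-contained sketch. `FakePool` and `TrackingAllocator` are hypothetical stand-ins written for this example. They are not Velox's memory pool or `memory::StlAllocator`, whose real interfaces may differ. The sketch only shows the mechanism: the allocator is passed to the vector and to `std::allocate_shared`, so every byte is charged to the pool.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

// Hypothetical stand-in for a memory pool: it only counts live bytes.
struct FakePool {
  std::size_t usedBytes = 0;
};

// Hypothetical minimal allocator that reports every allocation to a FakePool.
template <typename T>
struct TrackingAllocator {
  using value_type = T;

  explicit TrackingAllocator(FakePool* p) noexcept : pool(p) {}

  // Converting constructor, needed when containers rebind the allocator.
  template <typename U>
  TrackingAllocator(const TrackingAllocator<U>& other) noexcept
      : pool(other.pool) {}

  T* allocate(std::size_t n) {
    pool->usedBytes += n * sizeof(T);
    return static_cast<T*>(::operator new(n * sizeof(T)));
  }

  void deallocate(T* p, std::size_t n) noexcept {
    pool->usedBytes -= n * sizeof(T);
    ::operator delete(p);
  }

  FakePool* pool;
};

template <typename T, typename U>
bool operator==(const TrackingAllocator<T>& a,
                const TrackingAllocator<U>& b) noexcept {
  return a.pool == b.pool;
}

template <typename T, typename U>
bool operator!=(const TrackingAllocator<T>& a,
                const TrackingAllocator<U>& b) noexcept {
  return !(a == b);
}

using RowAllocator = TrackingAllocator<char*>;
using RowVector = std::vector<char*, RowAllocator>;

// Bytes charged to the pool while a per-row pointer vector is alive.
std::size_t trackedBytesForRows(FakePool& pool, std::size_t numRows) {
  RowVector rows(numRows, RowAllocator(&pool));
  return pool.usedBytes;
}

// Same, but the vector object and its control block are also allocated
// through the tracking allocator via std::allocate_shared.
std::size_t trackedBytesForSharedRows(FakePool& pool, std::size_t numRows) {
  auto rows = std::allocate_shared<RowVector>(
      RowAllocator(&pool), numRows, RowAllocator(&pool));
  return pool.usedBytes;
}
```

In both functions the pool's counter returns to zero once the vector is destroyed, which is the property a real memory pool relies on to enforce limits and trigger spilling.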

This is an umbrella issue to track such allocations. Link to the Gluten issue: https://github.com/apache/incubator-gluten/issues/6947

FelixYBW commented 1 month ago

#11077 removes the per-row vector allocation.

FelixYBW commented 1 month ago

https://github.com/facebookincubator/velox/blob/9e1280a7223fa324b8307d379605bcfc89f8447a/velox/exec/SortBuffer.cpp#L121

Per-row memory allocation. The sorted rows are passed to std::sort. It looks like we can't assign an allocator here.

FelixYBW commented 1 month ago

https://github.com/facebookincubator/velox/blob/9e1280a7223fa324b8307d379605bcfc89f8447a/velox/exec/SortBuffer.cpp#L320

Potential per-row allocation. Spill has already been triggered at this point, so we should allocate it using spill_memorypool.