Open zhztheplayer opened 1 month ago
The global memory allocation is used for the spill compression buffer. In Spark, that memory is counted toward the executor's overhead memory. We can do the same in Gluten.
In Spark, the overhead memory includes the reducer's Netty memory, compression buffer memory, and memory allocated by native libraries.
In Gluten, the overhead memory includes the reducer's Netty memory, memory allocated through the global allocator, and all memory used by std:: containers.
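For reference, here is a rough sketch of how large that overhead budget is by default. It assumes Spark's documented defaults (an overhead factor of 0.10 with a 384 MiB floor, unless `spark.executor.memoryOverhead` is set explicitly). The function name and parameters are hypothetical, and Spark itself does this arithmetic in MiB, so the result is only approximate.

```python
MIB = 1024 * 1024


def executor_overhead_bytes(
    executor_memory_bytes,
    overhead_factor=0.10,          # assumed default overhead factor
    min_overhead_bytes=384 * MIB,  # assumed default floor of 384 MiB
    explicit_overhead_bytes=None,  # value of spark.executor.memoryOverhead, if set
):
    """Approximate the executor overhead budget in bytes (hypothetical helper)."""
    # An explicitly configured overhead takes precedence over the factor.
    if explicit_overhead_bytes is not None:
        return explicit_overhead_bytes
    # Otherwise use a fraction of executor memory, but never less than the floor.
    return max(int(executor_memory_bytes * overhead_factor), min_overhead_bytes)


# An 8 GiB executor gets roughly 819 MiB of overhead under these defaults.
print(executor_overhead_bytes(8 * 1024 * MIB) // MIB)
```

Everything listed above (Netty buffers, compression buffers, global-allocator and std:: container memory) would have to fit inside this one budget.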
@marin-ma Is the compression buffer retained in shuffle? If not, we can allocate it with the global memory allocator.
It has been observed that the Velox backend uses more memory than we configured, which is perhaps related to the untracked Velox global memory manager.
We should set a capacity here, based on the Spark overhead size, to limit that memory manager's usage.
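To illustrate the idea only: a minimal sketch of a capacity-limited tracker, where reservations fail once usage would exceed a fixed budget. `CappedMemoryTracker` and its methods are hypothetical names for illustration; this is not the Velox memory manager API, and the real change would be made in the native code.

```python
class CappedMemoryTracker:
    """Hypothetical tracker that rejects reservations beyond a fixed capacity."""

    def __init__(self, capacity_bytes):
        if capacity_bytes < 0:
            raise ValueError("capacity must be non-negative")
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0

    def reserve(self, num_bytes):
        """Record an allocation, failing if it would exceed the capacity."""
        if num_bytes < 0:
            raise ValueError("reservation size must be non-negative")
        if self.used_bytes + num_bytes > self.capacity_bytes:
            raise MemoryError(
                f"reserving {num_bytes} bytes would exceed the capacity of "
                f"{self.capacity_bytes} bytes ({self.used_bytes} in use)"
            )
        self.used_bytes += num_bytes

    def release(self, num_bytes):
        """Record a free; releasing more than is in use is a caller bug."""
        if num_bytes < 0 or num_bytes > self.used_bytes:
            raise ValueError("invalid release size")
        self.used_bytes -= num_bytes


# The capacity here would come from the Spark overhead size (assumed 384 MiB).
tracker = CappedMemoryTracker(capacity_bytes=384 * 1024 * 1024)
tracker.reserve(64 * 1024 * 1024)  # e.g. a spill compression buffer
tracker.release(64 * 1024 * 1024)
```

With a cap like this, over-allocation shows up as an explicit failure inside the process rather than as the container exceeding its memory limit.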