Spark reducers use Netty direct memory when fetching shuffle data. By default the direct memory limit equals spark.executor.memory (the JVM heap size).
Arrow can use either arrow-memory-netty or arrow-memory-unsafe as its allocation backend; we currently use arrow-memory-unsafe.
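A minimal Scala sketch of what this means in practice, assuming Arrow's Java memory API (RootAllocator) is available and arrow-memory-unsafe is on the classpath; the buffer sizes are placeholders and this is illustrative, not Gluten's actual allocation path:

import org.apache.arrow.memory.{BufferAllocator, RootAllocator}

// The backing implementation is picked by whichever arrow-memory-* module is on the
// classpath (arrow-memory-unsafe here); the allocation API itself is the same.
val allocator: BufferAllocator = new RootAllocator(64L * 1024 * 1024) // 64 MiB limit
val buf = allocator.buffer(1024)      // off-heap ArrowBuf
buf.getReferenceManager.release()     // free the buffer
allocator.close()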
Netty direct memory is off-heap: it eventually goes through the Unsafe API (unsafe.allocateMemory(size)), wrapped by the Netty library, and its limit is controlled by -Xmx and -XX:MaxDirectMemorySize.
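A small sketch, assuming Netty is on the classpath, showing that the direct-memory cap Netty observes comes from -XX:MaxDirectMemorySize (falling back to the -Xmx value when unset) and that its buffers live off-heap:

import io.netty.buffer.PooledByteBufAllocator
import io.netty.util.internal.PlatformDependent

// Effective direct-memory cap seen by Netty: -XX:MaxDirectMemorySize if set,
// otherwise it falls back to the heap size given by -Xmx.
println(s"max direct memory = ${PlatformDependent.maxDirectMemory()} bytes")

// Netty pools off-heap buffers; the memory is ultimately reserved via Unsafe.
val buf = PooledByteBufAllocator.DEFAULT.directBuffer(1 << 20) // 1 MiB off-heap
buf.release()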
With Gluten, reducers still use Netty direct memory to fetch shuffle data, so for large clusters we need one of the following (see the sketch after this list):
Configure a larger spark.executor.memory, OR
Decrease spark.reducer.maxBlocksInFlightPerAddress or spark.reducer.maxReqsInFlight, OR
Configure spark.executor.extraJavaOptions with -XX:MaxDirectMemorySize, and count it into the memory overhead.
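A hedged example of the three options expressed as Spark configuration; the values are placeholders, and in practice they would go into spark-defaults.conf or onto the spark-submit command line so they take effect before executors launch:

import org.apache.spark.SparkConf

val conf = new SparkConf()
  // option 1: larger executor heap, which also raises the default direct-memory cap
  .set("spark.executor.memory", "16g")
  // option 2: throttle concurrent shuffle fetches to lower peak direct-memory use
  .set("spark.reducer.maxBlocksInFlightPerAddress", "64")
  .set("spark.reducer.maxReqsInFlight", "64")
  // option 3: cap direct memory explicitly and budget it in the off-heap overhead
  .set("spark.executor.extraJavaOptions", "-XX:MaxDirectMemorySize=4g")
  .set("spark.executor.memoryOverhead", "6g")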