[Improvement] Introduce local allocation buffer to store blocks in memory

xianjingfeng commented 5 months ago

Code of Conduct

[X] I agree to follow this project's Code of Conduct

Search before asking

[X] I have searched in the issues and found no similar issues.

What would you like to be improved?

Currently, the shuffle server stores shuffle data in off-heap memory, but I found that it still occupies a lot of heap memory. The following is the output of jmap -histo (columns: rank, instance count, total bytes, class name):

   1:     189601376    16684921088  io.netty.buffer.UnpooledByteBufAllocator$InstrumentedUnpooledUnsafeDirectByteBuf
   2:     189860728    15188858240  java.nio.DirectByteBuffer (java.base@11.0.1)
   3:     189605871    13651622712  jdk.internal.ref.Cleaner (java.base@11.0.1)
   4:     189018520    10585037120  org.apache.uniffle.common.ShufflePartitionedBlock
   5:     189605871     7584234840  java.nio.DirectByteBuffer$Deallocator (java.base@11.0.1)

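From the above results, we can see that the main reason for the high heap usage is that there are too many blocks: the instance counts are all around 189 million, so each block appears to carry its own buffer wrapper, DirectByteBuffer, Cleaner, Deallocator, and ShufflePartitionedBlock object on the heap. There are so many blocks because the blocks are very small.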
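To put a number on the per-block cost, the histogram rows can be divided out. The sketch below is only arithmetic on the five rows quoted above; the class and method names are made up for illustration, and it assumes each block owns roughly one instance of each listed class, which the near-identical instance counts suggest but do not prove.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative arithmetic on the jmap -histo rows quoted in this issue. */
class HistogramOverhead {
    /** One histogram row: instance count and total shallow bytes. */
    static final class Row {
        final long instances;
        final long bytes;

        Row(long instances, long bytes) {
            this.instances = instances;
            this.bytes = bytes;
        }

        /** Shallow size of a single instance of this class. */
        long bytesPerInstance() {
            return bytes / instances;
        }
    }

    /** The five rows from the issue, keyed by simple class name. */
    static Map<String, Row> rows() {
        Map<String, Row> rows = new LinkedHashMap<>();
        rows.put("InstrumentedUnpooledUnsafeDirectByteBuf", new Row(189601376L, 16684921088L));
        rows.put("DirectByteBuffer", new Row(189860728L, 15188858240L));
        rows.put("Cleaner", new Row(189605871L, 13651622712L));
        rows.put("ShufflePartitionedBlock", new Row(189018520L, 10585037120L));
        rows.put("DirectByteBuffer$Deallocator", new Row(189605871L, 7584234840L));
        return rows;
    }

    /** Heap bytes per block, ASSUMING one instance of each class per block. */
    static long heapBytesPerBlock() {
        long total = 0;
        for (Row row : rows().values()) {
            total += row.bytesPerInstance();
        }
        return total;
    }

    /** Total heap bytes held by the five classes together. */
    static long totalHeapBytes() {
        long total = 0;
        for (Row row : rows().values()) {
            total += row.bytes;
        }
        return total;
    }
}
```

Under that assumption, `heapBytesPerBlock()` comes to 336 bytes (88 + 80 + 72 + 56 + 40) and `totalHeapBytes()` to about 63.7 GB of heap for these five classes alone, independent of how little payload each block holds.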

How should we improve?

Introduce a local allocation buffer, like MSLAB in HBase. See: https://hbase.apache.org/book.html#gcpause
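As a rough illustration of the idea (not Uniffle's or HBase's actual code), a local allocation buffer copies many small blocks into a few large direct chunks and keeps only an offset and length per block. Everything in this sketch is hypothetical: the names `LocalAllocationBuffer` and `Slot`, the chunk size, and the single-threaded design are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical MSLAB-style allocator: packs small blocks into large shared direct chunks. */
class LocalAllocationBuffer {
    /** Location of one block inside a chunk; three ints instead of a per-block direct buffer. */
    static final class Slot {
        final int chunkIndex;
        final int offset;
        final int length;

        Slot(int chunkIndex, int offset, int length) {
            this.chunkIndex = chunkIndex;
            this.offset = offset;
            this.length = length;
        }
    }

    private final int chunkSize;
    private final List<ByteBuffer> chunks = new ArrayList<>();

    LocalAllocationBuffer(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    /** Copies data into the current chunk, opening a new chunk when it does not fit. Not thread-safe. */
    Slot append(byte[] data) {
        if (data.length > chunkSize) {
            // A real implementation would likely store oversized blocks on their own.
            throw new IllegalArgumentException("block is larger than a chunk");
        }
        ByteBuffer current = chunks.isEmpty() ? null : chunks.get(chunks.size() - 1);
        if (current == null || current.remaining() < data.length) {
            current = ByteBuffer.allocateDirect(chunkSize);
            chunks.add(current);
        }
        int offset = current.position();
        current.put(data);
        return new Slot(chunks.size() - 1, offset, data.length);
    }

    /** Reads a block back through a duplicate view, leaving the chunk's write position untouched. */
    byte[] read(Slot slot) {
        ByteBuffer view = chunks.get(slot.chunkIndex).duplicate();
        view.position(slot.offset);
        view.limit(slot.offset + slot.length);
        byte[] out = new byte[slot.length];
        view.get(out);
        return out;
    }

    /** Number of direct chunks allocated so far. */
    int chunkCount() {
        return chunks.size();
    }
}
```

With this layout the number of DirectByteBuffer, Cleaner, and Deallocator objects scales with the number of chunks rather than the number of blocks. The trade-off is an extra copy per block, and a chunk can only be released once every block in it is no longer needed.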

Are you willing to submit PR?

[X] Yes I am willing to submit a PR!

xianjingfeng commented 5 months ago

@jerqi @zuston @advancedxy @rickyma PTAL. I've been quite busy recently. If anyone is interested in this, feel free to pick it up.

rickyma commented 5 months ago

This issue seems feasible. I'll take a look first. We need this too.

Currently, there are a few things that we can do to make blocks larger:

Set spark.rss.writer.buffer.spill.size to a higher value, e.g. 1g or 2g, to make blocks larger.
Set rss.client.memory.spill.ratio to a value below 0.5, e.g. 0.3, so that larger blocks spill first.
Set spark.rss.writer.buffer.size to a larger value, e.g. 10m; see https://github.com/apache/incubator-uniffle/issues/1594#issuecomment-2081378887.
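For reference, the three suggestions above can be collected in one place. This is only a sketch: the keys and example values are copied verbatim from the comment, while the class name and the `key=value` rendering are made up for illustration. Where the settings actually go (for example spark-defaults.conf or job submission options), and whether the second key needs a `spark.` prefix there, is not stated in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Illustrative holder for the client-side settings suggested in this thread. */
class BlockSizeTuning {
    /** Keys and example values exactly as written in the comment above. */
    static Map<String, String> suggestedSettings() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.rss.writer.buffer.spill.size", "1g"); // comment suggests 1g or 2g
        conf.put("rss.client.memory.spill.ratio", "0.3");     // anything below 0.5
        conf.put("spark.rss.writer.buffer.size", "10m");
        return conf;
    }

    /** Renders the settings as one key=value pair per line, in insertion order. */
    static String render(Map<String, String> conf) {
        StringJoiner lines = new StringJoiner("\n");
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            lines.add(entry.getKey() + "=" + entry.getValue());
        }
        return lines.toString();
    }
}
```

These values are starting points from the comment rather than tested recommendations; larger writer buffers trade client memory for fewer, larger blocks on the shuffle server.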
