apache / incubator-uniffle

Uniffle is a high performance, general purpose Remote Shuffle Service.
https://uniffle.apache.org/
Apache License 2.0
387 stars 149 forks source link

[#2219] feat: Introduce ShuffleBlockIdManagerFactory and PartitionedShuffleBlockIdManager #2227

Closed maobaolong closed 4 days ago

maobaolong commented 3 weeks ago

What changes were proposed in this pull request?

Why are the changes needed?

Fix: #2219

image

This is the maximum partition num with node metrics in our cluster, it is ~410K, it cost heap size 410K * 50KiB = 2GiB if we say a bitmap related a partition avg 10000 blocks(cost 50KiB heap).

I think this is worth to choose this new policy in our production cluster.

Does this PR introduce any user-facing change?

How was this patch tested?

Test Locally, check result by arthas

[arthas@264]$ vmtool --action getInstances --className org.apache.uniffle.server.ShuffleTaskInfo --express 'instances[0].shuffleBlockIdManager.partitionsToBlockIds.get("application_1729845342052_431031_1730297736664") ' @ConcurrentHashMapForJDK8[ @Integer[0]:@Roaring64NavigableMap[][isEmpty=false;size=10], @Integer[1]:@Roaring64NavigableMap[][isEmpty=false;size=10], @Integer[2]:@Roaring64NavigableMap[][isEmpty=false;size=10], ]

- Config to PartitionedShuffleBlockIdManager

[arthas@264]$ vmtool --action getInstances --className org.apache.uniffle.server.ShuffleTaskInfo --express 'instances[0].shuffleBlockIdManager' @PartitionedShuffleBlockIdManager[ LOG=@Log4jLogger[org.apache.logging.slf4j.Log4jLogger@6aa6fea0], partitionsToBlockIds=@ConcurrentHashMap[isEmpty=false;size=1], ]

[arthas@264]$ vmtool --action getInstances --className org.apache.uniffle.server.ShuffleTaskInfo --express 'instances[0].shuffleBlockIdManager.partitionsToBlockIds' @ConcurrentHashMap[ @String[application_1729845342052_432324_1730299397916]:@ConcurrentHashMapForJDK8[isEmpty=false;size=2], ] [arthas@264]$ vmtool --action getInstances --className org.apache.uniffle.server.ShuffleTaskInfo --express 'instances[0].shuffleBlockIdManager.partitionsToBlockIds.get("application_1729845342052_432324_1730299397916")' @ConcurrentHashMapForJDK8[ @Integer[0]:@ConcurrentHashMap[isEmpty=false;size=2000], @Integer[1]:@ConcurrentHashMap[isEmpty=false;size=2000], ]

- Use client app level config
submit two app with `spark.rss.client.blockIdStrategyClass=org.apache.uniffle.server.block.DefaultShuffleBlockIdManager` and `spark.rss.client.blockIdStrategyClass=org.apache.uniffle.server.block.PartitionedShuffleBlockIdManager`

[arthas@264]$ vmtool --action getInstances --className org.apache.uniffle.server.ShuffleTaskInfo --express 'instances[0].shuffleBlockIdManager' @DefaultShuffleBlockIdManager[ LOG=@Log4jLogger[org.apache.logging.slf4j.Log4jLogger@ed3b52a], partitionsToBlockIds=@ConcurrentHashMapForJDK8[isEmpty=false;size=1], ] [arthas@264]$ vmtool --action getInstances --className org.apache.uniffle.server.ShuffleTaskInfo --express 'instances[1].shuffleBlockIdManager' @PartitionedShuffleBlockIdManager[ LOG=@Log4jLogger[org.apache.logging.slf4j.Log4jLogger@5fc14628], partitionsToBlockIds=@ConcurrentHashMap[isEmpty=false;size=1], ]


Log example

[2024-10-30 23:01:37.585] [Grpc-709] [INFO] ShuffleTaskInfo - application_1729845342052_432387_1730300440568 use app configured ShuffleBlockIdManager to org.apache.uniffle.server.block.DefaultShuffleBlockIdManager@3c6b75a8

github-actions[bot] commented 3 weeks ago

Test Results

 2 956 files  ±0   2 956 suites  ±0   6h 11m 47s ⏱️ + 1m 25s  1 092 tests ±0   1 090 ✅ ±0   2 💤 ±0  0 ❌ ±0  13 685 runs  ±0  13 655 ✅ ±0  30 💤 ±0  0 ❌ ±0 

Results for commit 51e09249. ± Comparison against base commit 4eebbd46.

:recycle: This comment has been updated with latest results.

jerqi commented 3 weeks ago

Should it become an applicationl evel config option?

maobaolong commented 3 weeks ago

Should it become an application level config option?

@jerqi Sorry for the late reply.

It could be better if it become an app level request option. But It could be a little complex as follow

Could you explain the benefit?

jerqi commented 3 weeks ago

For some little shuflfe, PartionedShuffleBlock may have worse effect. Like mixed load, there are many little shuffle.

jerqi commented 3 weeks ago

Should it become an application level config option?

@jerqi Sorry for the late reply.

It could be better if it become an app level request option. But It could be a little complex as follow

  • Config this option in client conf. (Just list to here, this is easy to do)
  • Make ShuffleBlockIdManager as a member of ShuffleTaskInfo which created follow the config from registerShuffleRequest

Could you explain the benefit?

We can have a ShuffleBlockIdManager, but ShuffleTaskInfo has its BlockId strategy.

maobaolong commented 3 weeks ago

@jerqi Thanks for your suggestion, how about this commit? https://github.com/apache/incubator-uniffle/pull/2227/commits/74373805c2da970e2bedc7da0d15be72718b9c96

maobaolong commented 3 weeks ago

@jerqi Update the name of config key, PTAL

EnricoMi commented 2 weeks ago

In code, there is the term "manager", while in (user-facing) config there is the term "strategy". I think there should be one term. I'd prefer "manager".

maobaolong commented 2 weeks ago

@EnricoMi Thanks for your suggestion, in fact this is my original term I used for this PR, I'm open to change to strategy or revert to manager.

jerqi commented 2 weeks ago

In code, there is the term "manager", while in (user-facing) config there is the term "strategy". I think there should be one term. I'd prefer "manager".

It's better to have one term. My concern about manager is that every TaskInfo will have a manager. In my view, we would better not have the same managers in a server. WDYT?

EnricoMi commented 2 weeks ago

I don't see a problem having different types of manager in the server. The name ShuffleTaskManager clearly says this manager is about shuffle tasks, whereas the ShuffleBlockIdManager manages shuffle block ids.

I'd prefer "Manager" over "Strategy".

jerqi commented 2 weeks ago

I don't see a problem having different types of manager in the server. The name ShuffleTaskManager clearly says this manager is about shuffle tasks, whereas the ShuffleBlockIdManager manages shuffle block ids.

I'd prefer "Manager" over "Strategy".

cc @maobaolong I am ok if both of you prefer manager.

maobaolong commented 2 weeks ago

@jerqi @EnricoMi Thanks for your discussion and suggestion, I updated the PR, PTAL.

maobaolong commented 6 days ago

@jerqi Thanks for your comment, updated the description and docs, PTAL.

maobaolong commented 4 days ago

@jerqi Thanks for your review. merge it.