apache / incubator-uniffle

Uniffle is a high performance, general purpose Remote Shuffle Service.
https://uniffle.apache.org/
Apache License 2.0

[FEATURE] Split one huge event into multi small events to improve HDFS flush performance #2242

Open zuston opened 1 week ago

zuston commented 1 week ago

Describe the feature

In the current codebase, before a partition is marked as a huge partition, its data is kept in memory as long as there is enough capacity. Once it is marked as a huge partition, its data should be flushed to HDFS, if HDFS is specified as the storage.

The first flush of such a huge partition can be large, especially when the buffer capacity is large. It is also slow, because it is a single huge flush event and therefore does not benefit from the concurrent HDFS partition writing mechanism. Until the flush finishes, the data keeps occupying memory, which causes backpressure on the client.

From this point of view, smaller flush events are better for shuffle-server throughput. Local IO, however, prefers large flush buffers, so this is a trade-off.

In any case, splitting the single huge flush event of a huge partition into multiple small events would be useful for improving write performance.
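
To make the idea concrete, here is a minimal sketch of the grouping step only, assuming a flush event can be described by the sizes of its blocks. `FlushEventSplitter`, `split`, and `maxEventBytes` are hypothetical names for illustration, not existing Uniffle classes or config options, and how the resulting events are handed to the HDFS writers is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Uniffle: groups the block sizes of one
// large flush event into several smaller events.
final class FlushEventSplitter {

  // Each returned group holds at most maxEventBytes in total, except that a
  // single block larger than the limit is placed in a group of its own.
  static List<List<Long>> split(List<Long> blockSizes, long maxEventBytes) {
    if (maxEventBytes <= 0) {
      throw new IllegalArgumentException("maxEventBytes must be positive");
    }
    List<List<Long>> events = new ArrayList<>();
    List<Long> current = new ArrayList<>();
    long currentBytes = 0;
    for (long size : blockSizes) {
      // Close the current group if adding this block would exceed the limit.
      if (!current.isEmpty() && currentBytes + size > maxEventBytes) {
        events.add(current);
        current = new ArrayList<>();
        currentBytes = 0;
      }
      current.add(size);
      currentBytes += size;
    }
    if (!current.isEmpty()) {
      events.add(current);
    }
    return events;
  }

  public static void main(String[] args) {
    // Six blocks, limit of 32 bytes per event (illustrative numbers only).
    List<Long> blocks = Arrays.asList(16L, 16L, 16L, 16L, 100L, 8L);
    System.out.println(split(blocks, 32L));
  }
}
```

The limit is where the trade-off mentioned above would be tuned: a smaller value yields more events that can be written concurrently and released from memory sooner, while a larger value keeps each write big enough for efficient IO.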

Motivation

No response

Describe the solution

No response

Additional context

No response
