Apache Uniffle (incubating) is a general-purpose remote shuffle service. It can benefit jobs running on Volcano in the following situations:
In cloud-native environments, Spark, MR, and Tez all lack a standalone shuffle service, which can lead to data loss. Uniffle can improve the stability of Spark, MR, and Tez jobs in such environments.
For large shuffle operations with severe random IO, Uniffle can improve both performance and stability by aggregating the small pieces of shuffle data produced by upstream map tasks, which turns random IO into sequential IO.
For deployments that separate compute and storage, Uniffle can reduce the compute nodes' dependency on local disks.
What would you like to be added:
Add Apache Uniffle (incubating) as part of the ecosystem. Our repo is https://github.com/apache/incubator-uniffle. More details: https://uniffle.apache.org/blog/2023/07/21/Uniffle%20-%20New%20chapter%20for%20the%20shuffle%20in%20the%20cloud%20native%20era
Why is this needed:
Apache Uniffle (incubating) is a general-purpose remote shuffle service. It can benefit jobs running on Volcano in the situations described above.