Support shuffle data replica?

Tencent / Firestorm

Firestorm is a Remote Shuffle Service that lets Apache Spark and Apache Hadoop MapReduce applications store shuffle data on remote servers.

Support shuffle data replica? #181

Closed zuston closed 2 years ago

zuston commented 2 years ago

Thanks for your great work on it.

I have a question: I don't see any information about data replica support in the README or other docs, but I do see the config in the codebase.

So is data replica a stable feature in Firestorm?

zuston commented 2 years ago

And what's the difference between the data-replica and data-replica-writer configs?

frankliee commented 2 years ago

These configs come from the quorum protocol. rss.data.replica is the default number of replicas per partition. rss.data.replica.write is the minimum number of replicas to which the writer must successfully write both metadata and data. rss.data.replica.read is the minimum number of replicas from which the reader must successfully read metadata (the data itself can be read from a single replica). The recommended values are (1,1,1) and (3,2,2). This feature is fully developed, but the production environment only uses (1,1,1) to reduce RPC and memory cost.
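To make the three settings concrete, here is a minimal sketch of how they relate. The `ReplicaConfig` class and its method names are hypothetical and not part of Firestorm. The `read + write > replica` check is the usual quorum overlap rule, which I am assuming applies here since the thread only says the configs come from a quorum protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaConfig:
    """Hypothetical holder for the three settings named in the thread."""

    replica: int  # rss.data.replica: replicas per partition
    write: int    # rss.data.replica.write: minimum successful writes
    read: int     # rss.data.replica.read: minimum successful metadata reads

    def is_valid(self) -> bool:
        # Each minimum must lie between 1 and the replica count.
        in_range = (1 <= self.write <= self.replica
                    and 1 <= self.read <= self.replica)
        # Assumed quorum overlap rule: every read set must intersect
        # every write set, which requires read + write > replica.
        return in_range and self.read + self.write > self.replica

    def write_succeeded(self, acks: int) -> bool:
        # A write counts as successful once enough replicas acknowledged it.
        return acks >= self.write

    def read_succeeded(self, acks: int) -> bool:
        # Metadata must be read from at least `read` replicas.
        return acks >= self.read
```

Both recommended settings, (1,1,1) and (3,2,2), pass this check, while something like (3,1,1) would not, because a reader could then contact only a replica that the writer never reached.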

zuston commented 2 years ago

Thanks for your explanation @frankliee.

So data replication is handled by the client pushing the data, rather than by the shuffle server?

frankliee commented 2 years ago

Yes, it's a client-side feature.
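As a rough illustration of what client-side replication means, the sketch below has the client send the same block to every replica server itself and count acknowledgements. This is not Firestorm's actual client code: `push_block`, the callable-per-server model, and the use of `ConnectionError` for a failed send are all assumptions made for the example.

```python
from typing import Callable, Sequence


def push_block(
    block: bytes,
    servers: Sequence[Callable[[bytes], bool]],
    min_write: int,
) -> bool:
    """Send `block` to every replica server and report quorum success.

    Each entry in `servers` is a hypothetical send function that returns
    True on acknowledgement and may raise ConnectionError on failure.
    """
    acks = 0
    for send in servers:
        try:
            if send(block):
                acks += 1
        except ConnectionError:
            # An unreachable replica simply contributes no acknowledgement.
            pass
    # The write succeeds only if at least `min_write` replicas acknowledged.
    return acks >= min_write
```

With three servers and `min_write=2`, one failed server is tolerated. The client also sends every block three times, which is consistent with the RPC cost mentioned above as the reason for running (1,1,1) in production.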

zuston commented 2 years ago

Thanks again, @frankliee.

Closing this issue.