apache / incubator-uniffle

Uniffle is a high performance, general purpose Remote Shuffle Service.
https://uniffle.apache.org/
Apache License 2.0

Support lower Hadoop versions in client-mr #160

Open zuston opened 2 years ago

zuston commented 2 years ago

Currently, Uniffle uses Hadoop 2.8.5 as its default Hadoop version.

When building with ./build_distribution.sh --spark2-profile 'spark2' --spark2-mvn '-Dspark.version=2.4.3' --spark3-profile 'spark3' --spark3-mvn '-Dspark.version=3.1.1' -Dhadoop.version=2.6.0, it throws exceptions because some methods and fields are not available in Hadoop 2.6.0.

The incompatible fields and methods are as follows:

  1. CallerContext, introduced in 2.8.0.
  2. MRJobConfig.DEFAULT_SHUFFLE_MERGE_PERCENT, introduced in 2.8.0 (ticket link).
  3. MRApps.getSystemPropertiesToLog, introduced in 2.8.0 (ticket link).

I think we could use reflection to stay compatible with lower Hadoop versions.
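A minimal sketch of what the reflection approach could look like, not the actual client-mr change. The `HadoopCompat` class and its method names are hypothetical. The idea is to look up version-dependent classes, fields, and methods by name at runtime and fall back when they are absent, so the code still compiles and loads against Hadoop 2.6.0.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/**
 * Hypothetical helper (not part of Uniffle or Hadoop) that probes for
 * classes, fields, and methods by name instead of referencing them directly.
 *
 * Possible usage, assuming the Hadoop class names below and a caller-chosen
 * fallback value:
 *   Object percent = HadoopCompat.staticFieldOrDefault(
 *       "org.apache.hadoop.mapreduce.MRJobConfig",
 *       "DEFAULT_SHUFFLE_MERGE_PERCENT", fallbackPercent);
 *   boolean canLogProps = HadoopCompat.hasMethod(
 *       "org.apache.hadoop.mapreduce.v2.util.MRApps",
 *       "getSystemPropertiesToLog");
 */
final class HadoopCompat {
    private HadoopCompat() {}

    /** Returns true if the named class can be loaded (without initializing it). */
    static boolean hasClass(String className) {
        try {
            Class.forName(className, false, HadoopCompat.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns true if the named class exposes a public method with this name. */
    static boolean hasMethod(String className, String methodName) {
        try {
            for (Method m : Class.forName(className).getMethods()) {
                if (m.getName().equals(methodName)) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /**
     * Reads a public static field by name. Returns the fallback when the
     * class or field is missing, or when the field is not static.
     */
    static Object staticFieldOrDefault(String className, String fieldName, Object fallback) {
        try {
            Field field = Class.forName(className).getField(fieldName);
            if (!Modifier.isStatic(field.getModifiers())) {
                return fallback;
            }
            return field.get(null);
        } catch (ReflectiveOperationException | LinkageError e) {
            return fallback;
        }
    }
}
```

The trade-off is that reflective lookups lose compile-time checking, so each probed name needs a sensible fallback path for older Hadoop versions.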

zuston commented 2 years ago

cc @jerqi @frankliee

zuston commented 2 years ago

I also found that client-mr is not compatible with Hadoop 3.2.2.

jerqi commented 2 years ago

What's your company's Hadoop version?

zuston commented 2 years ago

  1. 2.6.0-cdh5.11.0
  2. Hadoop 3.2.2, packaged by Bigtop

jerqi commented 2 years ago

I think it's ok for me if we need it in our production environment.

zuston commented 2 years ago

OK. I will go ahead.