apache / gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
https://gravitino.apache.org
Apache License 2.0

[#5585] fix(catalog-hadoop): Test and make fileset with cloud storage can work with Spark 3.2.0~3.5.3 #5630

Open yuqi1129 opened 4 days ago

yuqi1129 commented 4 days ago

What changes were proposed in this pull request?

  1. Update the Hadoop version from 3.3.0 to 3.3.1 to avoid bugs present in Hadoop 3.3.0. We use 3.3.1 rather than 3.3.6 because hadoop-aws 3.3.6 is a very recent version and requires the matching Hadoop version, which would make it difficult to use in production.
  2. Replace the hadoop-common and hadoop-client dependencies with hadoop-client-api and hadoop-client-runtime to avoid compatibility issues with third-party dependencies.
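
For illustration, the dependency swap in item 2 might look roughly like the following in a Gradle Kotlin DSL build file. This is a minimal sketch, not the actual diff from this PR: the literal Maven coordinates, the hard-coded `hadoopVersion` value, and the `implementation`/`runtimeOnly` scopes are assumptions, and the real build may declare these through a version catalog instead.

```kotlin
// build.gradle.kts (hypothetical fragment, not taken from this PR)

// Assumed to be pinned in one place; the PR moves this from 3.3.0 to 3.3.1.
val hadoopVersion = "3.3.1"

dependencies {
    // Before (removed): unshaded artifacts that pull in many transitive
    // third-party dependencies.
    // implementation("org.apache.hadoop:hadoop-common:3.3.0")
    // implementation("org.apache.hadoop:hadoop-client:3.3.0")

    // After: the shaded client artifacts, which relocate most third-party
    // dependencies. Scopes here are an assumption.
    implementation("org.apache.hadoop:hadoop-client-api:$hadoopVersion")
    runtimeOnly("org.apache.hadoop:hadoop-client-runtime:$hadoopVersion")
}
```

Because `hadoop-client-runtime` bundles relocated copies of its third-party libraries, it is less likely to clash with the versions that a given Spark release ships.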

Why are the changes needed?

To make fileset usable in production.

Fix: #5585

Does this PR introduce any user-facing change?

N/A

How was this patch tested?

Tested locally and with the existing UTs and ITs.