apache / orc

Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
https://orc.apache.org/
Apache License 2.0
688 stars 483 forks

ORC-1608: Upgrade Hadoop to 3.4.0 #1783

Closed: dongjoon-hyun closed this 7 months ago

dongjoon-hyun commented 9 months ago

What changes were proposed in this pull request?

This PR aims to upgrade the Apache Hadoop dependency to 3.4.0.

Why are the changes needed?

To use the latest Hadoop features.

How was this patch tested?

Pass the CIs.

Was this patch authored or co-authored using generative AI tooling?

No.

dongjoon-hyun commented 9 months ago

It seems okay except for Javadoc generation.

dongjoon-hyun commented 8 months ago

This is WIP because this PR is testing Apache Hadoop 3.4.0 RC2.

https://lists.apache.org/thread/cp2fc9w334rkd7ko9dwsodnhfg1sqxhs

dongjoon-hyun commented 8 months ago

It seems to be deleted already for RC3.

Screenshot 2024-03-04 at 13 36 06

dongjoon-hyun commented 7 months ago

This is WIP because this PR is testing Apache Hadoop 3.4.0 RC3.

dongjoon-hyun commented 7 months ago

For the record, I cast +1 for Hadoop 3.4.0 RC3.

dongjoon-hyun commented 7 months ago

Vote passed.

dongjoon-hyun commented 7 months ago

Thank you, @cxzl25. This PR is still waiting for the Apache Hadoop 3.4.0 API docs, which are not published yet.

dongjoon-hyun commented 7 months ago

Javadoc generation passed.

Screenshot 2024-03-19 at 09 45 33

Merged to main for Apache ORC 2.1.0.

dongjoon-hyun commented 7 months ago

For the record, I'm also working in the Apache Spark community with the following.

If needed, we can backport this to branch-2.0 and make an Apache ORC 2.0.1 release.

dongjoon-hyun commented 7 months ago

cc @wgtmac and @williamhyun, too.