apache / parquet-java

Apache Parquet Java
https://parquet.apache.org/
Apache License 2.0

GH-2943: Remove hadoop-2 support #3061

Open steveloughran opened 1 week ago

steveloughran commented 1 week ago

Rationale for this change

As discussed on the mailing list, Hadoop 3.3.0 is the minimum supported Hadoop version for 1.15.

What changes are included in this PR?

pom changes

Are these changes tested?

locally, yes.

Are there any user-facing changes?

The new minimum Hadoop version needs to go into the release notes.

Closes #2943

Fokko commented 6 days ago

@steveloughran I think we want to include the Hadoop 2 GitHub Actions as well

steveloughran commented 6 days ago

> I think we want to include the Hadoop 2 Github Actions as well

I wanted to see what happened there. Let me work on that.

steveloughran commented 6 days ago

Looking at this, I note that thrift is downloaded from the Apache archives and built every time. Apache infra might be unhappy about this; the archive isn't replicated the way others are and they have "expressed concerns" about it being used elsewhere.

If the CI target OS/CPU is the same across builds, what about just sticking a precompiled thrift binary into dev/thrift? If the CPU varies, then it could be split for x86-64 and arm, with the path set up appropriately.
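
A rough sketch of the per-CPU split, to make the idea concrete. Everything here is an assumption for illustration: the `dev/thrift/<arch>/thrift` layout, the directory names `x86-64` and `arm64`, and the helper `thrift_binary_path` are hypothetical and do not exist in the repository. It only shows how a build step could map the runner's CPU type to the matching precompiled binary.

```python
import platform
from pathlib import Path

# Hypothetical layout: dev/thrift/<arch>/thrift (not an existing repo directory).
THRIFT_ROOT = Path("dev/thrift")

# Different platforms report the same CPU family under different names,
# so fold them into the two assumed directory names.
ARCH_ALIASES = {
    "x86_64": "x86-64",
    "amd64": "x86-64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def thrift_binary_path(machine=None, root=THRIFT_ROOT):
    """Return the assumed path of the precompiled thrift binary for a CPU type.

    `machine` defaults to what platform.machine() reports for this host.
    """
    machine = (machine or platform.machine()).lower()
    try:
        arch_dir = ARCH_ALIASES[machine]
    except KeyError:
        raise ValueError(f"no precompiled thrift binary for CPU type {machine!r}")
    return root / arch_dir / "thrift"
```

A CI step could call `thrift_binary_path()` and put the parent directory on `PATH`. Failing loudly on an unknown CPU type seems safer than silently falling back to a download from the archive.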

steveloughran commented 2 days ago

hadoop 3 builds are all happy; no hadoop 2 build any more

steveloughran commented 2 days ago

no, vector failed. draft again

steveloughran commented 13 hours ago

@Fokko thanks; merge when ready