apache / parquet-java

Apache Parquet Java
https://parquet.apache.org/
Apache License 2.0
2.65k stars 1.41k forks source link

GH-3007: Ensure version specific Jackson classes are shaded #3008

Closed ted-jenks closed 2 months ago

ted-jenks commented 2 months ago

Rationale for this change

The current jar contains version specific classes under META-INF/versions/... with unshaded package names. This leads to clashes with jackson and thus classpath duplicates. This is a major problem in many big projects right now (e.g. Spark).

What changes are included in this PR?

Ensure these classes are shaded.

Are these changes tested?

Are there any user-facing changes?

Closes #3007

ted-jenks commented 2 months ago

cc @wgtmac hey this is a pretty big issue for us. It is preventing us from taking other fixes. There might be a smarter way to do this but just wanted to kick this off - thanks 😄

wgtmac commented 2 months ago

@gszadovszky @Fokko Could you please help review this? Thanks!

gszadovszky commented 2 months ago

@ted-jenks, have you tried this one? Otherwise, I am fine with the current solution. My only fear is we'll forget to update with the new java versions. It would be nice if we could handle it with some automatism.

ted-jenks commented 2 months ago

@gszadovszky I agree having the automation would be much better, but I cannot get that to work. When using <Multi-Release> the jar still have the unshaded classes. I think that the use of the relocations doesn't agree with it somehow? Not sure since it isn't really documented. The approach in this PR does work though.

ted-jenks commented 2 months ago

Would it be possible to get a minor release of this? It is kinda dangerous to use the current release with anything that has a different jackson version.

ted-jenks commented 2 months ago

@gszadovszky @Fokko @wgtmac any updates on this? sorry for pushing it is just blocking us taking some fixes right now.

ted-jenks commented 2 months ago

@gszadovszky @Fokko @wgtmac 🙏🏻 would be super helpful to get this through

gszadovszky commented 2 months ago

Sorry, @ted-jenks, I was on vacation. I've quickly checked the content of the jenkins jars on my local maven repo and found the following versions in META-INF/versions: 9, 11, 17, 19, and 21. Maybe, we should simply list every available java release here to be sure? Unfortunately, I don't have time to manage a release currently. Maybe bring this up on the dev list after this PR went in.

z-anderson commented 2 months ago

@gszadovszky Thank you, I addressed your comment. (I work with Ted). https://github.com/apache/parquet-java/pull/3017/ . This is blocking us

wgtmac commented 2 months ago

For the next release, I think we can discuss on the dev@parquet ML. It may be the time to directly release 1.15.0 instead of 1.14.x

gszadovszky commented 2 months ago

@z-anderson, shall we close this PR in favor of #3017, then?

z-anderson commented 2 months ago

@z-anderson, shall we close this PR in favor of #3017, then? @gszadovszky I think yes

gszadovszky commented 2 months ago

Closing this one in favor of #3017