oracle / oci-hdfs-connector

HDFS Connector for Oracle Cloud Infrastructure
https://cloud.oracle.com/cloud-infrastructure
Other

Full über jar contains unshaded problematic third party libraries #52

Closed dmeibusch closed 2 years ago

dmeibusch commented 3 years ago

In my particular case, my Spark job uses ObjectWeb ASM and we were getting surprising results. We eventually tracked the problem down to the Spark classpath containing conflicting versions of the ASM library, coming from the oci-hdfs-connector. The connector bundles a number of similar third-party library dependencies that are bound to conflict with Spark applications.

dmeibusch commented 3 years ago

I've tweaked the maven-shade-plugin configuration and am testing the results. If all is good, I'll submit a PR.

dmeibusch commented 3 years ago

On my first attempt I shaded all the third-party jars (there are quite a few), but this broke something. To be conservative, I've shaded only ASM. Other strong candidates would be javassist and, definitely, aopalliance; leaving these unshaded is asking for trouble. I also noticed SLF4J API classes in there. I'd have thought these would be better left out altogether as a dependency. Also netty and jetty: can these be pruned at all? There are so many classes in there; do we need them all? And there is a full log4j implementation that should probably also be left out, so it can be replaced with an slf4j or log4j2 facade for those who want federated logging to work.
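
For readers unfamiliar with shading, a conservative ASM-only relocation might look roughly like the fragment below. This is a minimal sketch, not the actual change from this issue: the `shaded.oci` prefix is a hypothetical name chosen for illustration, and the `org.slf4j:slf4j-api` exclusion assumes those are the coordinates under which the SLF4J API is pulled into the jar.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Rewrite ASM packages (and references to them) into a private
               namespace. The "shaded.oci" prefix is hypothetical. -->
          <relocation>
            <pattern>org.objectweb.asm</pattern>
            <shadedPattern>shaded.oci.org.objectweb.asm</shadedPattern>
          </relocation>
        </relocations>
        <artifactSet>
          <excludes>
            <!-- Assumed coordinates: leave the SLF4J API out of the uber jar
                 so the application supplies its own. -->
            <exclude>org.slf4j:slf4j-api</exclude>
          </excludes>
        </artifactSet>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a relocation like this, the connector's bundled ASM classes no longer share package names with the version a Spark application brings, so the two copies can coexist on one classpath. Libraries that are loaded reflectively by name may not survive relocation unchanged, which is one plausible reason shading everything at once can break things.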

mricken commented 2 years ago

@dmeibusch Hi, sorry for the long delay. I'll test your change and plan to merge it for the next release.