k8ssandra / management-api-for-apache-cassandra

RESTful / Secure Management Sidecar for Apache Cassandra
Apache License 2.0

K8SSAND-1210 ⁃ Eliminate duplicate classes during JAR shading process #176

Open emerkle826 opened 2 years ago

emerkle826 commented 2 years ago

When building the project jarfiles, each of the project modules has a JAR shading step that bundles all of that module's dependencies into an "uber" jar. However, some of the modules have duplicate transitive dependencies. In some cases, the transitive dependencies include multiple versions of the same library.

While this hasn't caused any known issues so far, it really should be cleaned up so that there is no ambiguity in which versions of which libraries are bundled into the jarfiles.

Also, the common project modules (i.e. management-api-common, management-api-agent-common) shade their artifacts as well, but likely don't need to do so. Since they share some dependencies with the other modules that declare these common modules as dependencies, the shading process finds duplicates there too.
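To make the ambiguity concrete, here is a minimal sketch of how duplicate class entries across a set of jars could be listed. It is not part of the project build: the class name `DuplicateClassFinder` and its `findDuplicates` method are hypothetical, and the sketch only compares entry names, not bytecode, so it cannot tell whether two copies of a class actually differ.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, for illustration only.
class DuplicateClassFinder {

    // Maps each .class entry name to the jars containing it,
    // keeping only entries that appear in two or more jars.
    static Map<String, List<String>> findDuplicates(List<Path> jars) throws IOException {
        Map<String, List<String>> owners = new TreeMap<>();
        for (Path jar : jars) {
            String jarName = jar.getFileName().toString();
            try (JarFile jarFile = new JarFile(jar.toFile())) {
                Enumeration<JarEntry> entries = jarFile.entries();
                while (entries.hasMoreElements()) {
                    String entryName = entries.nextElement().getName();
                    if (entryName.endsWith(".class")) {
                        owners.computeIfAbsent(entryName, k -> new ArrayList<>()).add(jarName);
                    }
                }
            }
        }
        // Drop classes that live in exactly one jar.
        owners.values().removeIf(jarNames -> jarNames.size() < 2);
        return owners;
    }

    public static void main(String[] args) throws IOException {
        List<Path> jars = new ArrayList<>();
        for (String arg : args) {
            jars.add(Paths.get(arg));
        }
        findDuplicates(jars).forEach(
                (className, jarNames) -> System.out.println(className + " -> " + jarNames));
    }
}
```

Run against the dependency jars that feed a module's shading step, any line of output is a class that more than one jar would contribute to the uber-jar. Entries such as `module-info.class` will also show up and can usually be ignored.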


bradfordcp commented 2 years ago

Hey team! Please add your planning poker estimate with ZenHub @emerkle826 @jeffbanks

emerkle826 commented 2 years ago

Please add your planning poker estimate with ZenHub @adutra

emerkle826 commented 2 years ago

I would also like @adutra to chime in here, as he went through a similar process with the Java driver a while back. Eliminating the duplicate jars will take some time, but the bigger issue is verifying that the cleanup doesn't break something. My initial estimate is an 8. I don't know if the benefit will justify the level of effort.

adutra commented 2 years ago

For some reason I cannot cast my vote. I'd say 5 story points. And yes, eliminating duplicate classes is a long process and sometimes not worth the effort as long as the resulting jar works.

Taking a step back: I'd advocate for not using shaded uber-jars at all. Uber-jars are inherently error-prone. What is the reason why we are doing this in the management-api?

emerkle826 commented 2 years ago

> What is the reason why we are doing this in the management-api?

I'm pretty sure the server part of Management API uses an uber-jar because it runs as a stand-alone process and needs everything it depends on (RESTEasy, Netty, etc.). The agent jarfiles, on the other hand, should be able to declare most of their dependencies as "provided", since the agents are loaded onto the Cassandra/DSE classpath. Still, there are likely a few things the agent depends on that are not already in the Cassandra classpath, so those dependencies would need to be included.

Perhaps the published agent jarfiles could be "slimmer" uber-jars. As for the Docker images we build, where the agent jarfiles are already added to the Cassandra classpath, we could maybe ship non-uber-jars and add the dependencies individually.

jeffbanks commented 2 years ago

I too am unable to vote using the plugin on this one. I'm not familiar with everything this change touches, which is why my estimate is on the higher side.

Estimation: +8