Closed DenWav closed 9 months ago
A followup, by shading dependencies other things are broken which aren't listed in my original issue description. Another thing that's broken is logging - Apache dependencies such as HttpClient log to Apache commons-logging, but since this is a custom shaded version of the jar and not a real dependency, this can't easily be replaced or redirected to be sent to SLF4J or Log4j.
If the this artifact was published as a standard library there would be no issues.
FYI @sfc-gh-anugupta
I also wanted to point out that the shaded JAR size causes an issue for organizations targeting AWS Lambda or other serverless runtimes. Our org deploys to AWS Lambda and artifacts are not allowed to exceed 250MB when unpacked in that environment. I did the fix suggested here and it did lower the size constraints enough to deploy to AWS Lambda, but I now need to test and make sure that all the dependencies required for it are available.
+1 for this.
Having a lighter JAR is also a constraint for us. The reason is we package our Java lib in a Python lib and https://pypi.org/ have restrictions on the size of the package. Just because of Snowflake JDBC driver and all the duplicates it creates we cannot publish our lib anymore.
Having old dependencies shaded in Snowflake JAR is also a security breach for us. We have tools scanning for known security issues in our dependencies and let us know we have to upgrade this dependency. But if the dependency is shaded by Snowflake it cannot be scanned, so we might miss some security issues and we have no way to manually fix them.
Last week there was the first release of snowflake jdbc thin jar in version 3.14.5 (It's marked as experimental in release notes since first we are going to use it and test internally) It still shades some libraries (e.g. it has custom builds of arrow, tika), but most of the other libraries are dependencies in POM
Awesome thank you :) We will try it
THANK YOU. I will be testing this too!
The current jar shades all dependencies, which is very bad practice for libraries. This means users and consumers have no control over the libraries this Snowflake JDBC driver jar uses, are required to have multiple copies of them in cases where they are already using some of those libraries, and it makes the jar excessively large.
This was brought up before but it's unclear to me why that was marked as "done" when it wasn't at all: #174
This has more significant implications as well, because in situations like #1512 users are forced to wait for Snowflake to update the driver, when if this was deployed to Maven as a proper library with defined dependencies, users could address the issue themselves until Snowflake is able to push a real fix.
Library conflicts are not a valid reason for shading a library like this, modern dependency management tools such as Maven, Gradle, etc. have built-in functionality specifically to allow the developers who are using your library to make sure they have no conflicts. In fact, by shading your libraries in this fashion you increase conflicts for the reasons mentioned above.
If it's really a desire to deploy a shaded artifact (it should not be, to be clear), then it's still possible, you can simply deploy it or the proper non-shaded artifact under a separate classifier.