aboutcode-org / purldb

Tools to create and expose a database of purls (Package URLs). This project is sponsored by NLnet project https://nlnet.nl/project/vulnerabilitydatabase/ and nexB for https://www.aboutcode.org/ Chat is at https://gitter.im/aboutcode-org/discuss
https://purldb.readthedocs.io/

Support detecting/matching uberjars #69

Open pombredanne opened 1 year ago

pombredanne commented 1 year ago

An uberjar is a JAR that combines the contents of many JARs in repackaged form. In contrast with a fatjar, it does not contain nested JARs (JAR-in-JAR). The maven-shade-plugin (https://maven.apache.org/plugins/maven-shade-plugin/) is one of the tools that creates these.

The analysis of such a JAR is challenging because the contents of many JARs are mixed in a single JAR.
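
One rough way to tell these cases apart is to look at the JAR's entry names. The sketch below is only an illustration and not how purldb handles it. It assumes the bundled artifacts kept their `META-INF/maven/<groupId>/<artifactId>/pom.properties` files (Maven normally writes these, but repackaging can strip them), and the function names `summarize_jar` and `looks_like_uberjar` are made up for this example.

```python
import io
import zipfile


def summarize_jar(jar_bytes):
    """Return (nested_jars, maven_coordinates) found in a JAR given as bytes."""
    nested_jars = []
    coordinates = set()
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for name in jar.namelist():
            # A nested .jar entry points to a fatjar (JAR-in-JAR) layout.
            if name.endswith(".jar"):
                nested_jars.append(name)
            # Assumed layout: META-INF/maven/<groupId>/<artifactId>/pom.properties
            parts = name.split("/")
            if (
                len(parts) == 5
                and parts[0] == "META-INF"
                and parts[1] == "maven"
                and parts[4] == "pom.properties"
            ):
                coordinates.add((parts[2], parts[3]))
    return sorted(nested_jars), sorted(coordinates)


def looks_like_uberjar(jar_bytes):
    """Heuristic only: no nested JARs, but more than one Maven coordinate."""
    nested_jars, coordinates = summarize_jar(jar_bytes)
    return not nested_jars and len(coordinates) > 1
```

A JAR with stripped metadata would slip through this check, which is why content-based matching is still needed.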

pombredanne commented 1 year ago

@JonoYang ping

JonoYang commented 1 year ago

@pombredanne This is the current way uberjars are handled:

The rest of the matching process is handled by directory matching.

pombredanne commented 1 year ago

A good example of an uberjar is the closure-compiler. See also google/closure-compiler#4104

There seem to be several embeds and Jarjars in this uberjar that are weakly documented (e.g., not documented at all).

As such, neither the binary, the source, nor the git repo contains comprehensive documentation of the various bundled packages.

pombredanne commented 1 year ago

Another example is jline 2.12, which shades jansi. The POM at https://repo1.maven.org/maven2/jline/jline/2.12/jline-2.12.pom shades org.fusesource/jansi 1.11 (https://repo1.maven.org/maven2/org/fusesource/jansi/jansi/1.11/jansi-1.11.pom).

And jansi in turn shades groupId=org.fusesource.hawtjni artifactId=hawtjni-runtime version=1.8
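
Following such a chain by hand means reading one POM after another. As a minimal sketch, assuming a POM that uses the standard `http://maven.apache.org/POM/4.0.0` namespace, the declared dependencies could be pulled out like this. The function name `declared_dependencies` is hypothetical, and a declared dependency is only a candidate: whether it is actually shaded depends on the shade plugin configuration, which this sketch does not inspect.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard Maven POMs; POMs without it will not match.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def declared_dependencies(pom_xml):
    """Return (groupId, artifactId, version) tuples declared directly in a POM."""
    root = ET.fromstring(pom_xml)
    deps = []
    # Only <project>/<dependencies>; <dependencyManagement> is not considered.
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        deps.append(
            (
                dep.findtext("m:groupId", default="", namespaces=POM_NS),
                dep.findtext("m:artifactId", default="", namespaces=POM_NS),
                # The version may be missing or an unresolved ${property}.
                dep.findtext("m:version", default="", namespaces=POM_NS),
            )
        )
    return deps
```

Repeating this for each candidate's own POM would give the list of packages to check against the uberjar contents.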

... so it is uberjars all the way down, as we have:

And jansi-native 1.5 is made of many JARs for each OS: https://repo1.maven.org/maven2/org/fusesource/jansi/jansi-native/1.5/ We would need to index them all.
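
To index every JAR published under one version directory, the JAR file names first need to be enumerated. A possible sketch, assuming the directory listing is plain HTML with one `<a href>` per file and that the page has already been fetched into a string (the names `JarLinkParser` and `list_jars` are hypothetical):

```python
from html.parser import HTMLParser


class JarLinkParser(HTMLParser):
    """Collect href values ending in .jar from an HTML directory listing."""

    def __init__(self):
        super().__init__()
        self.jars = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.endswith(".jar"):
            self.jars.append(href)


def list_jars(index_html):
    """Return the .jar links found in the listing, in page order."""
    parser = JarLinkParser()
    parser.feed(index_html)
    return parser.jars
```

Each returned name could then be resolved against the directory URL and queued for indexing.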