bazel-contrib / rules_jvm_external

Bazel rules to resolve, fetch and export Maven artifacts
Apache License 2.0

Feature request: maven dependency tree #499

Open ropnop opened 3 years ago

ropnop commented 3 years ago

Hello! Apologies if this isn't the best forum for feature requests, but I wanted thoughts/opinions on adding an analog of mvn dependency:tree as a rule.

I believe this should be possible via the maven_coordinates tag that gets added to each dependency, and I think I see a path forward from looking at the code in pom_file.bzl.

We are scanning Maven dependencies for vulnerabilities with Snyk, but sometimes it's difficult to determine which Maven artifact is pulling in an out-of-date library transitively. Currently the easiest approach is to run the java_export rule to create a POM file, then run mvn dependency:tree on that POM file to see the Maven artifact versions and dependencies.

Would it be possible to add a pure Starlark rule that creates an output similar to mvn dependency:tree, including the Maven coordinates and versions? I'm happy to take a stab at something too if you feel it is worthwhile. Thanks!

divanorama commented 3 years ago

Depending on the exact requirement, you could use bazel query directly once you have identified an unwanted dependency, like so:

$ bazel query 'somepath(//..., @maven//:org_hamcrest_hamcrest_core)'
//:android_test_deps
@maven//:junit_junit
@maven//:org_hamcrest_hamcrest_core

And to get the Bazel target name from Maven coordinates:

$ bazel query 'attr(tags, "maven_coordinates=org.hamcrest:hamcrest-core:1.3", kind(jvm_import, @maven//...))'
@maven//:org_hamcrest_hamcrest_core

This is for the example project from README.md.
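The label in the query result above follows a visible pattern: group and artifact joined with underscores, version dropped. A minimal Python sketch of that mapping, assuming only the simple group:artifact[:version] form (coordinates with a packaging or classifier are not handled here, and the function name is made up for illustration):

```python
import re


def maven_label(coordinates: str, repo: str = "maven") -> str:
    """Guess the Bazel label for 'group:artifact[:version]' coordinates."""
    parts = coordinates.split(":")
    if len(parts) < 2:
        raise ValueError(f"expected group:artifact[:version], got {coordinates!r}")
    group, artifact = parts[0], parts[1]
    # Assumption: '.', '-' and ':' all become '_', and the version is dropped.
    name = re.sub(r"[.\-:]", "_", f"{group}:{artifact}")
    return f"@{repo}//:{name}"
```

This is only a shortcut for guessing the label; the attr(tags, ...) query remains the reliable lookup, since it reads what the repository rule actually generated.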

To produce a full graph, aspects may be a good approach.

By the way, how do you currently invoke Snyk on a Bazel project?

ropnop commented 3 years ago

Thank you for the ideas! I feel like bazel query is so close to what I need, but I just can't seem to tweak it correctly. This will construct a graph of JVM imports that have Maven coordinates, but it won't actually display the coordinates (i.e. versions):

bazel query 'attr(tags, "maven_coordinates=", kind(jvm_import, deps(//exampleproject:myjavalib)))' --notool_deps --output graph

The Snyk integration is a bit hacky and manual, but we are using the generic dep graph API. I use an Aspect to recursively bubble up every maven coordinate tag for a target, then a Bazel rule to construct an API call from all the coordinates and send it to Snyk. This works well enough, but the problem is the graph is completely "flat" - so if Snyk comes back with a vuln for a particular version of a maven dependency it takes quite a bit of detective work to figure out where that particular maven coordinate was introduced so we can upgrade.

Do you think the MavenInfo provider could somehow be tweaked to keep track of maven coordinates in a hierarchy?
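To illustrate why a hierarchy would help: if the edges between coordinates were kept instead of a flat set, the "detective work" becomes a plain path search. A rough Python sketch, assuming the graph has already been exported somehow as a dict of node to direct dependencies (that export format and the function name are hypothetical):

```python
def paths_to(graph, root, target):
    """Return every path from root to target in {node: [direct deps]}."""
    found = []

    def walk(node, trail):
        trail = trail + [node]
        if node == target:
            found.append(trail)
            return
        for dep in graph.get(node, []):
            if dep not in trail:  # guard against accidental cycles
                walk(dep, trail)

    walk(root, [])
    return found


# Illustrative data only; a real graph would come from the aspect.
example = {
    "//:android_test_deps": ["junit:junit:4.12"],
    "junit:junit:4.12": ["org.hamcrest:hamcrest-core:1.3"],
}
for path in paths_to(example, "//:android_test_deps", "org.hamcrest:hamcrest-core:1.3"):
    print(" -> ".join(path))
```

Each returned path names the direct dependency that pulls in the flagged coordinate, which is the thing to upgrade or exclude.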

divanorama commented 3 years ago

For MavenInfo I see "maven_deps" and "coordinates", so they could be useful.

Not sure if bazel query --output graph is configurable, but you could try post-processing it, for example by replacing labels like @maven//:org_hamcrest_hamcrest_core with @maven//:org_hamcrest_hamcrest_core_1_2_3 (the replacement list can be constructed from bazel query --output xml or --output build).
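A minimal Python sketch of that post-processing idea, assuming the graph output is DOT text with double-quoted node names and that a label-to-coordinates mapping has already been built separately (the function name and the mapping are hypothetical):

```python
import re


def relabel(dot_text, coordinates):
    """Swap quoted labels in DOT text for their Maven coordinates, if known."""

    def swap(match):
        label = match.group(1)
        # Labels without an entry in the mapping are left unchanged.
        return '"%s"' % coordinates.get(label, label)

    return re.sub(r'"([^"\n]+)"', swap, dot_text)


dot = '"@maven//:junit_junit" -> "@maven//:org_hamcrest_hamcrest_core"\n'
mapping = {"@maven//:org_hamcrest_hamcrest_core": "org.hamcrest:hamcrest-core:1.3"}
print(relabel(dot, mapping), end="")
```

Nodes that the graph output merges into a single multi-label box would not match the mapping and stay as they are; running the query with --nograph:factored should keep one label per node.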

Or, since you already have an aspect, you could try adding a new provider that carries dependency graph information, similar to the file count aspect example. There is some plumbing to do, though. Some options for what to put into the provider data

which is populated like this:

Or a quick and dirty way could be to use print, as in the simple aspect example: propagate the aspect through deps and runtime_deps, dump the dependency graph with Maven coordinates attached to the nodes that have [MavenInfo].coordinates, then collect and filter the output.
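The collect-and-filter step could look roughly like the Python sketch below. It assumes the aspect prints one line per edge in a made-up format, "MAVEN_EDGE parent child", and that Bazel prefixes each printed line with its usual DEBUG location; the marker and both function names are hypothetical.

```python
from collections import defaultdict

MARKER = "MAVEN_EDGE"  # hypothetical tag emitted by the aspect's print()


def collect_edges(log_text):
    """Build {parent: [children]} from build output, ignoring unrelated lines."""
    graph = defaultdict(list)
    for line in log_text.splitlines():
        _, found, rest = line.partition(MARKER)
        if not found:
            continue
        parts = rest.split()
        # Keep well-formed edges only, and drop repeats of the same edge.
        if len(parts) == 2 and parts[1] not in graph[parts[0]]:
            graph[parts[0]].append(parts[1])
    return dict(graph)


def render_tree(graph, root):
    """Render the graph as text in the style of mvn dependency:tree."""
    lines = [root]

    def walk(node, prefix, ancestors):
        deps = graph.get(node, [])
        for i, dep in enumerate(deps):
            last = i == len(deps) - 1
            lines.append(prefix + ("\\- " if last else "+- ") + dep)
            if dep not in ancestors:  # do not recurse into a cycle
                walk(dep, prefix + ("   " if last else "|  "), ancestors | {dep})

    walk(root, "", {root})
    return "\n".join(lines)
```

Usage would be something like capturing the build's stderr to a file, passing its contents to collect_edges, and printing render_tree for the target of interest.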