veracitylab / provenance-injector

inject provenance into JEE applications
Apache License 2.0

Include agent classes in the classpath only once #19

Open wtwhite opened 4 months ago

wtwhite commented 4 months ago

For some app types, like Spring Boot apps, it currently seems necessary to include some or all of the provenance-injector class files on the classpath of the main app itself, e.g., with a Gradle command like:

    implementation 'nz.ac.wgtn.veracity:provenance-injector:1.3-SNAPSHOT'

This is certainly necessary if we want to access classes/interfaces like ProvenanceAgent directly (non-reflectively) from within main app controllers/interceptors.

However, running java -javaagent:/path/to/provenance-agent.jar -jar mainapp.jar then causes these classes to appear in the classpath twice (once via -javaagent, once via a dependency jar nested inside the main app jar/war).

My testing shows that, with Spring Boot, the classes provided via -javaagent take precedence both at agent startup time and when accessing them from inside the main app. That is, this currently seems to work -- but it's fragile, since it depends on which class loaders Spring Boot uses to load the application, and the details of how they work. It would be easy for the jar loaded with -javaagent to get out of sync with the one nested inside the main app jar, and if Spring's class loader logic changed, it could lead to a hard-to-understand bug.

We can't simultaneously:

  1. access agent classes by name (i.e., non-reflectively) from the main web app,
  2. omit the agent jar from the main web app jar, and
  3. avoid crashing at startup with NoClassDefFoundError if -javaagent is omitted from the command line.

Dropping any of the 3 requirements above leads to a working solution. Currently we drop (2).

Possible solutions

Live with 2 copies of the agent classes in the classpath

What we're currently doing. A footgun.

Access all agent classes reflectively

Simple enough, but ugly, slow and prone to break at runtime instead of compile time if we change the agent implementation.
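To make the trade-off concrete, here is a minimal sketch of what reflective access might look like. The agent class name is the one from the Launcher-Agent-Class entry in this issue; the helper class `ReflectiveAgentAccess` and its methods are hypothetical, and `System.lineSeparator()` only stands in for a real agent method. Note that a misspelled class or method name is only detected at runtime, which is exactly the fragility described above.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Optional;

public class ReflectiveAgentAccess {
    // Class name as it appears in the Launcher-Agent-Class entry in this issue.
    static final String AGENT_CLASS =
        "nz.ac.wgtn.veracity.provenance.injector.instrumentation.ProvenanceAgent";

    /** Looks up a class by name, so there is no compile-time dependency on it. */
    static Optional<Class<?>> findClass(String className) {
        try {
            // initialize=false: only check that the class is loadable,
            // without running its static initializers.
            return Optional.of(Class.forName(
                className, false, ReflectiveAgentAccess.class.getClassLoader()));
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    /**
     * Invokes a public static no-arg method by name. Returns empty if the class
     * or method is missing, the method is not static, the call fails, or the
     * method returns null/void.
     */
    static Optional<Object> callStatic(String className, String methodName) {
        Optional<Class<?>> cls = findClass(className);
        if (!cls.isPresent()) {
            return Optional.empty();
        }
        try {
            Method m = cls.get().getMethod(methodName);
            if (!Modifier.isStatic(m.getModifiers())) {
                return Optional.empty();
            }
            return Optional.ofNullable(m.invoke(null));
        } catch (ReflectiveOperationException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println("agent present: " + findClass(AGENT_CLASS).isPresent());
        // Stand-in call; a real caller would name an agent method here.
        System.out.println("call succeeded: "
            + callStatic("java.lang.System", "lineSeparator").isPresent());
    }
}
```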

Shrink the duplicated classes as far as possible

The idea would be to make 2 implementations of a factory singleton named something like ProvenanceAgentFactoryToBeReplacedAtRuntimebyJavaAgent, containing just a single method, ProvenanceTracker makeProvenanceTracker(). One implementation would be bundled by the main web app as a nested jar (perhaps returning a NoopProvenanceTracker); the other would be included in the agent and would return a genuine tracker. But this doesn't work well, since the main app would need to duplicate (at least stub implementations of) all types mentioned in the interface (Invocation, Activity, Entity, ...).

(I considered going full ServiceLoader instead of using the same class name in two places, but it doesn't really help and only adds complexity.)
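As a rough illustration of the factory idea above, the copy bundled with the main web app might look like the sketch below. Everything except the names ProvenanceTracker, NoopProvenanceTracker, makeProvenanceTracker() and the factory class name is an assumption: the `track(String)` and `isNoop()` methods are hypothetical, and the real tracker interface mentions further types (Invocation, Activity, Entity, ...), which is precisely why this approach scales poorly.

```java
public class FactorySketch {
    /** Simplified stand-in for the real tracker interface (methods are hypothetical). */
    interface ProvenanceTracker {
        void track(String event);
        boolean isNoop();
    }

    /** Tracker that silently discards everything. */
    static final class NoopProvenanceTracker implements ProvenanceTracker {
        @Override
        public void track(String event) {
            // Intentionally does nothing.
        }

        @Override
        public boolean isNoop() {
            return true;
        }
    }

    /**
     * The copy bundled with the main web app. The agent jar would ship a class
     * with the same fully qualified name whose method returns a genuine tracker;
     * which copy wins depends on class loader precedence.
     */
    static final class ProvenanceAgentFactoryToBeReplacedAtRuntimebyJavaAgent {
        static ProvenanceTracker makeProvenanceTracker() {
            return new NoopProvenanceTracker();
        }
    }

    public static void main(String[] args) {
        ProvenanceTracker tracker =
            ProvenanceAgentFactoryToBeReplacedAtRuntimebyJavaAgent.makeProvenanceTracker();
        tracker.track("request received"); // discarded by the no-op tracker
        System.out.println("noop tracker in use: " + tracker.isNoop());
    }
}
```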

Exclude java agent jar from the main web app

  1. Change implementation to compileOnly in build.gradle to force Gradle to exclude provenance-agent.jar from the jar file it builds (changing it to providedRuntime as suggested means it still gets included in the jar, just under WEB-INF/lib-provided instead of WEB-INF/lib).

The downside is that forgetting -javaagent on the command line will lead to an ugly NoClassDefFoundError at startup.

Include only "exploded" java agent classes in the main web app, ditch -javaagent

JDK 9 and later (so JDK 11, but not JDK 8) support a Launcher-Agent-Class manifest entry that lets an agent be started automatically (i.e., without specifying -javaagent), though it can't read nested jar files.

  1. Change implementation to compileOnly in build.gradle as before
  2. Add "exploded" agent jar directly into the main web app jar (requires an additional manual step, unless I can figure out how to make Gradle do this itself)
  3. Append the following to META-INF/MANIFEST.MF:
    Can-Redefine-Classes: true
    Can-Retransform-Classes: true
    Launcher-Agent-Class: nz.ac.wgtn.veracity.provenance.injector.instrumentation.ProvenanceAgent
  4. Run the instrumented app with just java -jar mainapp.jar 😄

Testing confirms that this works on a manually modified .war file 😄

This way we can't forget -javaagent, and we have just one copy of the agent classes in the classpath, so there are no footguns. The cost is some extra work and complexity: either the Gradle build must be modified, or the jar/war file it produces must be modified automatically post-build.
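One detail worth keeping in mind with this approach: when an agent is started via Launcher-Agent-Class, the JVM calls the agent class's agentmain method before the application's main method, whereas -javaagent calls premain. A minimal sketch of an agent class that supports both entry points follows. The class name `SketchAgent` and the `install`/`isInstalled` helpers are hypothetical, and this is not a description of how ProvenanceAgent is actually written.

```java
import java.lang.instrument.Instrumentation;

public class SketchAgent {
    private static volatile boolean installed = false;
    private static volatile Instrumentation instrumentation;

    /** Entry point used when the JVM is started with -javaagent:... */
    public static void premain(String agentArgs, Instrumentation inst) {
        install(agentArgs, inst);
    }

    /**
     * Entry point used when the executable jar's manifest names this class in
     * Launcher-Agent-Class; it runs before the application's main method.
     */
    public static void agentmain(String agentArgs, Instrumentation inst) {
        install(agentArgs, inst);
    }

    /** Shared setup, guarded so the agent is installed only once. */
    private static synchronized void install(String agentArgs, Instrumentation inst) {
        if (installed) {
            return; // e.g. both -javaagent and Launcher-Agent-Class were used
        }
        instrumentation = inst;
        installed = true;
        // A real agent would register its ClassFileTransformer here,
        // e.g. via inst.addTransformer(...).
    }

    public static boolean isInstalled() {
        return installed;
    }
}
```

The guard in `install` is a design choice for the case where someone passes -javaagent to a jar that already declares Launcher-Agent-Class, so the setup is not run twice.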

wtwhite commented 4 months ago

There may be a better way, at least for Spring Boot. It may be possible to implement all agent-related code in the agent after all, leaving the main app with no dependency on provenance-injector at all. That code consists of the HTTP provenance pickup endpoint (in Spring Boot terms, a method annotated with @RequestMapping inside a @Controller) and the servlet filter (implemented as an Interceptor with Spring). The relevant Spring feature:

As of Spring Framework 4.2, @Import also supports references to regular component classes, analogous to the AnnotationConfigApplicationContext.register method.

If this works, it would be ideal: The app would have no dependency on this repo, and could be built to either:

wtwhite commented 4 months ago

Testing with a dummy Spring Boot app shows that the previous comment's META-INF/spring.factories + @Import(SomeController.class) works, at least for controllers.

Specifically, I:

  1. Made a simple Maven-based app that parents from org.springframework.boot:spring-boot-starter-parent:2.5.5 and depends on org.springframework.boot:spring-boot-starter-web per the instructions
  2. Added a @Controller component class named wtwhitetest.DummyController with a single method annotated with @RequestMapping("/dummy") and returning ResponseEntity.ok().body("blah");
  3. Added an empty wtwhitetest.spring.DummyControllerInAgentAutoConfiguration class labeled with @Configuration and @Import(wtwhitetest.DummyController.class)
  4. Added src/main/resources/META-INF/spring.factories containing org.springframework.boot.autoconfigure.EnableAutoConfiguration=wtwhitetest.spring.DummyControllerInAgentAutoConfiguration
  5. Built a regular jar using mvn package (instead of the Spring Boot plugin)
  6. Copied that jar file to WEB-INF/lib/test-spring-boot-controller-in-agent-1.0-SNAPSHOT.jar and inserted it into the Spring Boot main app war file with zip -r -Z store with_exploded_embedded_provenance-agent_jar_and_dummy_controller_embedded_jar.war WEB-INF
    • NOTE: Spring needs the jar to be embedded without compression, hence -Z store
  7. Started the app with java -jar with_exploded_embedded_provenance-agent_jar_and_dummy_controller_embedded_jar.war
  8. From a separate terminal, ran curl -i http://localhost:8080/dummy and observed that:
    • It worked 👍
    • The Interceptor defined in the main app was still called 👍
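
For reference, the two classes from steps 2 and 3 above might look roughly like the sketch below. The class and package names are the ones given in the steps; the method name `dummy` and the exact imports are assumptions, and both files need Spring Boot's web starter on the compile classpath. The spring.factories line from step 4 is what makes Spring Boot pick up the configuration class.

```java
// src/main/java/wtwhitetest/DummyController.java
package wtwhitetest;

import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;

@Controller
public class DummyController {
    // Responds to GET /dummy with a fixed body.
    @RequestMapping("/dummy")
    public ResponseEntity<String> dummy() {
        return ResponseEntity.ok().body("blah");
    }
}
```

```java
// src/main/java/wtwhitetest/spring/DummyControllerInAgentAutoConfiguration.java
package wtwhitetest.spring;

import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Import;
import wtwhitetest.DummyController;

// Intentionally empty: its only job is to pull DummyController into the
// application context when Spring Boot loads it as an auto-configuration.
@Configuration
@Import(DummyController.class)
public class DummyControllerInAgentAutoConfiguration {
}
```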

Still TODO:

  1. Test whether it also works for Interceptors, which require more setup via WebMvcConfigurerAdapter
  2. Test whether it still works when the controller/interceptor needs to talk to classes in the agent

wtwhite commented 4 months ago

  1. Interceptors defined in the embedded jar work 👍
  2. Confirmed that the controller can talk to agent classes 👍 Specifically, after adding the provenance injector as a dependency to the dummy app with <scope>provided</scope>:
    • Its /dummy controller can create a NoopProvenanceTracker without problems
    • No provenance injector classes are included in the dummy app jar, meaning the JVM is correctly picking up the "exploded" provenance .class files from the main web app
      • Confirmed by removing NoopProvenanceTracker.class from the app war and rerunning; the app starts up, but the first /dummy request results in Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Handler dispatch failed; nested exception is java.lang.NoClassDefFoundError: nz/ac/wgtn/veracity/provenance/injector/tracker/NoopProvenanceTracker]

I also confirmed that each part of the configuration (META-INF/spring.factories, the empty @Configuration class, the @Import on it) is necessary for the /dummy endpoint to work.

Conclusion: We can entirely remove all dependencies on this repo from a Spring Boot app 😄