open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0

Proposal to contribute invokedynamic based instrumentation mechanism #8999

Open jackshirazi opened 1 year ago

jackshirazi commented 1 year ago

Tracking

  1. The original proposal is below in this description. It was accepted at the Java SIG meeting on July 20 2023, on the condition that example walkthroughs of migrated instrumentation are provided
  2. Examples of how the proposed changes would affect existing instrumentation, if moved to the new framework, were provided for the following SIG meeting on July 27 2023, where a decision to proceed with the proposal was agreed
  3. The proposed architecture outline and an implementation plan outline were added in this comment, and PRs for the implementation have begun. A working PoC is available for reference

Related PRs

Note: not included in this issue is the preferred post-completion task of removing shading from the agent

[COMPLETED] Phase 1 - Enabling invoke dynamic capability for instrumentations

Phase 2a - Migrate simple instrumentation modules

Phase 2b - Additional support for muzzle and complex instrumentation modules

Phase 3 - Migrate remaining instrumentation modules


Proposal

Contribute alternative invokedynamic based instrumentation mechanism to the OpenTelemetry Java agent, with no required changes to the existing OpenTelemetry instrumentation mechanism or instrumentations

Benefits to OpenTelemetry of this contribution and mechanism

Benefits to Elastic

Background

  1. The Elastic APM Java agent is fully open source on the Apache 2.0 license
  2. The agent instrumentation uses Byte Buddy delegated advice (which allows setting breakpoints), the approach recommended by the Byte Buddy creator Rafael Winterhalter, which reduces agent implementation complexity
  3. To support delegated advice, the agent has implemented custom classloaders that enable access and reversion (the instrumentation has correct isolated access, and can be unloaded and the transformed code fully reverted)
  4. To let the delegated advice access the instrumented method's runtime state, the agent uses Byte Buddy to insert invokedynamic bytecode instructions:

the invokedynamic instruction can be used to call an advice method that is loaded from a child class loader of the instrumented class' defining class loader ... this allows the agent to hide its classes from the application while providing a way to invoke the isolated methods from the application classes it instruments ... also avoids injecting the advice and helper classes into the target class loader directly

jackshirazi commented 1 year ago

Per discussion at the Java SIG, one next step is to provide a walkthrough of how an existing instrumentation is expected to change if it moved across to this mechanism after the mechanism was implemented

jackshirazi commented 1 year ago

Per discussion at the Java SIG, instrumentation using this mechanism is isolated and needs no shading. Making the whole agent non-shaded using ShadedClassLoader.java is not in scope here, but can be done as a follow-up if desired.

jackshirazi commented 1 year ago

Per discussion at the Java SIG, one next step is to provide a walkthrough of how an existing instrumentation is expected to change if it moved across to this mechanism after the mechanism was implemented

This was added by @JonasKunz here providing a simple instrumentation change (Cassandra) and a complex one (Elasticsearch)

jackshirazi commented 1 year ago

The next step is to provide a roadmap architecture of the implementation, following which we'll aim to start providing PRs of individual components using the architecture as a reference until we get to a fully working instrumentation. After that the intention is to convert all the existing instrumentations to the invokedynamic non-inlined implementation, targeting the 2.0 release in October, which will terminate support for extensions that use inline instrumentation capability

JonasKunz commented 1 year ago

Here's a write-up of what we would envision as the target architecture. In addition, I did a working PoC showcasing what things would look like in implementation, without all the bells and whistles of course. The PoC compiles and migrates the Cassandra module to invokedynamic. You can execute the tests and place a breakpoint in the Advice code.

Classloader Structure

Based on the experience with the Elastic APM agent, I would envision the following "ideal" classloader structure for the OpenTelemetry Agent:

[Image: otel_cl_ideal drawio]

The image shows the agent alongside two separate classloaders of the application being instrumented (App A and App B). Arrows represent a child-to-parent relationship. Explanation:

[Image: otel_cl_simplified drawio]

This reduces the isolation benefits explained above, but should be simpler to implement for a start. If in the long run we often run into issues / user questions which would be avoided by the proper isolation, we can always add it afterwards.

Invokedynamic Advice Bootstrapping

As discussed in the previous SIG meetings, we will add a switch-method to InstrumentationModules to allow them to be marked as compatible for the invokedynamic approach.

We will adapt the OpenTelemetry Java Agent instrumentation logic to respect this switch-method and insert invokedynamic instructions instead of inlining the Advice code. The inserted invokedynamic instructions need to point to a bootstrapping method which returns the actual target of the invokedynamic instruction in the form of a MethodHandle.
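As a rough illustration of the bootstrapping step, here is a minimal sketch of a bootstrap method that resolves an advice method and wraps its MethodHandle in a ConstantCallSite. Java source cannot emit invokedynamic directly, so the sketch simulates what the JVM does at the first execution of the instruction. ExampleAdvice, ExampleBootstrap and simulateCallSite are hypothetical names used only for this example; they are not part of the PoC or of the agent.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical advice class. In the proposal it would be loaded by the
// InstrumentationModule CL rather than sit next to the bootstrap method.
class ExampleAdvice {
  static String onEnter(String arg) {
    return "entered:" + arg;
  }
}

// Hypothetical bootstrap holder, for illustration only.
class ExampleBootstrap {

  // Standard shape of a bootstrap method: the JVM passes the caller's lookup,
  // the name and the type recorded in the invokedynamic instruction.
  static CallSite bootstrap(
      MethodHandles.Lookup caller, String adviceMethodName, MethodType adviceType)
      throws NoSuchMethodException, IllegalAccessException {
    // Resolve the advice with the bootstrap's own lookup, not the caller's,
    // so the instrumented class never needs to see the advice class itself.
    MethodHandle target =
        MethodHandles.lookup().findStatic(ExampleAdvice.class, adviceMethodName, adviceType);
    // A ConstantCallSite never changes its target, which lets the JIT inline it.
    return new ConstantCallSite(target);
  }

  // Simulates the linkage the JVM performs for an invokedynamic instruction,
  // followed by one invocation through the linked call site.
  static String simulateCallSite(String arg) {
    try {
      CallSite site =
          bootstrap(
              MethodHandles.lookup(),
              "onEnter",
              MethodType.methodType(String.class, String.class));
      return (String) site.dynamicInvoker().invoke(arg);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }
}
```

In a real agent the invokedynamic instruction would be written by Byte Buddy and the bootstrap would first locate or create the classloader holding the advice; both parts are left out here.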

We will inject an IndyBootstrapDispatcher into the Bootstrap CL to forward this call into the actual implementation within the Agent CL. This implementation takes care of setting up the corresponding InstrumentationModule CL.
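The forwarding idea can be sketched as follows, assuming the dispatcher holds nothing but a handle that the agent installs at startup. DispatcherSketch and AgentSideBootstrap are hypothetical stand-ins, and the bootstrap signature is simplified to a single String argument; a real dispatcher would take the usual bootstrap arguments and return a CallSite.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical stand-in for the class injected into the Bootstrap CL.
// It contains no agent logic, only a handle pointing into the Agent CL.
class DispatcherSketch {
  private static volatile MethodHandle delegate;

  // Called once by the agent during startup.
  static void install(MethodHandle agentBootstrap) {
    delegate = agentBootstrap;
  }

  // Simplified bootstrap entry point: forwards to the agent if it is installed.
  static Object bootstrap(String adviceClassName) {
    MethodHandle target = delegate;
    if (target == null) {
      return null; // agent not initialized yet, nothing to link
    }
    try {
      return target.invoke(adviceClassName);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }
}

// Hypothetical stand-in for the real implementation living in the Agent CL.
class AgentSideBootstrap {
  static Object resolve(String adviceClassName) {
    return "resolved:" + adviceClassName;
  }

  // Registers this implementation with the dispatcher.
  static void register() {
    try {
      MethodHandle handle =
          MethodHandles.lookup()
              .findStatic(
                  AgentSideBootstrap.class,
                  "resolve",
                  MethodType.methodType(Object.class, String.class));
      DispatcherSketch.install(handle);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Because the dispatcher only stores a MethodHandle, it has no compile-time dependency on agent classes, which is what would keep the footprint in the Bootstrap CL at a single class.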

In the Elastic APM agent the IndyBootstrapDispatcher is the only class which is injected into the Bootstrap (or any application classloader). There we shade this class into the java.lang package to guarantee visibility, even if custom module systems, such as OSGi, are used.

AFAIK, the OpenTelemetry Java Agent uses classloader instrumentations to guarantee visibility of injected bootstrap classes. Therefore we won't need to shade the IndyBootstrapDispatcher here. If in the future we get rid of all other use-cases for class injection, we could consider shading the IndyBootstrapDispatcher into the java.lang package, as it would allow us to get rid of specialized classloader instrumentation for individual module systems.

Interaction of Invokedynamic-Advices with Java 9 modules

When instrumenting classes which are encapsulated within Java 9 modules, it is possible that the instrumentation needs to access non-exported contents of the modules. This usually shows up as having to add --add-opens compiler flags when compiling the instrumentation with Java 9+.

In order to facilitate these use-cases, InstrumentationModules can declare modules they need to have access to. The invokedynamic bootstrapping will take care of providing the InstrumentationModule CL with the required access via Instrumentation.redefineModule.
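A minimal sketch of what granting such access could look like, assuming Java 9+ and an Instrumentation instance obtained from the agent's premain. ModuleOpener and its methods are hypothetical helpers for illustration; only Instrumentation.redefineModule and the Module API are standard JDK calls.

```java
import java.lang.instrument.Instrumentation;
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the agent or the PoC.
class ModuleOpener {

  // Builds the extraOpens argument of redefineModule:
  // package name -> modules that gain deep reflective access to it.
  static Map<String, Set<Module>> opensFor(String packageName, Module reader) {
    return Collections.singletonMap(packageName, Collections.singleton(reader));
  }

  // Opens the package of targetClass to the unnamed module of the classloader
  // that holds the advice (assumed here to be the InstrumentationModule CL).
  static void open(Instrumentation inst, Class<?> targetClass, ClassLoader adviceLoader) {
    Module target = targetClass.getModule();
    Module reader = adviceLoader.getUnnamedModule();
    inst.redefineModule(
        target,
        Collections.emptySet(), // extraReads
        Collections.emptyMap(), // extraExports
        opensFor(targetClass.getPackageName(), reader), // extraOpens
        Collections.emptySet(), // extraUses
        Collections.emptyMap()); // extraProvides
  }
}
```

The package is opened to one specific reader module only, so other code running in the application does not gain the same access.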

Accessing package-private / private members

The inline/injection based instrumentation approach of the OpenTelemetry agent allows access to any package-private members by injecting accessor classes into the desired packages. Because we won't allow invokedynamic-instrumentations to perform this injection, we need a different way of gaining this access.

This can be efficiently done by combining Instrumentation.redefineModule with MethodHandles.privateLookupIn. For Java 8 we can simply use setAccessible(true). This allows us to add a MemberAccess utility to the extension API with which the advices can then simply declare MethodHandles for the required fields and methods:

 private static final MethodHandle fieldGetter = MemberAccess.fieldGetter(SomeSecret.class, "secretField");

Due to the MethodHandle being static final, the performance should be equivalent to direct field accesses / method calls after JIT.

You can find a PoC implementation here.
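Independently of the linked PoC, the Java 8 variant of such a utility could be sketched as below. MemberAccessSketch and SecretReader are hypothetical names, and SomeSecret is a dummy class defined only for the example. For classes inside named modules on Java 9+, the setAccessible call would presumably be replaced by Instrumentation.redefineModule followed by MethodHandles.privateLookupIn, which is not shown here.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Field;

// Hypothetical utility, loosely modelled on the MemberAccess idea above.
class MemberAccessSketch {

  // Returns a getter handle for a (possibly private) instance field.
  static MethodHandle fieldGetter(Class<?> owner, String fieldName) {
    try {
      Field field = owner.getDeclaredField(fieldName);
      field.setAccessible(true); // Java 8 style access override
      // Once the field is accessible, unreflectGetter skips the access check.
      return MethodHandles.lookup().unreflectGetter(field);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }
}

// Dummy target class with a private field, for the example only.
class SomeSecret {
  private final String secretField = "hidden";
}

// Hypothetical advice-side usage: the handle is resolved once and kept in a
// static final field so the JIT can treat it as a constant.
class SecretReader {
  private static final MethodHandle SECRET_GETTER =
      MemberAccessSketch.fieldGetter(SomeSecret.class, "secretField");

  static String read(SomeSecret secret) {
    try {
      return (String) SECRET_GETTER.invoke(secret);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }
}
```

Resolving the handle eagerly in a static initializer also means a missing field fails when the advice class is initialized, rather than on every instrumented call.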

SPI injection

Another current use-case for injecting classes into application classloaders is to inject ServiceProviders. In order to provide a consistent development experience, we would need to find a way to make the service providers that are to be "injected" actually live inside the InstrumentationModule CL. I see two alternative approaches for doing this:

Implementation Plan

Based on the outlined architecture above, I would suggest the following implementation order: