Open jackshirazi opened 1 year ago
Per discussion at the Java SIG, one next step is to provide a walkthrough of how an existing instrumentation is expected to change if it moved across to this mechanism after the mechanism was implemented
Per discussion at the Java SIG, instrumentation using this mechanism is isolated and needs no shading. Making the whole agent non-shaded using ShadedClassLoader.java is out of scope here, but can be done in a subsequent step if desired.
This was added by @JonasKunz here providing a simple instrumentation change (Cassandra) and a complex one (Elasticsearch)
The next step is to provide a roadmap architecture of the implementation, following which we'll aim to start providing PRs of individual components using the architecture as a reference until we get to a fully working instrumentation. After that the intention is to convert all the existing instrumentations to the invokedynamic non-inlined implementation, targeting the 2.0 release in October, which will terminate support for extensions that use inline instrumentation capability
Here's a write-up of what we would envision as target architecture.
In addition I did a working PoC showcasing what things would look like in implementation, without all the bells and whistles of course. The PoC compiles and migrates the Cassandra module to invokedynamic. You can execute the tests and place a breakpoint in the Advice code.
Based on the experience with the elastic APM agent I would envision the following "ideal" classloader structure for the OpenTelemetry Agent:
The image shows the agent alongside two separate classloaders of the application being instrumented (App A and App B). Arrows represent a child-to-parent relationship. Explanation:
- Agent Extension API CL loads all classes against which extensions are compiled, including for example the OTel API.
- Agent CL is a child of the Extension-API CL. Those two are separated to ensure that implementation details of the Agent are hidden from extensions at runtime just like they are at compile time. This prevents extensions (especially external ones!) from accidentally relying on those implementation details.
- Agent Extension API CL and the Agent CL should follow the "self-first" instead of the "parent-first" delegation model. This ensures that their classes cannot be shadowed by accident by the user, e.g. by providing bytebuddy as a dependency via -Xbootclasspath.
- InstrumentationModules can share state and classes via the Global-State CL. This should avoid the need for InstrumentationModules to inject classes into the Bootstrap CL, properly isolating those shared classes from the application being instrumented.
- Global-State CL is separated from the Agent Extension API CL for easier debugging (e.g. understanding where a class comes from) and to make sure that InstrumentationModules cannot mess up the classpath of the Agent CL. The latter point should matter less due to the Agent CL following the self-first delegation model.
- The Global-State CL should be self-first as well, so that classes from bootstrap cannot mess it up.
- When an Advice is applied, we lazily create the InstrumentationModule CL for the Instrumentation Module containing that Advice and the classloader containing the class being instrumented. If that CL already exists, it is reused.
- Instrumentation Module CL uses the following classloading delegation model:
  - If the class is an Advice or a referenced helper class (and NOT a global state class), it is loaded by the Instrumentation Module CL in self-first manner. The helper classes are detected by muzzle.
  - Otherwise, the Global-State CL (and therefore its parents) is queried to find the class.

The classloading strategy of the Instrumentation Module CL ensures that application classes cannot shadow agent classes. We certainly need to add filtering for edge cases to this mechanism later (e.g. for bridging the opentelemetry-API from the application to the agent SDK).
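To make the "self-first" delegation described above more concrete, here is a minimal sketch of a classloader that defines a known set of classes itself and only asks its parent for everything else. The class name SelfFirstClassLoader, the bytecode map and the loadOrNull helper are hypothetical and chosen for illustration only; this is not the PoC or the elastic APM agent implementation, and it leaves out any lookup in the application classloader.

```java
import java.util.Map;

// Hypothetical sketch of a "self-first" classloader: classes it owns are defined
// locally, so a class with the same name in the parent cannot shadow them.
class SelfFirstClassLoader extends ClassLoader {
  // Fully qualified class name -> bytecode of the classes owned by this loader
  // (for an instrumentation module: its Advice and helper classes).
  private final Map<String, byte[]> ownClasses;

  SelfFirstClassLoader(ClassLoader parent, Map<String, byte[]> ownClasses) {
    super(parent);
    this.ownClasses = ownClasses;
  }

  @Override
  protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
    synchronized (getClassLoadingLock(name)) {
      Class<?> loaded = findLoadedClass(name);
      if (loaded == null) {
        byte[] bytecode = ownClasses.get(name);
        if (bytecode == null) {
          // Not one of our own classes: fall back to the regular parent delegation.
          return super.loadClass(name, resolve);
        }
        // Self-first: define the class here without asking the parent.
        loaded = defineClass(name, bytecode, 0, bytecode.length);
      }
      if (resolve) {
        resolveClass(loaded);
      }
      return loaded;
    }
  }

  // Small helper for experiments: returns null instead of throwing.
  static Class<?> loadOrNull(ClassLoader loader, String name) {
    try {
      return loader.loadClass(name);
    } catch (ClassNotFoundException e) {
      return null;
    }
  }
}
```

With an empty map the loader behaves like a plain parent-delegating classloader; the isolation only applies to class names for which bytecode was registered.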
To keep things simple for a start, we can keep the Agent CL, Agent Extension API CL and Global State CL fused together as a single Agent CL: This is how it is currently implemented in the elastic APM agent.
This reduces the isolation benefits explained above, but should be simpler to implement for a start. If in the long run we often run into issues / user questions which would be avoided by the proper isolation, we can always add it afterwards.
As discussed in the previous SIG meetings, we will add a switch-method to InstrumentationModules to allow them to be marked as compatible for the invokedynamic approach.
We will adapt the OpenTelemetry Java Agent instrumentation logic to respect this switch-method and insert invokedynamic instructions instead of inlining the Advice code.
The inserted invokedynamic instructions need to point to a bootstrapping method which returns the actual target of the invokedynamic instruction in the form of a MethodHandle.
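As an illustration of what such a bootstrapping method looks like, the following sketch resolves a static method and returns it wrapped in a CallSite. Java source cannot emit invokedynamic instructions directly, so the demo method calls the bootstrap method by hand, which roughly mirrors what the JVM does the first time an inserted invokedynamic instruction is executed. The class names AdviceSketch and BootstrapSketch are hypothetical.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical stand-in for an Advice class with a static advice method.
class AdviceSketch {
  static String onEnter(String argument) {
    return "enter:" + argument;
  }
}

class BootstrapSketch {
  // Bootstrap methods receive the caller's lookup, the name and the type of the
  // call site, and return a CallSite whose target is the MethodHandle to invoke.
  static CallSite bootstrap(MethodHandles.Lookup callerLookup, String name, MethodType type)
      throws ReflectiveOperationException {
    // Assumption for this sketch: the call site name equals the advice method name.
    MethodHandle target = MethodHandles.lookup().findStatic(AdviceSketch.class, name, type);
    return new ConstantCallSite(target);
  }

  // Simulates the linkage of one invokedynamic instruction followed by its execution.
  static String demo() {
    try {
      MethodType type = MethodType.methodType(String.class, String.class);
      CallSite site = bootstrap(MethodHandles.lookup(), "onEnter", type);
      return (String) site.dynamicInvoker().invokeExact("x");
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }
}
```

A ConstantCallSite is used because the target never changes after linkage, which allows the JIT to treat the call like a direct invocation.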
We will inject an IndyBootstrapDispatcher into the Bootstrap CL to forward this call into the actual implementation within the Agent CL. This implementation takes care of setting up the corresponding InstrumentationModule CL.
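One possible shape for such a dispatcher is sketched below: a tiny class with a static field that the agent fills in, plus a bootstrap method that forwards to it. The names DispatcherSketch, delegate and agentBootstrap are hypothetical, and the no-op fallback for the case where the agent is not (yet) available is an assumption of this sketch, not something stated above. MethodHandles.empty requires Java 9+.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical dispatcher: the only piece that would live in the bootstrap classloader.
class DispatcherSketch {
  // Set by the agent; points to the real bootstrap logic inside the agent classloader.
  public static volatile MethodHandle delegate;

  public static CallSite bootstrap(
      MethodHandles.Lookup lookup, String name, MethodType type, Object... args) {
    MethodHandle current = delegate;
    if (current != null) {
      try {
        CallSite site = (CallSite) current.invoke(lookup, name, type, args);
        if (site != null) {
          return site;
        }
      } catch (Throwable ignored) {
        // Assumption: a failing bootstrap must never break the instrumented application.
      }
    }
    // Fallback: a call site that ignores its arguments and returns a default value.
    return new ConstantCallSite(MethodHandles.empty(type));
  }

  // Stand-in for the agent-side implementation (would live in the Agent CL).
  static CallSite agentBootstrap(
      MethodHandles.Lookup lookup, String name, MethodType type, Object[] args) {
    return new ConstantCallSite(MethodHandles.constant(String.class, "from-agent"));
  }

  // Wires the stand-in agent implementation into the dispatcher.
  static void install() {
    try {
      delegate = MethodHandles.lookup().findStatic(
          DispatcherSketch.class,
          "agentBootstrap",
          MethodType.methodType(
              CallSite.class, MethodHandles.Lookup.class, String.class,
              MethodType.class, Object[].class));
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  // Links a call site of type ()String through the dispatcher and invokes it once.
  static String linkAndInvoke() {
    try {
      MethodType type = MethodType.methodType(String.class);
      CallSite site = bootstrap(MethodHandles.lookup(), "advice", type);
      return (String) site.dynamicInvoker().invokeExact();
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }
}
```

Keeping the dispatcher this small means that nothing except a single field and a forwarding method has to be visible to instrumented classes.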
In the elastic APM agent the IndyBootstrapDispatcher is the only class which is injected into the Bootstrap (or any application classloader). There we shade this class into the java.lang package to guarantee visibility, even if custom module systems, such as OSGi, are used.
AFAIK, the OpenTelemetry Java Agent uses classloader instrumentations to guarantee visibility of injected bootstrap classes. Therefore we won't need to shade the IndyBootstrapDispatcher here. If in the future we get rid of all other use-cases for class injection, we could consider shading the IndyBootstrapDispatcher into the java.lang package, as it would allow us to get rid of specialized classloader instrumentation for individual module systems.
When instrumenting classes which are encapsulated within Java 9 modules, it is possible that the instrumentation needs to access non-exported contents of the modules. This usually shows by having to add --add-opens compiler flags when compiling the instrumentation with Java 9+.
In order to facilitate these use-cases, InstrumentationModules can declare modules they need to have access to. The invokedynamic bootstrapping will take care of providing the InstrumentationModule CL with the required access via Instrumentation.redefineModule.
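A minimal sketch of how Instrumentation.redefineModule can open a package to another module is shown below. The helper name openPackageTo is hypothetical. Because a real Instrumentation instance is only available inside an agent (for example via premain), the recordedOpens method uses a dynamic proxy as a fake purely to make the sketch executable without an agent.

```java
import java.lang.instrument.Instrumentation;
import java.lang.reflect.Proxy;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

class ModuleAccessSketch {
  // Opens the package of `target` to `accessor`, roughly the runtime equivalent
  // of passing --add-opens on the command line.
  static void openPackageTo(Instrumentation instrumentation, Class<?> target, Module accessor) {
    Map<String, Set<Module>> extraOpens =
        Collections.singletonMap(target.getPackageName(), Collections.singleton(accessor));
    instrumentation.redefineModule(
        target.getModule(),
        Collections.emptySet(), // extraReads
        Collections.emptyMap(), // extraExports
        extraOpens,
        Collections.emptySet(), // extraUses
        Collections.emptyMap()); // extraProvides
  }

  // Demo only: captures the extraOpens argument using a fake Instrumentation.
  static Object recordedOpens() {
    AtomicReference<Object> opens = new AtomicReference<>();
    Instrumentation fake =
        (Instrumentation)
            Proxy.newProxyInstance(
                ModuleAccessSketch.class.getClassLoader(),
                new Class<?>[] {Instrumentation.class},
                (proxy, method, args) -> {
                  if (method.getName().equals("redefineModule")) {
                    opens.set(args[3]);
                  }
                  return null;
                });
    // In the described design the accessor would be the (unnamed) module of the
    // InstrumentationModule CL; here the module of this class is used instead.
    openPackageTo(fake, String.class, ModuleAccessSketch.class.getModule());
    return opens.get();
  }
}
```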
The inline/injection based instrumentation approach of the OpenTelemetry agent allows accessing any package-private members by injecting accessor classes into the desired packages. Because we won't allow invokedynamic-instrumentations to perform this injection, we need a different way of gaining this access.
This can be efficiently done by combining Instrumentation.redefineModule with MethodHandles.privateLookupIn. For Java 8 we can simply use setAccessible(true).
This allows us to add a MemberAccess utility to the extension API with which the advices can then simply declare MethodHandles for the required fields and methods:
private static final MethodHandle fieldGetter = MemberAccess.fieldGetter(SomeSecret.class, "secretField");
Due to the MethodHandle being static final, the performance should be equivalent to direct field accesses / method calls after JIT.
You can find a PoC implementation here.
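Independently of the PoC, the idea behind such a utility can be sketched in a few lines. The following is a simplified, hypothetical version (MemberAccessSketch is not the proposed API) that only covers the Java 9+ path via MethodHandles.privateLookupIn and assumes the owner's package is already accessible to the caller's module; a Java 8 variant would call setAccessible(true) on the Field before unreflecting it.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Field;

// Hypothetical class with a private field that an advice needs to read.
class SomeSecret {
  private final String secretField;

  SomeSecret(String secretField) {
    this.secretField = secretField;
  }
}

final class MemberAccessSketch {
  private MemberAccessSketch() {}

  // Returns a getter handle of type (owner) -> fieldType for a private field.
  static MethodHandle fieldGetter(Class<?> owner, String fieldName) {
    try {
      Field field = owner.getDeclaredField(fieldName);
      // A lookup with private access to `owner`. Across module boundaries this
      // needs the package to be opened first, e.g. via Instrumentation.redefineModule.
      MethodHandles.Lookup lookup = MethodHandles.privateLookupIn(owner, MethodHandles.lookup());
      return lookup.unreflectGetter(field);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot access " + owner.getName() + "#" + fieldName, e);
    }
  }
}

// Usage as it could look inside an advice class.
class SecretReadingAdviceSketch {
  private static final MethodHandle SECRET_GETTER =
      MemberAccessSketch.fieldGetter(SomeSecret.class, "secretField");

  static String readSecret(SomeSecret instance) {
    try {
      // invokeExact requires the call to match the handle type (SomeSecret)String.
      return (String) SECRET_GETTER.invokeExact(instance);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }
}
```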
Another current use-case for injecting classes into application classloaders is to inject ServiceProviders.
In order to provide a consistent development experience, we would need to find a way to make the service providers we want to "inject" actually live inside the InstrumentationModule CL.
I see two alternative approaches for doing this:
- [...] ServiceLoader methods to return instances from the InstrumentationModule CL
- [...] ServiceLoader. Upon creation, this proxy creates the actual implementation via invokedynamic in the InstrumentationModule CL and delegates all calls to that instance.

Based on the outlined architecture above, I would suggest the following implementation order:

- [...] InstrumentationModuleClassloader
- [...] Advices as helper classes for invokedynamic-modules
- [...] MemberAccess utility explained above and migrate at least one instrumentation module which requires it
Tracking
Related PRs
Note: removing shading from the agent is not included in this issue; it is the preferred post-completion task.
[COMPLETED] Phase 1 - Enabling invoke dynamic capability for instrumentations
Phase 2a - Migrate simple instrumentation modules
Phase 2b - Additional support for muzzle and complex instrumentation modules
- [...] InstrumentationModules to optionally share their InstrumentationModuleClassloader (see this discussion)
Phase 3 - Migrate remaining instrumentation modules
Proposal
Contribute alternative invokedynamic based instrumentation mechanism to the OpenTelemetry Java agent, with no required changes to the existing OpenTelemetry instrumentation mechanism or instrumentations
Benefits to OpenTelemetry of this contribution and mechanism
Benefits to Elastic
Background