open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0

NoClassDefFoundError when building native image #8919

Open TDtianzhenjiu opened 1 year ago

TDtianzhenjiu commented 1 year ago

Describe the bug NoClassDefFoundError when building native image

Steps to reproduce

  1. Create a simple Spring Boot 3 project.
  2. Build the native image with this command: ./mvnw clean -Pnative native:compile native:build

What did you expect to see? The build should succeed.

What did you see instead? The build failed with this exception:

Caused by: java.lang.NoClassDefFoundError: io/opentelemetry/javaagent/tooling/field/VirtualFieldImplementationsGenerator
        at java.base/java.lang.Class.getDeclaringClass0(Native Method)
        at java.base/jdk.internal.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:568)
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.SVMHost.getDeclaringClass(SVMHost.java:432)

What version are you using? 1.27.0(latest)

Environment Compiler: (e.g., "GraalVM 17.0.7+8.1") OS: (e.g., "MacOs")

jeanbisutti commented 1 year ago

Hi @TDtianzhenjiu

The OpenTelemetry Java agent doesn't work with native images. Today, GraalVM native images don't officially support Java agents.

With GraalVM native images and Spring Boot, you can use the OpenTelemetry Spring Starter. An example here. A new OTel Java instrumentation version will be released this week. This release will make it possible to also have logging instrumentation with Spring Boot native images (see).

TDtianzhenjiu commented 1 year ago

@jeanbisutti I see! Thanks a lot!

By the way, I have successfully compiled other Java agents with native-image, which shows that native-image can in fact work with agents. I think we could improve something on the Java agent side so that it can be compiled with native-image.

So, is there a plan to support the Java agent for native-image?

jeanbisutti commented 1 year ago

@TDtianzhenjiu

Yes, you can build native images with some Java agents.

However, Java agents are not officially supported with GraalVM native images today and can potentially break the image builder, see https://github.com/oracle/graal/issues/1065

mateuszrzeszutek commented 1 year ago

Without an official solution in place, there's hardly anything we can do. For now, we recommend using library instrumentations (e.g. the Spring Starter that Jean mentioned); or using autoconfigure and instrumenting your application manually.

TDtianzhenjiu commented 1 year ago

Hi @jeanbisutti

After digging deeper into this issue, I found that io.opentelemetry.javaagent.bootstrap.field.VirtualFieldImpl$java$lang$Runnable$io$opentelemetry$javaagent$bootstrap$executors$PropagatedContext is loaded by the bootstrap class loader at this line: https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/muzzle/src/main/java/io/opentelemetry/javaagent/tooling/HelperInjector.java#L317

while VirtualFieldImplementationsGenerator is loaded by AgentClassLoader.

After I changed the code to load io.opentelemetry.javaagent.bootstrap.field.VirtualFieldImpl$java$lang$Runnable$io$opentelemetry$javaagent$bootstrap$executors$PropagatedContext with AgentClassLoader,

this exception was gone.
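
The class loader mismatch described above can be illustrated with a minimal sketch that uses only the JDK. It is not taken from the agent's code: the class name `LoaderVisibilityDemo` and the helper `canSee` are hypothetical, and a `URLClassLoader` with a `null` parent stands in for the bootstrap class loader. It shows that a loader delegating only to the bootstrap loader cannot resolve a class known only to another loader, which is presumably why resolving VirtualFieldImplementationsGenerator from a bootstrap-loaded class fails.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class LoaderVisibilityDemo {

    /** Returns true if the given loader (null = bootstrap) can resolve the class name. */
    static boolean canSee(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String self = LoaderVisibilityDemo.class.getName();
        ClassLoader appLoader = LoaderVisibilityDemo.class.getClassLoader();

        // A loader with a null parent delegates only to the bootstrap loader.
        try (URLClassLoader bootstrapOnly = new URLClassLoader(new URL[0], null)) {
            System.out.println("bootstrap-only sees java.lang.Runnable: "
                    + canSee(bootstrapOnly, "java.lang.Runnable"));
            System.out.println("bootstrap-only sees " + self + ": "
                    + canSee(bootstrapOnly, self));
            System.out.println("defining loader sees " + self + ": "
                    + canSee(appLoader, self));
        }
    }
}
```

In this analogy, the demo class plays the role of a class that only AgentClassLoader knows about, and the bootstrap-only loader plays the role of the loader that defined the generated VirtualFieldImpl class.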

jeanbisutti commented 1 year ago

@TDtianzhenjiu Great!

It's possible to detect a native execution in this way:

    private static boolean isNativeRuntimeExecution() {
        String imageCode = System.getProperty("org.graalvm.nativeimage.imagecode");
        return imageCode != null;
    }
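
As a possible refinement of the check above, the same system property can also distinguish image build time from image run time. This is a sketch under assumptions: the class `NativeImageMode` and its `Mode` enum are hypothetical names, and the values "buildtime" and "runtime" are the ones GraalVM's `ImageInfo` class documents for this property.

```java
public class NativeImageMode {

    static final String IMAGE_CODE_KEY = "org.graalvm.nativeimage.imagecode";

    enum Mode { JVM, NATIVE_BUILD_TIME, NATIVE_RUNTIME, NATIVE_UNKNOWN }

    /** Maps the raw property value to a mode; null means a regular JVM. */
    static Mode classify(String imageCode) {
        if (imageCode == null) {
            return Mode.JVM;
        }
        switch (imageCode) {
            case "buildtime":
                return Mode.NATIVE_BUILD_TIME;
            case "runtime":
                return Mode.NATIVE_RUNTIME;
            default:
                // Property is set, but to a value this sketch does not know.
                return Mode.NATIVE_UNKNOWN;
        }
    }

    /** Reads the property from the running process. */
    static Mode current() {
        return classify(System.getProperty(IMAGE_CODE_KEY));
    }

    public static void main(String[] args) {
        System.out.println("Execution mode: " + current());
    }
}
```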

@TDtianzhenjiu could you join the Java SIG on next Thursday at 09:00 PT (see) to discuss a potential PR with the @open-telemetry/java-instrumentation-approvers?

ziyilin commented 1 year ago

Hi @TDtianzhenjiu, I'm currently working on supporting agents in GraalVM native image. It can now trace the transformed classes and build them into the native image. I have tested it on simple instrumentation cases and am looking for an instrumentation framework demo to test with. Could you share the complete program that reproduces this issue? I'm very new to OT and can't figure out how to reproduce your test. Thank you very much.

TDtianzhenjiu commented 1 year ago

Hi @ziyilin, sorry for the late reply!

The version of the Java agent is opentelemetry-javaagent-1.29.0-SNAPSHOT-base.jar

and I used the Maven plugin: <groupId>org.graalvm.buildtools</groupId>

This simple mvn command reproduces the issue: ./mvnw -Pnative clean native:compile

Can you compile the agent into a native image successfully now?

I think that is good news. Which version of the Java agent and which version of GraalVM are you using?

By the way, is there a Gmail group or Slack workspace where we can talk conveniently?

TDtianzhenjiu commented 1 year ago

Sorry for the late reply.

I checked, and the meeting time does not suit me since I am in Asia. Is there a Gmail group or Slack workspace where we can discuss this?

jeanbisutti commented 1 year ago

@TDtianzhenjiu There is an otel-java channel on the CNCF Slack. You can join the CNCF Slack by following these instructions.

The discussion could also continue in this GitHub issue.

@open-telemetry/java-instrumentation-approvers what do you think about adding an experimental feature to try to make the OpenTelemetry Java agent work with GraalVM native images?

See https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/8919#issuecomment-1633864257

jeanbisutti commented 1 year ago

@TDtianzhenjiu We have just discussed this issue during the Java SIG. Could you please create a PR with your fix? The discussion could then continue on that PR.

TDtianzhenjiu commented 1 year ago

Hi @jeanbisutti

Although I resolved the NoClassDefFoundError, I encountered another issue when compiling to a native image.

The exception is: com.oracle.svm.core.util.VMError$HostedError: guarantee failed.

After debugging, I found that this class had been registered twice: net.bytebuddy.description.type.TypeDescription$Generic

I think I need time to investigate this issue in depth and resolve it 🙏

jeanbisutti commented 1 year ago

Hi @TDtianzhenjiu!

You may encounter a third problem after solving the second. By creating a first PR, it would be possible to see what the code looks like to fix the NoClassDefFoundError, and it would help to have a small PR.

trask commented 1 year ago

(re-opening since there is active ongoing discussion)

ziyilin commented 1 year ago

Thanks. My work can now automatically record two kinds of data at runtime for an application instrumented by the OT agent (1.29.0):

  1. All classes transformed by the OT agent.
  2. All classes dynamically generated by the OT agent.

From the resulting data, I can see that the biggest obstacle to compiling the OT agent natively together with the application is that OT transforms some JDK classes, such as java.lang.Class. GraalVM also modifies such classes to adapt them for native image, so the modifications made by GraalVM and by the OT agent must be compatible. I will investigate this issue and figure out how to resolve it.
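
One hypothetical way to collect the first kind of data (classes transformed by an agent) is to wrap the agent's transformer and note every class for which it returns modified bytes. The sketch below is only an illustration with JDK types, not the implementation described in this comment: `RecordingTransformer`, `fakeAgentTransformer`, and `demo` are invented names, and the fake delegate merely pretends to rewrite java.lang.Class.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

/** Hypothetical wrapper that records which classes a delegate transformer changes. */
public class RecordingTransformer implements ClassFileTransformer {

    private final ClassFileTransformer delegate;
    private final Set<String> transformed = new ConcurrentSkipListSet<>();

    public RecordingTransformer(ClassFileTransformer delegate) {
        this.delegate = delegate;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer)
            throws IllegalClassFormatException {
        byte[] result = delegate.transform(
                loader, className, classBeingRedefined, protectionDomain, classfileBuffer);
        // A null result means the delegate left the class untouched.
        if (result != null && className != null) {
            transformed.add(className.replace('/', '.'));
        }
        return result;
    }

    public Set<String> transformedClasses() {
        return transformed;
    }

    /** Stand-in for a real agent transformer: pretends to rewrite java.lang.Class only. */
    static ClassFileTransformer fakeAgentTransformer() {
        return new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className, Class<?> redefined,
                                    ProtectionDomain domain, byte[] bytes) {
                return "java/lang/Class".equals(className) ? bytes.clone() : null;
            }
        };
    }

    /** Feeds two class names through the recorder and returns what was recorded. */
    static Set<String> demo() {
        RecordingTransformer recorder = new RecordingTransformer(fakeAgentTransformer());
        try {
            recorder.transform(null, "java/lang/Class", null, null, new byte[0]);
            recorder.transform(null, "com/example/App", null, null, new byte[0]);
        } catch (IllegalClassFormatException e) {
            throw new IllegalStateException(e);
        }
        return recorder.transformedClasses();
    }

    public static void main(String[] args) {
        System.out.println("Transformed classes: " + demo());
    }
}
```

A recorded name that starts with `java.` would flag a JDK class whose agent-side changes may need to be reconciled with GraalVM's own modifications.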

You can reach me through cengfeng.lzy@alibaba-inc.com. This is also my Slack account.

jeanbisutti commented 1 year ago

@TDtianzhenjiu and @ziyilin, following the SIG meeting discussions, PRs are welcome even if some things remain to be fixed.

deki commented 11 months ago

Hi folks, I just tried using 1.32.0 and I'm still hitting:

128.3 Caused by: org.graalvm.compiler.java.BytecodeParser$BytecodeParserError: com.oracle.graal.pointsto.constraints.UnsupportedFeatureException: Discovered a type for which getDeclaringClass0 cannot be called: io.opentelemetry.javaagent.bootstrap.field.VirtualFieldImpl$java$util$concurrent$ForkJoinTask$io$opentelemetry$javaagent$bootstrap$executors$PropagatedContext. This error is reported at image build time because class io.opentelemetry.javaagent.bootstrap.field.VirtualFieldImpl$java$util$concurrent$ForkJoinTask$io$opentelemetry$javaagent$bootstrap$executors$PropagatedContext is registered for linking at image build time by system default
128.3   at parsing java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:754)
128.3   at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.throwParserError(BytecodeParser.java:2536)
128.3   at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.phases.SharedGraphBuilderPhase$SharedBytecodeParser.throwParserError(SharedGraphBuilderPhase.java:169)
128.3   at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.iterateBytecodesForBlock(BytecodeParser.java:3414)
128.3   at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.phases.SharedGraphBuilderPhase$SharedBytecodeParser.iterateBytecodesForBlock(SharedGraphBuilderPhase.java:712)
128.3   at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.handleBytecodeBlock(BytecodeParser.java:3366)
128.3   at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.processBlock(BytecodeParser.java:3208)
128.3   at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.build(BytecodeParser.java:1134)
128.3   at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.phases.SharedGraphBuilderPhase$SharedBytecodeParser.build(SharedGraphBuilderPhase.java:152)
128.4   at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.buildRootMethod(BytecodeParser.java:1026)
128.4   at jdk.internal.vm.compiler/org.graalvm.compiler.java.GraphBuilderPhase$Instance.run(GraphBuilderPhase.java:97)
128.4   at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.phases.SharedGraphBuilderPhase.run(SharedGraphBuilderPhase.java:114)
128.4   at jdk.internal.vm.compiler/org.graalvm.compiler.phases.Phase.run(Phase.java:49)
128.4   at jdk.internal.vm.compiler/org.graalvm.compiler.phases.BasePhase.apply(BasePhase.java:434)
128.4   at jdk.internal.vm.compiler/org.graalvm.compiler.phases.Phase.apply(Phase.java:42)
128.4   at jdk.internal.vm.compiler/org.graalvm.compiler.phases.Phase.apply(Phase.java:38)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.flow.AnalysisParsedGraph.parseBytecode(AnalysisParsedGraph.java:146)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.meta.AnalysisMethod.parseGraph(AnalysisMethod.java:819)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.meta.AnalysisMethod.ensureGraphParsedHelper(AnalysisMethod.java:784)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.meta.AnalysisMethod.ensureGraphParsed(AnalysisMethod.java:767)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.flow.MethodTypeFlowBuilder.parse(MethodTypeFlowBuilder.java:184)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.flow.MethodTypeFlowBuilder.apply(MethodTypeFlowBuilder.java:583)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.flow.MethodTypeFlow.createFlowsGraph(MethodTypeFlow.java:165)
128.4   ... 13 more
128.4 Caused by: com.oracle.graal.pointsto.constraints.UnsupportedFeatureException: Discovered a type for which getDeclaringClass0 cannot be called: io.opentelemetry.javaagent.bootstrap.field.VirtualFieldImpl$java$util$concurrent$ForkJoinTask$io$opentelemetry$javaagent$bootstrap$executors$PropagatedContext. This error is reported at image build time because class io.opentelemetry.javaagent.bootstrap.field.VirtualFieldImpl$java$util$concurrent$ForkJoinTask$io$opentelemetry$javaagent$bootstrap$executors$PropagatedContext is registered for linking at image build time by system default
128.4   at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.SVMHost.handleLinkageError(SVMHost.java:463)
128.4   at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.SVMHost.getDeclaringClass(SVMHost.java:449)
128.4   at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.SVMHost.createHub(SVMHost.java:423)
128.4   at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.SVMHost.registerType(SVMHost.java:268)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.meta.AnalysisUniverse.createType(AnalysisUniverse.java:310)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.meta.AnalysisUniverse.lookupAllowUnresolved(AnalysisUniverse.java:220)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.meta.AnalysisUniverse.lookup(AnalysisUniverse.java:197)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.meta.AnalysisMethod.<init>(AnalysisMethod.java:157)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.meta.PointsToAnalysisMethod.<init>(PointsToAnalysisMethod.java:70)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.meta.PointsToAnalysisFactory.createMethod(PointsToAnalysisFactory.java:35)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.meta.AnalysisUniverse.createMethod(AnalysisUniverse.java:453)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.meta.AnalysisUniverse.lookupAllowUnresolved(AnalysisUniverse.java:441)
128.4   at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.infrastructure.WrappedConstantPool.lookupMethod(WrappedConstantPool.java:125)
128.4   at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.lookupMethodInPool(BytecodeParser.java:4249)
128.4   at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.phases.SharedGraphBuilderPhase$SharedBytecodeParser.lookupMethodInPool(SharedGraphBuilderPhase.java:197)
128.4   at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.lookupMethod(BytecodeParser.java:4236)
128.4   at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.genInvokeStatic(BytecodeParser.java:1652)
128.4   at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.processBytecode(BytecodeParser.java:5331)
128.4   at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.iterateBytecodesForBlock(BytecodeParser.java:3406)
128.4   ... 32 more
128.4 Caused by: java.lang.NoClassDefFoundError: io/opentelemetry/javaagent/tooling/field/VirtualFieldImplementationsGenerator
128.4   at java.base/java.lang.Class.getDeclaringClass0(Native Method)
128.4   at java.base/jdk.internal.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
128.4   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
128.4   at java.base/java.lang.reflect.Method.invoke(Method.java:568)
128.4   at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.SVMHost.getDeclaringClass(SVMHost.java:432)
128.4   ... 49 more

Do you have a branch or a build with a working version somewhere? Happy to try it out...

trask commented 11 months ago

I'm not aware of anything for Java agent instrumentation.

This is our current recommendation for instrumenting native images with OpenTelemetry: https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/8919#issuecomment-1634270947

ziyilin commented 10 months ago

I have submitted a PR to GraalVM to support agent instrumentation. It now works with a simple OT demo. The OT agent needs some adaptation work, which currently falls into two kinds:

  1. Rewrite the agent's premain method to make it native-image friendly. The premain method is executed in the native image as well, but because bytecode transformation is no longer possible at native image runtime, all code that supports transformation should be disabled or removed at native image runtime.
  2. Provide a native image version of the transformations for JDK classes. The OT agent transforms some JDK classes, but such transformations cannot be applied to the native image automatically.

I have made a preliminary OT adaptation to make the demo work. For now it is provided to the demo project as a binary jar. I plan to contribute it to the OT community in the future.
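
The first kind of work might look roughly like the following sketch of a native-image-friendly premain. It is an illustration under assumptions, not the adaptation mentioned above: `NativeFriendlyAgent`, `canTransformBytecode`, `startTelemetry`, and `installTransformers` are hypothetical names, and the "runtime" value of the `org.graalvm.nativeimage.imagecode` property is assumed to mark execution inside a native image.

```java
import java.lang.instrument.Instrumentation;

public class NativeFriendlyAgent {

    static final String IMAGE_CODE_KEY = "org.graalvm.nativeimage.imagecode";

    /** Bytecode transformation is assumed impossible only at native image runtime. */
    static boolean canTransformBytecode(String imageCode) {
        return !"runtime".equals(imageCode);
    }

    public static void premain(String agentArgs, Instrumentation inst) {
        // Setup that does not depend on bytecode transformation runs everywhere.
        startTelemetry();

        // Skip everything related to class transformation inside a native image.
        if (inst != null && canTransformBytecode(System.getProperty(IMAGE_CODE_KEY))) {
            installTransformers(inst);
        }
    }

    /** Hypothetical placeholder for SDK and exporter setup. */
    static void startTelemetry() {
        System.out.println("telemetry started");
    }

    /** Hypothetical placeholder; a real agent would call inst.addTransformer(...) here. */
    static void installTransformers(Instrumentation inst) {
        System.out.println("transformers installed");
    }

    public static void main(String[] args) {
        // Without a real Instrumentation instance, only the telemetry setup runs.
        premain("", null);
    }
}
```

Keeping the decision in a small pure method such as `canTransformBytecode` makes the guard easy to test without building a native image.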