eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

System property jdk.tracePinnedThreads causes ClassCastException in Semeru 20 #17860

Closed gjdeval closed 1 year ago

gjdeval commented 1 year ago

Java -version output

openjdk version "20.0.1" 2023-04-18
IBM Semeru Runtime Open Edition 20.0.1.0 (build 20.0.1+9)
Eclipse OpenJ9 VM 20.0.1.0 (build openj9-0.39.0, JRE 20 Linux amd64-64-Bit Compressed References 20230418_50 (JIT enabled, AOT enabled)
OpenJ9   - 088b83604
OMR      - e4f52d2e4
JCL      - 6cb177ca6ca based on jdk-20.0.1+9)

Summary of problem

Using the system property jdk.tracePinnedThreads to get call stacks when a virtual thread blocks while pinned causes a ClassCastException. This occurs when either of these options is present on the command line: -Djdk.tracePinnedThreads=full or -Djdk.tracePinnedThreads=short

Here's an example of the exception output:

java.lang.ClassCastException: java.lang.StackWalker$StackFrameImpl incompatible with java.lang.LiveStackFrame
    at java.base/java.lang.PinnedThreadPrinter.lambda$printStackTrace$1(PinnedThreadPrinter.java:96)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
    at java.base/java.util.stream.Stream$2.forEachRemaining(Stream.java:1552)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:522)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:512)
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:239)
    at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
    at java.base/java.lang.PinnedThreadPrinter.lambda$printStackTrace$3(PinnedThreadPrinter.java:98)
    at java.base/java.lang.StackWalker.walkImpl(StackWalker.java:212)
    at java.base/java.lang.StackWalker.walkWrapperImpl(Native Method)
    at java.base/java.lang.StackWalker.walk(StackWalker.java:239)
    at java.base/java.lang.PinnedThreadPrinter.printStackTrace(PinnedThreadPrinter.java:95)
    at java.base/java.lang.VirtualThread$VThreadContinuation.onPinned(VirtualThread.java:189)
    at java.base/jdk.internal.vm.Continuation.yield0(Continuation.java:241)
    at java.base/jdk.internal.vm.Continuation.yield(Continuation.java:225)
    at java.base/java.lang.VirtualThread.yieldContinuation(VirtualThread.java:434)
    at java.base/java.lang.VirtualThread.park(VirtualThread.java:572)
    at java.base/java.lang.Access.parkVirtualThread(Access.java:511)
    at java.base/jdk.internal.misc.VirtualThreads.park(VirtualThreads.java:54)
    at java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:369)
    at java.base/sun.nio.ch.Poller.pollIndirect(Poller.java:139)
    at java.base/sun.nio.ch.Poller.poll(Poller.java:102)
    at java.base/sun.nio.ch.Poller.poll(Poller.java:89)
    at java.base/sun.nio.ch.NioSocketImpl.park(NioSocketImpl.java:175)
    at java.base/sun.nio.ch.NioSocketImpl.park(NioSocketImpl.java:196)
    at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:590)
    at java.base/java.net.Socket.connect(Socket.java:666)
    at java.base/java.net.Socket.connect(Socket.java:600)
    at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:183)
    at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:532)
    at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:637)
    at java.base/sun.net.www.http.HttpClient.<init>(HttpClient.java:280)
    at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:385)
    at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:407)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1308)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1241)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1127)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:1056)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1657)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1581)
    at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:529)
    ... 50 more
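
For anyone who wants to try this without the original application, here is a minimal sketch of a reproducer. It is not the code that produced the trace above. The class name PinnedThreadRepro and its method are made up for illustration, and it assumes a JDK where virtual threads are available (a preview feature in JDK 20, so --enable-preview is needed there). A virtual thread that sleeps inside a synchronized block is a common way to get a thread to block while pinned on JDK 20/21; whether it hits the same code path on a given build is an assumption.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical reproducer; names are illustrative and not taken from the report.
class PinnedThreadRepro {
    private static final Object LOCK = new Object();

    // Starts one virtual thread that blocks (sleeps) while holding a monitor.
    // Returns true if the task ran to completion on a virtual thread.
    static boolean runPinnedTask() {
        AtomicBoolean ranOnVirtualThread = new AtomicBoolean(false);
        Thread vt = Thread.ofVirtual().start(() -> {
            synchronized (LOCK) {
                try {
                    // Blocking inside synchronized is what pins the
                    // virtual thread to its carrier on JDK 20/21.
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                ranOnVirtualThread.set(Thread.currentThread().isVirtual());
            }
        });
        try {
            vt.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return ranOnVirtualThread.get();
    }

    public static void main(String[] args) {
        System.out.println("completed on virtual thread: " + runPinnedTask());
    }
}
```

Run it with one of the options from the summary, for example java -Djdk.tracePinnedThreads=full PinnedThreadRepro (plus --enable-preview on JDK 20), and compare the output with and without the property.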

Diagnostic files

I have logs with lots of examples of this, but they all look like the sample above, so probably no added value.

OutOfMemoryError: Java Heap Space

Does not cause OOM.

JasonFengJ9 commented 1 year ago

This looks like

hotspot specific implementation

FYI @fengxue-IS

gjdeval commented 1 year ago

Agreed, does look like the same underlying problem.

If this is a hotspot-specific implementation, I guess that means we will not try to make it work in OpenJ9?

It would be helpful to have a similar function in OpenJ9. From JEP-444:

New diagnostics assist in migrating code to virtual threads and in assessing whether you should replace a particular use of synchronized with a java.util.concurrent lock: ....

  • The system property jdk.tracePinnedThreads triggers a stack trace when a thread blocks while pinned. Running with -Djdk.tracePinnedThreads=full prints a complete stack trace when a thread blocks while pinned, highlighting native frames and frames holding monitors. Running with -Djdk.tracePinnedThreads=short limits the output to just the problematic frames.

fengxue-IS commented 1 year ago

Because of differences between the JVM implementations, OpenJ9 has its own internal representation of stack frames, which isn't compatible with OpenJDK's LiveStackFrame design.

It would be helpful to have a similar function in OpenJ9.

If we would like to support this system property, a simple approach is to create an OpenJ9-specific PinnedThreadPrinter class that performs the stack trace operations.
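
To make the idea concrete, here is a rough sketch of a printer that relies only on the public StackWalker.StackFrame interface and never casts to LiveStackFrame, so it avoids the cast that fails in the trace above. This is not the class that was eventually added to OpenJ9; the name SimplePinnedPrinter and the " <== native" marker are assumptions for illustration. The public interface does not expose monitor information, so this sketch can flag native frames but not frames holding monitors.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch; not the OpenJ9 implementation.
class SimplePinnedPrinter {

    // Walks the current thread's stack and returns one formatted line per frame.
    // Only the public StackWalker.StackFrame interface is used, so no cast
    // to a VM-specific frame type is required.
    static List<String> describeFrames() {
        return StackWalker.getInstance()
                .walk(frames -> frames
                        .map(SimplePinnedPrinter::format)
                        .collect(Collectors.toList()));
    }

    // Formats a frame like a stack trace element and marks native frames.
    static String format(StackWalker.StackFrame frame) {
        String line = frame.toStackTraceElement().toString();
        return frame.isNativeMethod() ? line + " <== native" : line;
    }

    public static void main(String[] args) {
        describeFrames().forEach(line -> System.out.println("    " + line));
    }
}
```

The first entry returned is the frame of describeFrames itself, because the walk starts at the method that calls walk.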

FYI @tajila

babsingh commented 1 year ago

https://github.com/eclipse-openj9/openj9/pull/17934 was reverted due to the failures seen in https://github.com/eclipse-openj9/openj9/issues/17989.

fengxue-IS commented 1 year ago

Closing this issue, as the revised PR https://github.com/eclipse-openj9/openj9/pull/18000 has been merged.