eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.28k stars 721 forks source link

JDK19 java/lang/Thread/virtual/stress/PinALot.java#id0 IllegalMonitorStateException - RuntimeException: count = 653974 #16258

Closed JasonFengJ9 closed 1 year ago

JasonFengJ9 commented 2 years ago

Failure link

From an internal build(rhel8le-rt1-4):

openjdk version "19.0.1-beta" 2022-10-18
IBM Semeru Runtime Open Edition 19.0.1+10-202211030535 (build 19.0.1-beta+10-202211030535)
Eclipse OpenJ9 VM 19.0.1+10-202211030535 (build master-a31c79403, JRE 19 Linux ppc64le-64-Bit Compressed References 20221103_70 (JIT enabled, AOT enabled)
OpenJ9   - a31c79403
OMR      - fc60df565
JCL      - 165f4c2690 based on jdk-19.0.1+10)

Rerun in Grinder - Change TARGET to run only the failed test targets.

Optional info

Failure output (captured from console output)

[2022-11-03T06:27:44.683Z] variation: -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage Mode150
[2022-11-03T06:27:44.683Z] JVM_OPTIONS:  -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage -XX:+UseCompressedOops 

[2022-11-03T06:58:37.258Z] TEST: java/lang/Thread/virtual/stress/PinALot.java#id0

[2022-11-03T06:58:37.258Z] STDERR:
[2022-11-03T06:58:37.258Z] Exception in thread "" java.lang.IllegalMonitorStateException
[2022-11-03T06:58:37.258Z]  at java.base/java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:175)
[2022-11-03T06:58:37.258Z]  at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1007)
[2022-11-03T06:58:37.258Z]  at java.base/java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:494)
[2022-11-03T06:58:37.258Z]  at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.remove(ScheduledThreadPoolExecutor.java:1063)
[2022-11-03T06:58:37.258Z]  at java.base/java.util.concurrent.ThreadPoolExecutor.remove(ThreadPoolExecutor.java:1777)
[2022-11-03T06:58:37.258Z]  at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.cancel(ScheduledThreadPoolExecutor.java:293)
[2022-11-03T06:58:37.258Z]  at java.base/java.lang.VirtualThread.cancel(VirtualThread.java:611)
[2022-11-03T06:58:37.258Z]  at java.base/java.lang.VirtualThread.parkNanos(VirtualThread.java:541)
[2022-11-03T06:58:37.258Z]  at java.base/java.lang.Access.parkVirtualThread(Access.java:502)
[2022-11-03T06:58:37.258Z]  at java.base/jdk.internal.misc.VirtualThreads.park(VirtualThreads.java:67)
[2022-11-03T06:58:37.258Z]  at java.base/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:408)
[2022-11-03T06:58:37.258Z]  at PinALot.lambda$main$0(PinALot.java:60)
[2022-11-03T06:58:37.258Z]  at java.base/java.lang.VirtualThread.run(VirtualThread.java:292)
[2022-11-03T06:58:37.258Z]  at java.base/java.lang.VirtualThread$VThreadContinuation.lambda$new$0(VirtualThread.java:179)
[2022-11-03T06:58:37.258Z]  at java.base/jdk.internal.vm.Continuation.execute(Continuation.java:171)
[2022-11-03T06:58:37.258Z] java.lang.RuntimeException: count = 653974
[2022-11-03T06:58:37.258Z]  at PinALot.main(PinALot.java:73)
[2022-11-03T06:58:37.258Z]  at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
[2022-11-03T06:58:37.258Z]  at java.base/java.lang.reflect.Method.invoke(Method.java:578)
[2022-11-03T06:58:37.258Z]  at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
[2022-11-03T06:58:37.258Z]  at java.base/java.lang.Thread.run(Thread.java:1573)
[2022-11-03T06:58:37.258Z] 
[2022-11-03T06:58:37.258Z] JavaTest Message: Test threw exception: java.lang.RuntimeException: count = 653974

[2022-11-03T06:58:37.259Z] TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: count = 653974
[2022-11-03T06:58:37.259Z] --------------------------------------------------
[2022-11-03T07:03:24.593Z] Test results: passed: 845; failed: 1
[2022-11-03T07:03:57.879Z] Report written to /home/jenkins/workspace/Test_openjdk19_j9_sanity.openjdk_ppc64le_linux_testList_0/aqa-tests/TKG/output_16674568639805/jdk_lang_0/report/html/report.html
[2022-11-03T07:03:57.879Z] Results written to /home/jenkins/workspace/Test_openjdk19_j9_sanity.openjdk_ppc64le_linux_testList_0/aqa-tests/TKG/output_16674568639805/jdk_lang_0/work
[2022-11-03T07:03:57.879Z] Error: Some tests failed or other problems occurred.
[2022-11-03T07:03:57.879Z] 
[2022-11-03T07:03:57.879Z] jdk_lang_0_FAILED

50x internal grinder - 18/100 failed

mstoodle commented 1 year ago

@a7ehuo Any luck reproducing the problem with the updated options?

a7ehuo commented 1 year ago

Although we can't reproduce the issue with a log file on VirtualThread.parkNanos(J)V, we found a partial JIT log file (quit early because of shutdown) in a passed run that has over 200 inlined methods, which is similar to the scenario in a failed run. The log shows after partialRedundancyElimination, Thread.currentThread is stored to a temp #2515 and later on is written to AbstractOwnableSynchronizer.exclusiveOwnerThread [1]. It was a virtual thread when it was stored to #2515. It would be an issue if it was changed to a carrier thread in between. This finding in the log confirms the scenario that @tajila suspected that Thread.currentThread was cached. @fengxue-IS in the comment and @0xdaryl pointed out currentThread sym ref set as immutableField. As the log shows, it could be an issue.

Two grinders are currently running. At the moment no failure is observed.

  1. Disable recognizing Thread.currentThread (Grinder 31825): Iteration 30x
  2. Remove setImmutableField on currentThread sym ref (Grinder 31840): Iteration 30x

Meanwhile, I'm also running tests to verify a proper fix.

[1]

n8505n    BBStart <block_609> (freq 178)                                                      [    0x775746bb5230] bci=[-1,6,523] rc=0 vc=1528 vn=- li=-1 udi=- nc=0
n20491n   astore  <temp slot 101>[#2515  Auto] [flags 0x7 0x0 ]                               [    0x77574481f4b0] bci=[-1,6,523] rc=0 vc=1528 vn=- li=-1 udi=1 nc=1
n34n        aload  CurrentThread[#318  MethodMeta +168] [flags 0x207 0x400 ]                  [    0x775746004430] bci=[-1,6,523] rc=2 vc=1528 vn=- li=- udi=- nc=0
n43n      ifacmpne --> block_3 BBStart at n41n ()                                             [    0x775746004700] bci=[-1,10,523] rc=0 vc=1528 vn=- li=-1 udi=- nc=2 flg=0x20
n34n        ==>aload
n40n        aload  this<'this' parm Ljava/lang/VirtualThread;>[#396  Parm] [flags 0x40000107 0x0 ] (X!=0 )  [    0x775746004610] bci=[-1,9,523] rc=1 vc=1528 vn=- li=- udi=694 nc=0 flg=0x4
n30n      BBEnd </block_609> ===== 
...
...
....
n14436n   BBStart <block_1184> (freq 108)                                                     [    0x775745ec9010] bci=[181,19,234] rc=0 vc=1528 vn=- li=-1 udi=- nc=0
n14432n   awrtbari  java/util/concurrent/locks/AbstractOwnableSynchronizer.exclusiveOwnerThread Ljava/lang/Thread;[#1910  Shadow +24] [flags 0x80607 0x0 ] (X!=0 )  [    0x775745ec8ed0] bci=[184,2,74] rc=0 vc=1528 vn=- li=-1 udi=- nc=3 flg=0x24
n14430n     aload  <temp slot 3>[#2277  Auto] [flags 0x20000007 0x0 ] (X!=0 X>=0 )            [    0x775745ec8e30] bci=[184,0,74] rc=2 vc=1528 vn=- li=- udi=824 nc=0 flg=0x104
n14431n     aload  <temp slot 101>[#2515  Auto] [flags 0x7 0x0 ] (X!=0 )                      [    0x775745ec8e80] bci=[184,1,74] rc=1 vc=1528 vn=- li=- udi=825 nc=0 flg=0x4
n14430n     ==>aload
n17386n   BBEnd </block_1184> =====   
a7ehuo commented 1 year ago

Grinder run results:

  1. Disable recognizing Thread.currentThread (Grinder 31825): Iteration 30x 29 tests, 29 ok, 0 not ok: One run exit unexpectedly. It does not look related to PinALot.java test

    java/lang/invoke/defineHiddenClass/BasicTest.java false BasicTest
    TEST RESULT: Failed. Unexpected exit from test [exit code: 143]
  2. Remove setImmutableField on currentThread sym ref (Grinder 31840): Iteration 30x 29 tests, 28 ok, 1 not ok: One run exit unexpectedly. One run with failed tests. They don't look related to PinALot.java test

(1)
Test finished: java/lang/invoke/VarHandles/VarHandleTestMethodTypeDouble.java: Failed. Unexpected exit from test [exit code: 143]
...
Test finished: java/lang/invoke/VarHandles/VarHandleTestMethodTypeFloat.java: Error. Agent communication error: java.io.EOFException; check console log for any additional details
(2)
Failed test cases: 
TEST: java/lang/annotation/Missing/MissingArrayElement/MissingEnumArrayElementTest.java
TEST: java/lang/CompressExpandTest.java
TEST: java/lang/System/LoggerFinder/modules/JDKLoggerForJDKTest.java
TEST: java/lang/System/LoggerFinder/modules/LoggerInImageTest.java
TEST: java/lang/System/LoggerFinder/modules/NamedLoggerForImageTest.java
TEST: java/lang/System/LoggerFinder/modules/NamedLoggerForJDKTest.java
Test results: passed: 855; failed: 4; error: 2

java/lang/annotation/Missing/MissingArrayElement/MissingEnumArrayElementTest.java: getAnnotation() throws an ArrayStoreException when the annotation class not present
pshipton commented 1 year ago

I don't find any issues covering those failures. They appear to be new failures we haven't seen before, which makes it likely they are introduced by the change.

tajila commented 1 year ago

@JasonFengJ9 @fengxue-IS Have you seen the failures Annabelle posted above?

JasonFengJ9 commented 1 year ago

java/lang/annotation/Missing/MissingArrayElement/MissingEnumArrayElementTest.java: getAnnotation() throws an ArrayStoreException when the annotation class not present

This is new.

Test finished: java/lang/invoke/VarHandles/VarHandleTestMethodTypeDouble.java: Failed. Unexpected exit from test [exit code: 143] Test finished: java/lang/invoke/VarHandles/VarHandleTestMethodTypeFloat.java: Error. Agent communication error: java.io.EOFException; check console log for any additional details

No existing issue, the console log might have more info.

fengxue-IS commented 1 year ago

Have not seen these issues previously

0xdaryl commented 1 year ago

I don't find any issues covering those failures. They appear to be new failures we haven't seen before, which makes it likely they are introduced by the change.

For @a7ehuo's "change 2" above, this is not functionally equivalent to what she actually implemented. "Change 2" simply removed the "immutability" setting on the field which isn't completely correct. However, her PR correctly removes it and marks it "volatile" instead.

It is probably worth running a 30x grinder with the actual PR to verify it is clean.

a7ehuo commented 1 year ago

It is probably worth running a 30x grinder with the actual PR to verify it is clean.

Sounds good! I'll start the grinder runs

a7ehuo commented 1 year ago

An update on the grinder run results:

  1. PR change: Grinder ID 31905

    • Set up to run 30x, completed 8 tests, 8 ok, 0 not ok
    • Grinder was canceled “Cancelling nested steps due to timeout”.
  2. master branch: Grinder ID 31906

    • Set up to run 30x, completed 30 tests, 16 ok, 14 not ok due to the failure in PinALot.java
  3. master branch excluding test PinALot.java: Grinder ID 31909

    • Set up to run 30x, ongoing, 8 tests run so far, 7 ok,
    • **1 failed with “make[3]: *** [jdk_lang_1] Error 143"**, which looks similar to one of the failures observed in previous Grinder runs with the experiments
a7ehuo commented 1 year ago

Another grinder on jdk_lang_1 with the PR #17009 change: Grinder ID 31915. 30x, all passed

0xdaryl commented 1 year ago

Thanks @a7ehuo.

If I have this right, the summary is:

@pshipton : I think we're ready to merge this.

0xdaryl commented 1 year ago

Re-opening until fixed in 0.37.

pshipton commented 1 year ago

Closing since https://github.com/eclipse-openj9/openj9/pull/17021 is merged for 0.37