eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.28k stars 721 forks source link

cmdLineTester_criu_nonPortableRestore_7_FAILED testTimeCompensation() hang #18471

Closed JasonFengJ9 closed 11 months ago

JasonFengJ9 commented 11 months ago

Failure link

From https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/67/tapResults/

07:29:57  openjdk version "21.0.1-internal" 2023-10-17
07:29:57  OpenJDK Runtime Environment (build 21.0.1-internal-adhoc.****.BuildJDK21s390xlinuxPersonal)
07:29:57  Eclipse OpenJ9 VM (build HEAD-81bae17c368, JRE 21 Linux s390x-64-Bit Compressed References 20231115_124 (JIT enabled, AOT enabled)
07:29:57  OpenJ9   - 81bae17c368
07:29:57  OMR      - 07a175a4d23
07:29:57  JCL      - 4b666c7419b based on jdk-21.0.1+12)

Rerun in Grinder - Change TARGET to run only the failed test targets.

Optional info

Failure output (captured from console output)

variation: -Xgcpolicy:gencon -Xgcthreads64 -XX:CheckpointGCThreads=1
JVM_OPTIONS:  -Xgcpolicy:gencon -Xgcthreads64 -XX:CheckpointGCThreads=1 

Testing: Create CRIU checkpoint image and restore once - testTimeCompensation
Test start time: 2023/11/16 08:43:26 Eastern Standard Time
Running command: bash /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/jdkbinary/j2sdk-image/bin/java " -Xgcpolicy:gencon -Xgcthreads64 -XX:CheckpointGCThreads=1  -Xtrace:print={j9criu,j9jcl.533} --add-exports java.base/openj9.internal.criu=ALL-UNNAMED" org.openj9.criu.TimeChangeTest testTimeCompensation 1 false false
Time spent starting: 1 milliseconds
***[TEST INFO 2023/11/16 08:48:26] ProcessKiller detected a timeout after 300000 milliseconds!***
INFO: Cannot find '/usr/bin/gdb' using 'gdb' from the path.
***[TEST INFO 2023/11/16 08:48:26] executing gdb -batch -x /tmp/debugger13618158602440569193.txt bash 29649***
java.io.IOException: Cannot run program "gdb": error=2, No such file or directory
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
    at java.base/java.lang.Runtime.exec(Runtime.java:681)
    at java.base/java.lang.Runtime.exec(Runtime.java:530)
    at Test$ProcessKiller.captureCoreForProcess(Test.java:697)
    at Test$ProcessKiller.captureCoreForProcess(Test.java:646)
    at Test$ProcessKiller.run(Test.java:596)
Caused by: java.io.IOException: error=2, No such file or directory
    at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:295)
    at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:225)
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
    ... 6 more
***[TEST INFO 2023/11/16 08:48:26] executing kill -ABRT 29649***
Time spent executing: 300056 milliseconds
Test result: FAILED
Output from test:
 [OUT] start running script
 [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
 [OUT] export LD_BIND_NOT=on
 [OUT] /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/jdkbinary/j2sdk-image/bin/java -XX:+EnableCRIUSupport  -Xgcpolicy:gencon -Xgcthreads64 -XX:CheckpointGCThreads=1  -Xtrace:print={j9criu,j9jcl.533} --add-exports java.base/openj9.internal.criu=ALL-UNNAMED -cp /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criu.jar org.openj9.criu.TimeChangeTest testTimeCompensation 1

---TEST RESULTS---
***[TEST INFO 2023/11/16 08:48:26] kill -ABRT signal sent***
Number of PASSED tests: 56 out of 57
***[TEST INFO 2023/11/16 08:48:26] ABRT completed***
Output from test:
Number of FAILED tests: 1 out of 57

---SUMMARY OF FAILED TESTS---
Create CRIU checkpoint image and restore once - testTimeCompensation
-----------------------------

-----------------------------------
cmdLineTester_criu_nonPortableRestore_7_FAILED

50x grinder

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/67/functional_test_output.tar.gz

1XMCURTHDINFO  Current thread
3XMTHREADINFO      "main" J9VMThread:0x00000000000A0500, omrthread_t:0x000003FF9C00A220, java/lang/Thread:0x0000000011540458, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x2, isDaemon:false)
3XMJAVALTHRCCL            jdk/internal/loader/ClassLoaders$AppClassLoader(0x000000001153DF48)
3XMTHREADINFO1            (native thread ID:0x73D6, native priority:0x5, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00041020)
3XMTHREADINFO2            (native stack address range from:0x000003FFA0736000, to:0x000003FFA0776000, size:0x40000)
3XMCPUTIME               CPU usage total: 0.192125256 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=20552 (0x5048)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at jdk/internal/misc/Unsafe.park(Native Method) <--- park indefinitely at single thread mode
4XESTACKTRACE                at java/util/concurrent/locks/LockSupport.park(LockSupport.java:221)
4XESTACKTRACE                at java/util/concurrent/locks/AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:754)
4XESTACKTRACE                at java/util/concurrent/locks/AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:990)
4XESTACKTRACE                at java/util/concurrent/locks/ReentrantLock$Sync.lock(ReentrantLock.java:153(Compiled Code))
4XESTACKTRACE                at java/util/concurrent/locks/ReentrantLock.lock(ReentrantLock.java:322(Compiled Code))
4XESTACKTRACE                at jdk/internal/misc/InternalLock.lock(InternalLock.java:74(Compiled Code))
4XESTACKTRACE                at java/io/PrintStream.writeln(PrintStream.java:824)
4XESTACKTRACE                at java/io/PrintStream.println(PrintStream.java:1168)
4XESTACKTRACE                at org/openj9/test/util/TimeUtilities.showThreadCurrentTime(TimeUtilities.java:36)
4XESTACKTRACE                at org/openj9/test/util/TimeUtilities.checkElapseTime(TimeUtilities.java:47)
4XESTACKTRACE                at org/openj9/criu/TimeChangeTest$1.run(TimeChangeTest.java:155)
4XESTACKTRACE                at org/eclipse/openj9/criu/CRIUSupport.lambda$registerCheckpointHookHelper$2(CRIUSupport.java:642)
4XESTACKTRACE                at org/eclipse/openj9/criu/CRIUSupport$$Lambda/0x0000000000000000.run(Bytecode PC:8)
4XESTACKTRACE                at org/eclipse/openj9/criu/J9InternalCheckpointHookAPI$J9InternalCheckpointHook.runHook(J9InternalCheckpointHookAPI.java:143)
4XESTACKTRACE                at org/eclipse/openj9/criu/J9InternalCheckpointHookAPI.runHooks(J9InternalCheckpointHookAPI.java:98)
4XESTACKTRACE                at org/eclipse/openj9/criu/J9InternalCheckpointHookAPI.runPreCheckpointHooksSingleThread(J9InternalCheckpointHookAPI.java:107)
4XESTACKTRACE                at org/eclipse/openj9/criu/CRIUSupport.checkpointJVMImpl(Native Method)
4XESTACKTRACE                at org/eclipse/openj9/criu/CRIUSupport.checkpointJVM(CRIUSupport.java:812)
4XESTACKTRACE                at org/openj9/criu/CRIUTestUtils.checkPointJVMNoSetup(CRIUTestUtils.java:103)
4XESTACKTRACE                at org/openj9/criu/TimeChangeTest.testTimeCompensation(TimeChangeTest.java:183)
4XESTACKTRACE                at org/openj9/criu/TimeChangeTest.main(TimeChangeTest.java:74)

3XMTHREADINFO      "testTimeCompensation() preCheckpoint timer delayed 2s" J9VMThread:0x00000000003DDC00, omrthread_t:0x000003FF54010430, java/lang/Thread:0x000000008C31CEF0, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x5B, isDaemon:false)
3XMJAVALTHRCCL            jdk/internal/loader/ClassLoaders$AppClassLoader(0x000000001153DF48)
3XMTHREADINFO1            (native thread ID:0x7431, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00200001)
3XMTHREADINFO2            (native stack address range from:0x000003FF74A40000, to:0x000003FF74A80000, size:0x40000)
3XMCPUTIME               CPU usage total: 0.000816913 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=4096 (0x1000)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at java/lang/StringCoding.implEncodeAsciiArray(StringCoding.java:82)
4XESTACKTRACE                at java/lang/Access.encodeASCII(Access.java:425)
4XESTACKTRACE                at sun/nio/cs/UTF_8$Encoder.encodeArrayLoop(UTF_8.java:456)
4XESTACKTRACE                at sun/nio/cs/UTF_8$Encoder.encodeLoop(UTF_8.java:564)
4XESTACKTRACE                at java/nio/charset/CharsetEncoder.encode(CharsetEncoder.java:586)
4XESTACKTRACE                at sun/nio/cs/StreamEncoder.implWrite(StreamEncoder.java:370)
4XESTACKTRACE                at sun/nio/cs/StreamEncoder.implWrite(StreamEncoder.java:357)
4XESTACKTRACE                at sun/nio/cs/StreamEncoder.lockedWrite(StreamEncoder.java:158)
4XESTACKTRACE                at sun/nio/cs/StreamEncoder.write(StreamEncoder.java:139)
4XESTACKTRACE                at java/io/OutputStreamWriter.write(OutputStreamWriter.java:219)
4XESTACKTRACE                at java/io/BufferedWriter.implFlushBuffer(BufferedWriter.java:178)
4XESTACKTRACE                at java/io/BufferedWriter.flushBuffer(BufferedWriter.java:163)
4XESTACKTRACE                at java/io/PrintStream.implWriteln(PrintStream.java:848)
4XESTACKTRACE                at java/io/PrintStream.writeln(PrintStream.java:826)
4XESTACKTRACE                at java/io/PrintStream.println(PrintStream.java:1168)
4XESTACKTRACE                at org/openj9/test/util/TimeUtilities.showThreadCurrentTime(TimeUtilities.java:36)
4XESTACKTRACE                at org/openj9/test/util/TimeUtilities.checkElapseTime(TimeUtilities.java:82)
4XESTACKTRACE                at org/openj9/test/util/TimeUtilities$TimeTestTask.run(TimeUtilities.java:143)
4XESTACKTRACE                at java/util/TimerThread.mainLoop(Timer.java:605)
4XESTACKTRACE                at java/util/TimerThread.run(Timer.java:543)

FYI @tajila

JasonFengJ9 commented 11 months ago

From aqa-tests/TKG/output_17001384796330/cmdLineTester_criu_nonPortableRestore_7/testOutput

testTimeCompensation() starts, current thread name: main, Thu Nov 16 08:43:26 EST 2023, System.currentTimeMillis(): 1700142206994, System.nanoTime(): 1700142206991684294
testTimeCompensation() wait 100ms checkElapseTime starts, current thread name: main, Thu Nov 16 08:43:27 EST 2023, System.currentTimeMillis() 1700142207132, System.nanoTime() 1700142207129868528
testTimeCompensation() wait 100ms: startMillisTime (1700142207030ms) startNanoTime (1700142207027803262ns) currentMillisTime (1700142207130ms) currentNanoTime (1700142207127997600ns) elapsedMillisTime (100ms) elapsedNanoTime (100194338ns)
testTimeCompensation() wait 100ms checkElapseTime ends, current thread name: main, Thu Nov 16 08:43:27 EST 2023, System.currentTimeMillis() 1700142207147, System.nanoTime() 1700142207144163230
testTimeCompensation() sleep 100ms checkElapseTime starts, current thread name: main, Thu Nov 16 08:43:27 EST 2023, System.currentTimeMillis() 1700142207247, System.nanoTime() 1700142207244664131
testTimeCompensation() sleep 100ms: startMillisTime (1700142207147ms) startNanoTime (1700142207144209381ns) currentMillisTime (1700142207247ms) currentNanoTime (1700142207244507439ns) elapsedMillisTime (100ms) elapsedNanoTime (100298058ns)
testTimeCompensation() sleep 100ms checkElapseTime ends, current thread name: main, Thu Nov 16 08:43:27 EST 2023, System.currentTimeMillis() 1700142207248, System.nanoTime() 1700142207245103608
testTimeCompensation() preCheckpoint timer delayed 100ms checkElapseTime starts, current thread name: testTimeCompensation() preCheckpoint timer delayed 100ms, Thu Nov 16 08:43:27 EST 2023, System.currentTimeMillis() 1700142207349, System.nanoTime() 1700142207346258673
testTimeCompensation() preCheckpoint timer delayed 100ms: startMillisTime (1700142207248ms) startNanoTime (1700142207245157758ns) currentMillisTime (1700142207349ms) currentNanoTime (1700142207346123350ns) elapsedMillisTime (101ms) elapsedNanoTime (100965592ns)
testTimeCompensation() preCheckpoint timer delayed 100ms checkElapseTime ends, current thread name: testTimeCompensation() preCheckpoint timer delayed 100ms, Thu Nov 16 08:43:27 EST 2023, System.currentTimeMillis() 1700142207349, System.nanoTime() 1700142207346843329
testTimeCompensation() preCheckpoint timer delayed 2s checkElapseTime starts, current thread name: testTimeCompensation() preCheckpoint timer delayed 2s, Thu Nov 16 08:43:29 EST 2023, System.currentTimeMillis() 1700142209251, System.nanoTime() 1700142209248992994
testTimeCompensation() preCheckpoint checkElapseTime starts, current thread name: main, Thu Nov 16 08:43:29 EST 2023, System.currentTimeMillis() 1700142209257, System.nanoTime() 1700142209254262568
testTimeCompensation() preCheckpoint: startMillisTime (1700142207249ms) startNanoTime (1700142207246545888ns) currentMillisTime (1700142209257ms) currentNanoTime (1700142209254238563ns) elapsedMillisTime (2008ms) elapsedNanoTime (2007692675ns)
testTimeCompensation() preCheckpoint checkElapseTime ends, current thread name: main, Thu Nov 16 08:43:29 EST 2023, System.currentTimeMillis() 1700142209257, System.nanoTime() 1700142209254927086
testTimeCompensation() preCheckpoint timer delayed 2s: startMillisTime (1700142207248ms) startNanoTime (1700142207245851571ns) currentMillisTime (1700142209250ms) currentNanoTime (1700142209247148962ns) elapsedMillisTime (2002ms) elapsedNanoTime (2001297391ns)
Performing CRIUSupport.checkpointJVM(), current thread name: main, Thu Nov 16 08:43:29 EST 2023, System.currentTimeMillis(): 1700142209258, System.nanoTime(): 1700142209255159946
13:43:29.282*0xa0500          j9criu.9        > Java_org_eclipse_openj9_criu_CRIUSupport_checkpointJVMImpl
13:43:29.282 0xa0500          j9criu.18       - Taking a checkpoint with active clinit
JVMDUMP034I User requested Java dump using '/home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/aqa-tests/TKG/output_17001384796330/cmdLineTester_criu_nonPortableRestore_7/javacore.20231116.084329.29653.0001.txt' through CRIUSingleThreadModeJVMCRIUException
JVMDUMP010I Java dump written to /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/aqa-tests/TKG/output_17001384796330/cmdLineTester_criu_nonPortableRestore_7/javacore.20231116.084329.29653.0001.txt
13:43:29.313 0xa0500          j9criu.10       < Java_org_eclipse_openj9_criu_CRIUSupport_checkpointJVMImpl
testTimeCompensation() preCheckpoint timer delayed 2s checkElapseTime ends, current thread name: testTimeCompensation() preCheckpoint timer delayed 2s, Thu Nov 16 08:43:29 EST 2023, System.currentTimeMillis() 1700142209261, System.nanoTime() 1700142209258677000
Exception in thread "main" org.eclipse.openj9.criu.JVMCheckpointException: Exception thrown when running user pre-checkpoint
        at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.lambda$registerCheckpointHookHelper$2(CRIUSupport.java:644)
        at openj9.criu/org.eclipse.openj9.criu.J9InternalCheckpointHookAPI$J9InternalCheckpointHook.runHook(J9InternalCheckpointHookAPI.java:143)
        at openj9.criu/org.eclipse.openj9.criu.J9InternalCheckpointHookAPI.runHooks(J9InternalCheckpointHookAPI.java:98)
        at openj9.criu/org.eclipse.openj9.criu.J9InternalCheckpointHookAPI.runPreCheckpointHooksSingleThread(J9InternalCheckpointHookAPI.java:107)
        at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.checkpointJVMImpl(Native Method)
        at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.checkpointJVM(CRIUSupport.java:812)
        at org.openj9.criu.CRIUTestUtils.checkPointJVMNoSetup(CRIUTestUtils.java:103)
        at org.openj9.criu.TimeChangeTest.testTimeCompensation(TimeChangeTest.java:183)
        at org.openj9.criu.TimeChangeTest.main(TimeChangeTest.java:74)
Caused by: org.eclipse.openj9.criu.JVMCheckpointException: Blocking operation is not allowed in CRIU single thread mode.
        at java.base/jdk.internal.misc.Unsafe.park(Native Method)
        at java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
        at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:754)
        at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:990)
        at java.base/java.util.concurrent.locks.ReentrantLock$Sync.lock(ReentrantLock.java:153)
        at java.base/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:322)
        at java.base/jdk.internal.misc.InternalLock.lock(InternalLock.java:74)
        at java.base/java.io.PrintStream.writeln(PrintStream.java:824)
        at java.base/java.io.PrintStream.println(PrintStream.java:1168)
        at org.openj9.test.util.TimeUtilities.showThreadCurrentTime(TimeUtilities.java:36)
        at org.openj9.test.util.TimeUtilities.checkElapseTime(TimeUtilities.java:47)
        at org.openj9.criu.TimeChangeTest$1.run(TimeChangeTest.java:155)
        at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.lambda$registerCheckpointHookHelper$2(CRIUSupport.java:642)
        ... 8 more
testTimeCompensation() preCheckpoint timer delayed 4s checkElapseTime starts, current thread name: testTimeCompensation() preCheckpoint timer delayed 4s, Thu Nov 16 08:43:31 EST 2023, System.currentTimeMillis() 1700142211249, System.nanoTime() 1700142211246892314
testTimeCompensation() preCheckpoint timer delayed 4s: startMillisTime (1700142207249ms) startNanoTime (1700142207246486491ns) currentMillisTime (1700142211249ms) currentNanoTime (1700142211246728638ns) elapsedMillisTime (4000ms) elapsedNanoTime (4000242147ns)
FAILED: testTimeCompensation() preCheckpoint timer delayed 4s elapsedMillisTime (4000ms) should NOT be less than minElapsedMillisTime (6000ms)
testTimeCompensation() preCheckpoint timer delayed 4s checkElapseTime ends, current thread name: testTimeCompensation() preCheckpoint timer delayed 4s, Thu Nov 16 08:43:31 EST 2023, System.currentTimeMillis() 1700142211257, System.nanoTime() 1700142211254692885

The parking indefinitely in single thread mode was caught indeed and org.eclipse.openj9.criu.JVMCheckpointException was thrown.