eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.27k stars 721 forks source link

cmdLineTester_criu_nonPortableRestore - MethodTypeDeadlockTest #15806

Closed JasonFengJ9 closed 9 months ago

JasonFengJ9 commented 2 years ago

Failure link

From an internal build(ub18x64rt1-4):

openjdk version "11.0.17-ea" 2022-10-18
IBM Semeru Runtime Open Edition 11.0.17+2 (build 11.0.17-ea+2)
Eclipse OpenJ9 VM 11.0.17+2 (build master-1719d2e01, JRE 11 Linux amd64-64-Bit Compressed References 20220826_32 (JIT enabled, AOT enabled)
OpenJ9   - 1719d2e01
OMR      - 938f0686f
JCL      - 5b5f2411fd based on jdk-11.0.17+2)

Rerun in Grinder - Change TARGET to run only the failed test targets.

Optional info

Failure output (captured from console output)

[2022-08-27T01:48:36.433Z] variation: -Xjit -XX:+CRIURestoreNonPortableMode
[2022-08-27T01:48:36.433Z] JVM_OPTIONS:  -Xjit -XX:+CRIURestoreNonPortableMode 

[2022-08-27T01:50:18.790Z] Testing: Create and Restore Criu Checkpoint Image once - MethodTypeDeadlockTest
[2022-08-27T01:50:18.790Z] Test start time: 2022/08/26 21:50:18 Eastern Standard Time
[2022-08-27T01:50:18.790Z] Running command: bash /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_criu_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_criu_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_criu_testList_1/openjdkbinary/j2sdk-image/bin/java " -Xjit -XX:+CRIURestoreNonPortableMode  -Xtrace:print=j9criu.11 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED" org.openj9.criu.DeadlockTest MethodTypeDeadlockTest 1
[2022-08-27T01:50:18.790Z] Time spent starting: 4 milliseconds
[2022-08-27T01:55:31.341Z] ***[TEST INFO 2022/08/26 21:55:18] ProcessKiller detected a timeout after 300000 milliseconds!***
[2022-08-27T01:55:31.341Z] ***[TEST INFO 2022/08/26 21:55:18] executing /usr/bin/gdb -batch -x /tmp/debugger16078710942077757359.txt bash 3301***
[2022-08-27T01:55:31.341Z] GDB OUT No shared libraries loaded at this time.
[2022-08-27T01:55:31.341Z] INFO: Sleep for 60000 millis before next capture.
[2022-08-27T01:56:21.287Z] ***[TEST INFO 2022/08/26 21:56:19] executing /usr/bin/gdb -batch -x /tmp/debugger16078710942077757359.txt bash 3301***
[2022-08-27T01:56:21.287Z] GDB OUT No shared libraries loaded at this time.
[2022-08-27T01:56:21.287Z] ***[TEST INFO 2022/08/26 21:56:19] executing kill -ABRT 3301***
[2022-08-27T01:56:21.287Z] ***[TEST INFO 2022/08/26 21:56:19] kill -ABRT signal sent***
[2022-08-27T01:56:21.287Z] ***[TEST INFO 2022/08/26 21:56:19] ABRT completed***
[2022-08-27T01:56:21.287Z] Output from test:
[2022-08-27T01:56:21.287Z]  [OUT] start running script
[2022-08-27T01:56:21.287Z]  [ERR] /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_criu_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh: line 35:  3305 Killed                  $2 -XX:+EnableCRIUSupport $3 -cp "$1/criu.jar" $4 $5 $6 > testOutput 2>&1
[2022-08-27T02:11:57.366Z] ***[TEST INFO 2022/08/26 22:11:48] stdout timed out***
[2022-08-27T02:11:57.366Z] ***[TEST INFO 2022/08/26 22:11:48] stderr timed out***
[2022-08-27T02:11:57.366Z] Time spent executing: 1290010 milliseconds
[2022-08-27T02:11:57.366Z] Test result: FAILED

[2022-08-27T02:11:57.366Z] ---TEST RESULTS---
[2022-08-27T02:11:57.366Z] Number of PASSED tests: 13 out of 14
[2022-08-27T02:11:57.366Z] Number of FAILED tests: 1 out of 14
[2022-08-27T02:11:57.366Z] 
[2022-08-27T02:11:57.366Z] ---SUMMARY OF FAILED TESTS---
[2022-08-27T02:11:57.366Z] Create and Restore Criu Checkpoint Image once - MethodTypeDeadlockTest
[2022-08-27T02:11:57.366Z] -----------------------------
[2022-08-27T02:11:57.366Z] 
[2022-08-27T02:11:57.366Z] 
[2022-08-27T02:11:57.366Z] cmdLineTester_criu_nonPortableRestore_0_FAILED

FYI @tajila

pshipton commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.functional_x86-64_linux_Release_testList_1/54 cmdLineTester_criu_nonPortableRestore_2

Testing: Create and Restore Criu Checkpoint Image once - MethodTypeDeadlockTest
Test start time: 2023/03/12 01:01:34 Eastern Standard Time
Running command: bash /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Release_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Release_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Release_testList_1/openjdkbinary/j2sdk-image/bin/java " -Xjit:count=0 -XX:+CRIURestoreNonPortableMode  -Xtrace:print=j9criu.11 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED" org.openj9.criu.DeadlockTest MethodTypeDeadlockTest 1
Time spent starting: 6 milliseconds
***[TEST INFO 2023/03/12 01:06:35] ProcessKiller detected a timeout after 300000 milliseconds!***
INFO: Cannot find '/usr/bin/gdb' using 'gdb' from the path.
***[TEST INFO 2023/03/12 01:06:37] executing gdb -batch -x /tmp/debugger1760337662044971390.txt bash 726182***
java.io.IOException: Cannot run program "gdb": error=2, No such file or directory
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1128)
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1071)
    at java.base/java.lang.Runtime.exec(Runtime.java:592)
    at java.base/java.lang.Runtime.exec(Runtime.java:451)
    at Test$ProcessKiller.captureCoreForProcess(Test.java:697)
    at Test$ProcessKiller.captureCoreForProcess(Test.java:646)
    at Test$ProcessKiller.run(Test.java:596)
Caused by: java.io.IOException: error=2, No such file or directory
    at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:271)
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1107)
    ... 6 more
***[TEST INFO 2023/03/12 01:06:37] executing kill -ABRT 726182***
Time spent executing: 302702 milliseconds
Test result: FAILED
***[TEST INFO 2023/03/12 01:06:37] kill -ABRT signal sent***
***[TEST INFO 2023/03/12 01:06:37] ABRT completed***
Output from test:
 [OUT] start running script
 [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
 [OUT] export LD_BIND_NOT=on
pshipton commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_0/476 cmdLineTester_criu_nonPortableRestore_0

Testing: Create and Restore Criu Checkpoint Image once - MethodTypeDeadlockTest
Test start time: 2023/03/18 01:45:27 Eastern Standard Time
Running command: bash /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_0/openjdkbinary/j2sdk-image/bin/java " -Xjit -XX:+CRIURestoreNonPortableMode  -Xtrace:print=j9criu.11 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED" org.openj9.criu.DeadlockTest MethodTypeDeadlockTest 1
Time spent starting: 3 milliseconds
***[TEST INFO 2023/03/18 01:50:27] ProcessKiller detected a timeout after 300000 milliseconds!***
INFO: Cannot find '/usr/bin/gdb' using 'gdb' from the path.
***[TEST INFO 2023/03/18 01:50:27] executing gdb -batch -x /tmp/debugger6974466405920502996.txt bash 23444***
java.io.IOException: Cannot run program "gdb": error=2, No such file or directory
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1128)
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1071)
    at java.base/java.lang.Runtime.exec(Runtime.java:592)
    at java.base/java.lang.Runtime.exec(Runtime.java:451)
    at Test$ProcessKiller.captureCoreForProcess(Test.java:697)
    at Test$ProcessKiller.captureCoreForProcess(Test.java:646)
    at Test$ProcessKiller.run(Test.java:596)
Caused by: java.io.IOException: error=2, No such file or directory
    at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:340)
    at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:271)
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1107)
    ... 6 more
***[TEST INFO 2023/03/18 01:50:27] executing kill -ABRT 23444***
***[TEST INFO 2023/03/18 01:50:27] kill -ABRT signal sent***
***[TEST INFO 2023/03/18 01:50:27] ABRT completed***
Output from test:
 [OUT] start running script
 [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
 [OUT] export LD_BIND_NOT=on
 [ERR] /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh: line 40: 23448 Killed                  $2 -XX:+EnableCRIUSupport $3 -cp "$1/criu.jar" $4 $5 $6 > testOutput 2>&1
***[TEST INFO 2023/03/18 02:06:57] stdout timed out***
***[TEST INFO 2023/03/18 02:06:57] stderr timed out***
Time spent executing: 1290011 milliseconds
Test result: FAILED
Output from test:
pshipton commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.functional_s390x_linux_Nightly_testList_0/496 cmdLineTester_criu_nonPortableRestore_3

Testing: Create and Restore Criu Checkpoint Image once - MethodTypeDeadlockTest
Test start time: 2023/03/21 07:36:58 Coordinated Universal Time
Running command: bash /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_Nightly_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_Nightly_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_Nightly_testList_0/openjdkbinary/j2sdk-image/bin/java " -Xgcpolicy:optthruput  -Xtrace:print=j9criu.11 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED" org.openj9.criu.DeadlockTest MethodTypeDeadlockTest 1
Time spent starting: 0 milliseconds
***[TEST INFO 2023/03/21 07:41:58] ProcessKiller detected a timeout after 300000 milliseconds!***
***[TEST INFO 2023/03/21 07:41:58] executing /usr/bin/gdb -batch -x /tmp/debugger16960844060354836099.txt bash 3904265***
GDB OUT No shared libraries loaded at this time.
INFO: Running '/usr/bin/gdb' failed with rc = 1
GDB ERR Could not attach to process.  If your uid matches the uid of the target
GDB ERR process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
GDB ERR again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
GDB ERR ptrace: Operation not permitted.
GDB ERR /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_Nightly_testList_0/aqa-tests/TKG/output_16793824175473/cmdLineTester_criu_nonPortableRestore_3/3904265: No such file or directory.
GDB ERR /tmp/debugger16960844060354836099.txt:2: Error in sourced command file:
GDB ERR The program has no registers now.

INFO: Sleep for 60000 millis before next capture.
***[TEST INFO 2023/03/21 07:42:58] executing /usr/bin/gdb -batch -x /tmp/debugger16960844060354836099.txt bash 3904265***
GDB OUT No shared libraries loaded at this time.
INFO: Running '/usr/bin/gdb' failed with rc = 1
GDB ERR Could not attach to process.  If your uid matches the uid of the target
GDB ERR process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
GDB ERR again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
GDB ERR ptrace: Operation not permitted.
GDB ERR /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_Nightly_testList_0/aqa-tests/TKG/output_16793824175473/cmdLineTester_criu_nonPortableRestore_3/3904265: No such file or directory.
GDB ERR /tmp/debugger16960844060354836099.txt:2: Error in sourced command file:
GDB ERR The program has no registers now.

***[TEST INFO 2023/03/21 07:42:58] executing kill -ABRT 3904265***
***[TEST INFO 2023/03/21 07:42:58] kill -ABRT signal sent***
***[TEST INFO 2023/03/21 07:42:58] ABRT completed***
Output from test:
 [OUT] start running script
 [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
 [OUT] export LD_BIND_NOT=on
 [ERR] /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_Nightly_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh: line 40: 3904269 Killed                  $2 -XX:+EnableCRIUSupport $3 -cp "$1/criu.jar" $4 $5 $6 > testOutput 2>&1
***[TEST INFO 2023/03/21 07:58:28] stdout timed out***
***[TEST INFO 2023/03/21 07:58:28] stderr timed out***
Time spent executing: 1290007 milliseconds
Test result: FAILED
Output from test:
pshipton commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.functional_s390x_linux_aot_Personal_testList_1/150 cmdLineTester_criu_nonPortableRestore_6 -Xgcpolicy:gencon -Xgcthreads1 AOT ITERATION 1

Testing: Create and Restore Criu Checkpoint Image once - MethodTypeDeadlockTest
Test start time: 2023/03/21 07:55:12 Eastern Standard Time
Running command: bash /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_aot_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_aot_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_aot_Personal_testList_1/openjdkbinary/j2sdk-image/bin/java " -Xgcpolicy:gencon -Xgcthreads1  -Xtrace:print=j9criu.11 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED" org.openj9.criu.DeadlockTest MethodTypeDeadlockTest 1
Time spent starting: 1 milliseconds
***[TEST INFO 2023/03/21 08:00:12] ProcessKiller detected a timeout after 300000 milliseconds!***
INFO: Cannot find '/usr/bin/gdb' using 'gdb' from the path.
***[TEST INFO 2023/03/21 08:00:12] executing gdb -batch -x /tmp/debugger6041135482325507055.txt bash 11814***
java.io.IOException: Cannot run program "gdb": error=2, No such file or directory
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1128)
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1071)
    at java.base/java.lang.Runtime.exec(Runtime.java:592)
    at java.base/java.lang.Runtime.exec(Runtime.java:451)
    at Test$ProcessKiller.captureCoreForProcess(Test.java:697)
    at Test$ProcessKiller.captureCoreForProcess(Test.java:646)
    at Test$ProcessKiller.run(Test.java:596)
Caused by: java.io.IOException: error=2, No such file or directory
    at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:340)
    at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:271)
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1107)
    ... 6 more
***[TEST INFO 2023/03/21 08:00:12] executing kill -ABRT 11814***
***[TEST INFO 2023/03/21 08:00:12] kill -ABRT signal sent***
***[TEST INFO 2023/03/21 08:00:12] ABRT completed***
Output from test:
 [OUT] start running script
 [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
 [OUT] export LD_BIND_NOT=on
 [ERR] /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_aot_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh: line 40: 11818 Killed                  $2 -XX:+EnableCRIUSupport $3 -cp "$1/criu.jar" $4 $5 $6 > testOutput 2>&1
***[TEST INFO 2023/03/21 08:16:42] stdout timed out***
***[TEST INFO 2023/03/21 08:16:42] stderr timed out***
Time spent executing: 1290003 milliseconds
Test result: FAILED
Output from test:
tajila commented 1 year ago

Im going to disable the test https://github.com/eclipse-openj9/openj9/pull/16982

pshipton commented 1 year ago

Three more failures in https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.functional_x86-64_linux_aot_Personal/147 cmdLineTester_criu_nonPortableRestore_0 -Xjit -XX:+CRIURestoreNonPortableMode AOT ITERATION 1 cmdLineTester_criu_nonPortableRestore_2 -Xjit:count=0 -XX:+CRIURestoreNonPortableMode AOT ITERATION 2 cmdLineTester_criu_nonPortableRestore_4 -Xgcpolicy:optavgpause AOT ITERATION 2

JasonFengJ9 commented 1 year ago

JDK11 s390x_linux 0.38.0 m2 build(sles15s390x-svl-rt4-1)

openjdk version "11.0.19" 2023-04-18
IBM Semeru Runtime Open Edition 11.0.19.0-m2 (build 11.0.19+6)
Eclipse OpenJ9 VM 11.0.19.0-m2 (build v0.38.0-release-353bdb06f, JRE 11 Linux s390x-64-Bit Compressed References 20230329_610 (JIT enabled, AOT enabled)
OpenJ9   - 353bdb06f
OMR      - 855813495
JCL      - b04e256c1c based on jdk-11.0.19+6)

[2023-03-29T22:49:33.121Z] variation: -Xjit -XX:+CRIURestoreNonPortableMode
[2023-03-29T22:49:33.121Z] JVM_OPTIONS:  -Xjit -XX:+CRIURestoreNonPortableMode 

[2023-03-29T22:51:16.351Z] Testing: Create and Restore Criu Checkpoint Image once - MethodTypeDeadlockTest
[2023-03-29T22:51:16.351Z] Test start time: 2023/03/29 15:51:16 Pacific Standard Time
[2023-03-29T22:51:16.351Z] Running command: bash /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_testList_1/openjdkbinary/j2sdk-image/bin/java " -Xjit -XX:+CRIURestoreNonPortableMode  -Xtrace:print=j9criu.11 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED" org.openj9.criu.DeadlockTest MethodTypeDeadlockTest 1
[2023-03-29T22:51:16.351Z] Time spent starting: 1 milliseconds
[2023-03-29T22:56:31.501Z] ***[TEST INFO 2023/03/29 15:56:16] ProcessKiller detected a timeout after 300000 milliseconds!***
[2023-03-29T22:56:31.501Z] INFO: Cannot find '/usr/bin/gdb' using 'gdb' from the path.
[2023-03-29T22:56:31.501Z] ***[TEST INFO 2023/03/29 15:56:16] executing gdb -batch -x /tmp/debugger6164994720913785653.txt bash 16199***
[2023-03-29T22:56:31.501Z] java.io.IOException: Cannot run program "gdb": error=2, No such file or directory
[2023-03-29T22:56:31.501Z]  at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1128)
[2023-03-29T22:56:31.501Z]  at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1071)
[2023-03-29T22:56:31.501Z]  at java.base/java.lang.Runtime.exec(Runtime.java:592)
[2023-03-29T22:56:31.501Z]  at java.base/java.lang.Runtime.exec(Runtime.java:451)
[2023-03-29T22:56:31.501Z]  at Test$ProcessKiller.captureCoreForProcess(Test.java:697)
[2023-03-29T22:56:31.501Z]  at Test$ProcessKiller.captureCoreForProcess(Test.java:646)
[2023-03-29T22:56:31.501Z]  at Test$ProcessKiller.run(Test.java:596)
[2023-03-29T22:56:31.501Z] Caused by: java.io.IOException: error=2, No such file or directory
[2023-03-29T22:56:31.501Z]  at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:340)
[2023-03-29T22:56:31.501Z]  at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:271)
[2023-03-29T22:56:31.501Z]  at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1107)
[2023-03-29T22:56:31.501Z]  ... 6 more
[2023-03-29T22:56:31.501Z] ***[TEST INFO 2023/03/29 15:56:16] executing kill -ABRT 16199***
[2023-03-29T22:56:31.501Z] ***[TEST INFO 2023/03/29 15:56:16] kill -ABRT signal sent***
[2023-03-29T22:56:31.501Z] ***[TEST INFO 2023/03/29 15:56:16] ABRT completed***
[2023-03-29T22:56:31.501Z] Output from test:
[2023-03-29T22:56:31.501Z]  [OUT] start running script
[2023-03-29T22:56:31.501Z]  [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
[2023-03-29T22:56:31.501Z]  [OUT] export LD_BIND_NOT=on
[2023-03-29T22:56:31.501Z]  [ERR] /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh: line 40: 16203 Killed                  $2 -XX:+EnableCRIUSupport $3 -cp "$1/criu.jar" $4 $5 $6 > testOutput 2>&1
[2023-03-29T23:12:49.301Z] ***[TEST INFO 2023/03/29 16:12:46] stdout timed out***
[2023-03-29T23:12:49.301Z] ***[TEST INFO 2023/03/29 16:12:46] stderr timed out***
[2023-03-29T23:12:49.301Z] Time spent executing: 1290003 milliseconds
[2023-03-29T23:12:49.301Z] Test result: FAILED

50x internal grinder - 4 failures

tajila commented 1 year ago

@JasonFengJ9 can you reproduce on a machine that has gdb?

JasonFengJ9 commented 1 year ago

@tajila it is reproduced at the 50x internal grinder

Testing: Create and Restore Criu Checkpoint Image once - MethodTypeDeadlockTest
Test start time: 2023/03/30 15:10:18 Pacific Standard Time
Running command: bash /home/jenkins/workspace/Grinder_iteration_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Grinder_iteration_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Grinder_iteration_0/openjdkbinary/j2sdk-image/bin/java " -Xjit -XX:+CRIURestoreNonPortableMode  -Xtrace:print=j9criu.11 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED" org.openj9.criu.DeadlockTest MethodTypeDeadlockTest 1
Time spent starting: 1 milliseconds
***[TEST INFO 2023/03/30 15:15:18] ProcessKiller detected a timeout after 300000 milliseconds!***
***[TEST INFO 2023/03/30 15:15:18] executing /usr/bin/gdb -batch -x /tmp/debugger15015569812178957963.txt bash 22825***
GDB OUT 0x000003ff986cddc0 in waitpid () from /lib64/libc.so.6
GDB OUT From                To                  Syms Read   Shared Object Library
GDB OUT 0x000003ff9888d420  0x000003ff9889a170  Yes (*)     /lib64/libtinfo.so.5
GDB OUT 0x000003ff98800ef0  0x000003ff98801bb8  Yes (*)     /lib64/libdl.so.2
GDB OUT 0x000003ff98622270  0x000003ff9875e038  Yes (*)     /lib64/libc.so.6
GDB OUT 0x000003ff98980cf0  0x000003ff9899cf60  Yes (*)     /lib/ld64.so.1
GDB OUT (*): Shared library is missing debugging information.
GDB OUT r0             0x8  8
GDB OUT r1             0x3ff00000000    4393751543808
GDB OUT r2             0xfffffffffffffe00   18446744073709551104
GDB OUT r3             0x3fff2efb198    4397827338648
GDB OUT r4             0x0  0
GDB OUT r5             0x0  0
GDB OUT r6             0x1  1
GDB OUT r7             0x0  0
GDB OUT r8             0x595a   22874
GDB OUT r9             0x800f3d70   2148482416
GDB OUT r10            0xffffffffffffffff   18446744073709551615
GDB OUT r11            0x3fff2efb198    4397827338648
GDB OUT r12            0x3ff987b3000    4396309753856
GDB OUT r13            0x0  0
GDB OUT r14            0x800472e2   2147775202
GDB OUT r15            0x3fff2efb028    4397827338280
GDB OUT pc             0x3ff986cddc0    0x3ff986cddc0 <waitpid+48>
GDB OUT cc             0x0  0
GDB OUT   Id   Target Id         Frame 
GDB OUT * 1    process 22825 "bash" 0x000003ff986cddc0 in waitpid () from /lib64/libc.so.6
GDB OUT 
GDB OUT Thread 1 (process 22825):
GDB OUT #0  0x000003ff986cddc0 in waitpid () from /lib64/libc.so.6
GDB OUT No symbol table info available.
GDB OUT #1  0x00000000800472e2 in waitchld.isra.10 ()
GDB OUT No symbol table info available.
GDB OUT #2  0x00000000800487ae in wait_for ()
GDB OUT No symbol table info available.
GDB OUT #3  0x00000000800388c0 in execute_command_internal ()
GDB OUT No symbol table info available.
GDB OUT #4  0x0000000080036fe2 in execute_command_internal ()
GDB OUT No symbol table info available.
GDB OUT #5  0x0000000080038922 in execute_command ()
GDB OUT No symbol table info available.
GDB OUT #6  0x0000000080037376 in execute_command_internal ()
GDB OUT No symbol table info available.
GDB OUT #7  0x0000000080036fe2 in execute_command_internal ()
GDB OUT No symbol table info available.
GDB OUT #8  0x0000000080038922 in execute_command ()
GDB OUT No symbol table info available.
GDB OUT #9  0x0000000080037488 in execute_command_internal ()
GDB OUT No symbol table info available.
GDB OUT #10 0x0000000080038922 in execute_command ()
GDB OUT No symbol table info available.
GDB OUT #11 0x000000008002067c in reader_loop ()
GDB OUT No symbol table info available.
GDB OUT #12 0x000000008001e9e4 in main ()
GDB OUT No symbol table info available.
GDB OUT Saved corefile core.22825.1
GDB OUT [Inferior 1 (process 22825) detached]
INFO: Sleep for 60000 millis before next capture.
***[TEST INFO 2023/03/30 15:16:19] executing /usr/bin/gdb -batch -x /tmp/debugger15015569812178957963.txt bash 22825***
GDB OUT 0x000003ff986cddc0 in waitpid () from /lib64/libc.so.6
GDB OUT From                To                  Syms Read   Shared Object Library
GDB OUT 0x000003ff9888d420  0x000003ff9889a170  Yes (*)     /lib64/libtinfo.so.5
GDB OUT 0x000003ff98800ef0  0x000003ff98801bb8  Yes (*)     /lib64/libdl.so.2
GDB OUT 0x000003ff98622270  0x000003ff9875e038  Yes (*)     /lib64/libc.so.6
GDB OUT 0x000003ff98980cf0  0x000003ff9899cf60  Yes (*)     /lib/ld64.so.1
GDB OUT (*): Shared library is missing debugging information.
GDB OUT r0             0x8  8
GDB OUT r1             0x3ff00000000    4393751543808
GDB OUT r2             0xfffffffffffffe00   18446744073709551104
GDB OUT r3             0x3fff2efb198    4397827338648
GDB OUT r4             0x0  0
GDB OUT r5             0x0  0
GDB OUT r6             0x1  1
GDB OUT r7             0x0  0
GDB OUT r8             0x595a   22874
GDB OUT r9             0x800f3d70   2148482416
GDB OUT r10            0xffffffffffffffff   18446744073709551615
GDB OUT r11            0x3fff2efb198    4397827338648
GDB OUT r12            0x3ff987b3000    4396309753856
GDB OUT r13            0x0  0
GDB OUT r14            0x800472e2   2147775202
GDB OUT r15            0x3fff2efb028    4397827338280
GDB OUT pc             0x3ff986cddc0    0x3ff986cddc0 <waitpid+48>
GDB OUT cc             0x0  0
GDB OUT   Id   Target Id         Frame 
GDB OUT * 1    process 22825 "bash" 0x000003ff986cddc0 in waitpid () from /lib64/libc.so.6
GDB OUT 
GDB OUT Thread 1 (process 22825):
GDB OUT #0  0x000003ff986cddc0 in waitpid () from /lib64/libc.so.6
GDB OUT No symbol table info available.
GDB OUT #1  0x00000000800472e2 in waitchld.isra.10 ()
GDB OUT No symbol table info available.
GDB OUT #2  0x00000000800487ae in wait_for ()
GDB OUT No symbol table info available.
GDB OUT #3  0x00000000800388c0 in execute_command_internal ()
GDB OUT No symbol table info available.
GDB OUT #4  0x0000000080036fe2 in execute_command_internal ()
GDB OUT No symbol table info available.
GDB OUT #5  0x0000000080038922 in execute_command ()
GDB OUT No symbol table info available.
GDB OUT #6  0x0000000080037376 in execute_command_internal ()
GDB OUT No symbol table info available.
GDB OUT #7  0x0000000080036fe2 in execute_command_internal ()
GDB OUT No symbol table info available.
GDB OUT #8  0x0000000080038922 in execute_command ()
GDB OUT No symbol table info available.
GDB OUT #9  0x0000000080037488 in execute_command_internal ()
GDB OUT No symbol table info available.
GDB OUT #10 0x0000000080038922 in execute_command ()
GDB OUT No symbol table info available.
GDB OUT #11 0x000000008002067c in reader_loop ()
GDB OUT No symbol table info available.
GDB OUT #12 0x000000008001e9e4 in main ()
GDB OUT No symbol table info available.
GDB OUT Saved corefile core.22825.2
GDB OUT A debugging session is active.
GDB OUT 
GDB OUT     Inferior 1 [process 22825] will be detached.
GDB OUT 
GDB OUT Quit anyway? (y or n) [answered Y; input not from terminal]
GDB OUT [Inferior 1 (process 22825) detached]
***[TEST INFO 2023/03/30 15:16:19] executing kill -ABRT 22825***
***[TEST INFO 2023/03/30 15:16:19] kill -ABRT signal sent***
***[TEST INFO 2023/03/30 15:16:19] ABRT completed***
Output from test:
 [OUT] start running script
 [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
 [OUT] export LD_BIND_NOT=on
 [ERR] /home/jenkins/workspace/Grinder_iteration_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh: line 40: 22829 Killed                  $2 -XX:+EnableCRIUSupport $3 -cp "$1/criu.jar" $4 $5 $6 > testOutput 2>&1
***[TEST INFO 2023/03/30 15:31:48] stdout timed out***
***[TEST INFO 2023/03/30 15:31:48] stderr timed out***
Time spent executing: 1290003 milliseconds
Test result: FAILED
Output from test:

Also raised an internal infra issue - sles15s390x-svl-rt4-1 requires GDB

tajila commented 1 year ago

@tajila it is reproduced at the 50x internal grinder

Is there a native stack trace?

JasonFengJ9 commented 1 year ago

Is there a native stack trace?

Got one at an internal grinder

GDB OUT   Id   Target Id                                            Frame 
GDB OUT * 1    Thread 0x3ffa7c72720 (LWP 1293568) "java"            0x000003ffa7a894a4 in __pthread_timedjoin_ex () from /lib64/libpthread.so.0
GDB OUT   2    Thread 0x3ffa7c71910 (LWP 1293569) "main"            0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   3    Thread 0x3ffa7b7f910 (LWP 1293570) "Signal Reporter" 0x000003ffa7a90dda in do_futex_wait.constprop () from /lib64/libpthread.so.0
GDB OUT   4    Thread 0x3ffa41fe910 (LWP 1293571) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   5    Thread 0x3ff854ff910 (LWP 1293572) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   6    Thread 0x3ff853fe910 (LWP 1293573) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   7    Thread 0x3ff852fd910 (LWP 1293574) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   8    Thread 0x3ff851fc910 (LWP 1293575) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   9    Thread 0x3ff850fb910 (LWP 1293576) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   10   Thread 0x3ff84ffa910 (LWP 1293577) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   11   Thread 0x3ff84ef9910 (LWP 1293578) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   12   Thread 0x3ff84df8910 (LWP 1293579) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   13   Thread 0x3ff84cf7910 (LWP 1293580) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   14   Thread 0x3ff84bf6910 (LWP 1293581) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   15   Thread 0x3ff84af5910 (LWP 1293582) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   16   Thread 0x3ff849f4910 (LWP 1293583) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   17   Thread 0x3ff848f3910 (LWP 1293584) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   18   Thread 0x3ff847f2910 (LWP 1293585) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   19   Thread 0x3ff846f1910 (LWP 1293586) "JIT Compilation" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   20   Thread 0x3ffa797f910 (LWP 1293587) "JIT Sampler"     0x000003ffa7a8e51e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   21   Thread 0x3ffa7ad3910 (LWP 1293588) "JIT IProfiler"   0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   22   Thread 0x3ffa76ff910 (LWP 1293593) "Common-Cleaner"  0x000003ffa7a8e51e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   23   Thread 0x3ffa737f910 (LWP 1293594) "Finalizer main"  0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   24   Thread 0x3ffa707f910 (LWP 1293597) "Concurrent Mark" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   25   Thread 0x3ffa6f7f910 (LWP 1293598) "GC Worker"       0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   26   Thread 0x3ffa6eff910 (LWP 1293599) "GC Worker"       0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   27   Thread 0x3ffa6e7f910 (LWP 1293600) "GC Worker"       0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   28   Thread 0x3ffa6d7f910 (LWP 1293602) "Finalizer threa" 0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   29   Thread 0x3ffa6cff910 (LWP 1293603) "Attach API wait" 0x000003ffa7883b9a in semop () from /lib64/libc.so.6
GDB OUT   30   Thread 0x3ffa6dff910 (LWP 1293604) "Thread-0"        0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   31   Thread 0x3ffa6c7f910 (LWP 1293605) "Thread-1"        0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   32   Thread 0x3ffa6bff910 (LWP 1293606) "Thread-2"        0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   33   Thread 0x3ffa6b7f910 (LWP 1293607) "Thread-3"        0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   34   Thread 0x3ffa6aff910 (LWP 1293608) "Thread-4"        0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   35   Thread 0x3ffa6a7f910 (LWP 1293609) "Thread-5"        0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   36   Thread 0x3ffa69ff910 (LWP 1293610) "Thread-6"        0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   37   Thread 0x3ffa697f910 (LWP 1293611) "Thread-7"        0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   38   Thread 0x3ffa667f910 (LWP 1293612) "Thread-8"        0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT   39   Thread 0x3ffa65ff910 (LWP 1293613) "Thread-9"        0x000003ffa7a8e0b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
JasonFengJ9 commented 1 year ago

20x w/ 50 TKG_ITERATIONS internal grinder passed, usually 10% failure rate. Assuming resolved.

pshipton commented 1 year ago

What change fixed it?

It failed in the 0.38 weekend builds. https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.functional_x86-64_linux_Release_testList_0/59 https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.functional_x86-64_linux_Release_testList_1/59/

JasonFengJ9 commented 1 year ago

I think it was

The verification grinder was on JDK11 zLinux, will run JDK11 x86-64.

pshipton commented 1 year ago

That fix was included in the weekend builds.

JasonFengJ9 commented 1 year ago

JDK11 x86 Linux(rhel8x86-rt2-4)

16:56:25  java version "11.0.19" 2023-04-18
16:56:25  IBM Semeru Runtime Certified Edition 11.0.19+6 (build 11.0.19+6)
16:56:25  Eclipse OpenJ9 VM 11.0.19+6 (build master-63e6b4f14, JRE 11 Linux amd64-64-Bit Compressed References 20230415_588 (JIT enabled, AOT enabled)
16:56:25  OpenJ9   - 63e6b4f14
16:56:25  OMR      - 77d1fa7b5
16:56:25  JCL      - 322b982f8b based on jdk-11.0.19+6)

GDB OUT Thread 30 (Thread 0x7fe359656700 (LWP 2695786)):
GDB OUT #0  0x00007fe38c5bd45c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
GDB OUT No symbol table info available.
GDB OUT #1  0x00007fe38a3649c7 in omrthread_monitor_wait () from /home/jenkins/workspace/Grinder_iteration_5/openjdkbinary/j2sdk-image/lib/default/libj9thr29.so
GDB OUT No symbol table info available.
GDB OUT #2  0x00007fe38a88e41c in internalAcquireVMAccessNoMutexWithMask () from /home/jenkins/workspace/Grinder_iteration_5/openjdkbinary/j2sdk-image/lib/default/libj9vm29.so
GDB OUT No symbol table info available.
GDB OUT #3  0x00007fe38a88e0c3 in internalAcquireVMAccess () from /home/jenkins/workspace/Grinder_iteration_5/openjdkbinary/j2sdk-image/lib/default/libj9vm29.so
GDB OUT No symbol table info available.
GDB OUT #4  0x00007fe38a874f8b in objectMonitorEnterBlocking () from /home/jenkins/workspace/Grinder_iteration_5/openjdkbinary/j2sdk-image/lib/default/libj9vm29.so
GDB OUT No symbol table info available.
GDB OUT #5  0x00007fe38a8bb62b in VM_BytecodeInterpreterCompressed::run(J9VMThread*) () from /home/jenkins/workspace/Grinder_iteration_5/openjdkbinary/j2sdk-image/lib/default/libj9vm29.so
GDB OUT No symbol table info available.

---SUMMARY OF FAILED TESTS---
Create and Restore Criu Checkpoint Image once - MethodTypeDeadlockTest
-----------------------------

cmdLineTester_criu_nonPortableRestore_0_FAILED(ITER_49)
JasonFengJ9 commented 1 year ago
> !threads
    !stack 0x00017000   !j9vmthread 0x00017000  !j9thread 0x7fe38401c3c0    tid 0x292249 (2695753) // (main)

    !stack 0x0024ac00   !j9vmthread 0x0024ac00  !j9thread 0x7fe38414fae0    tid 0x29226a (2695786) // (Thread-0)
    !stack 0x0024e800   !j9vmthread 0x0024e800  !j9thread 0x7fe384150058    tid 0x29226b (2695787) // (Thread-1)
    !stack 0x00252500   !j9vmthread 0x00252500  !j9thread 0x7fe38427e710    tid 0x29226c (2695788) // (Thread-2)
    !stack 0x00256100   !j9vmthread 0x00256100  !j9thread 0x7fe38427ec88    tid 0x29226d (2695789) // (Thread-3)
    !stack 0x00259e00   !j9vmthread 0x00259e00  !j9thread 0x7fe3842a9640    tid 0x29226e (2695790) // (Thread-4)
    !stack 0x0025da00   !j9vmthread 0x0025da00  !j9thread 0x7fe3842a9bb8    tid 0x29226f (2695791) // (Thread-5)
    !stack 0x00261700   !j9vmthread 0x00261700  !j9thread 0x7fe3842b2dc0    tid 0x292270 (2695792) // (Thread-6)
    !stack 0x00265300   !j9vmthread 0x00265300  !j9thread 0x7fe3842b3338    tid 0x292271 (2695793) // (Thread-7)
    !stack 0x00269000   !j9vmthread 0x00269000  !j9thread 0x7fe3842bc5b0    tid 0x292272 (2695794) // (Thread-8)
    !stack 0x00273d00   !j9vmthread 0x00273d00  !j9thread 0x7fe3842bcb28    tid 0x292273 (2695795) // (Thread-9)

> !threads flags
   !j9vmthread 0x17000 publicFlags=100 privateFlags=1008 inNative=0 // main
    !j9vmthread 0x24ac00 publicFlags=200200 privateFlags=0 inNative=0 // Thread-0
    !j9vmthread 0x24e800 publicFlags=200000 privateFlags=0 inNative=0 // Thread-1
    !j9vmthread 0x252500 publicFlags=200200 privateFlags=0 inNative=0 // Thread-2
    !j9vmthread 0x256100 publicFlags=200200 privateFlags=0 inNative=0 // Thread-3
    !j9vmthread 0x259e00 publicFlags=200200 privateFlags=0 inNative=0 // Thread-4
    !j9vmthread 0x25da00 publicFlags=200200 privateFlags=0 inNative=0 // Thread-5
    !j9vmthread 0x261700 publicFlags=200200 privateFlags=0 inNative=0 // Thread-6
    !j9vmthread 0x265300 publicFlags=200200 privateFlags=0 inNative=0 // Thread-7
    !j9vmthread 0x269000 publicFlags=200200 privateFlags=0 inNative=0 // Thread-8
    !j9vmthread 0x273d00 publicFlags=200200 privateFlags=0 inNative=0 // Thread-9

> !stack 0x00017000
<17000>                             Generic special frame
<17000>     !j9method 0x00000000002DA6B8   org/eclipse/openj9/criu/SecurityProviders.lambda$registerResetCRIUState$0()V
<17000>     !j9method 0x000000000024BF48   org/eclipse/openj9/criu/SecurityProviders$$Lambda$21/0x0000000000000000.run()V
<17000>     !j9method 0x00000000002C5758   org/eclipse/openj9/criu/J9InternalCheckpointHookAPI$J9InternalCheckpointHook.runHook()V
<17000>     !j9method 0x00000000002C52E8   org/eclipse/openj9/criu/J9InternalCheckpointHookAPI.runHooks(Ljava/util/List;Z)V
<17000>     !j9method 0x00000000002C5308   org/eclipse/openj9/criu/J9InternalCheckpointHookAPI.runPreCheckpointHooks()V
<17000>                             JNI call-in frame
<17000>     !j9method 0x00000000002B5420   org/eclipse/openj9/criu/CRIUSupport.checkpointJVMImpl(Ljava/lang/String;ZZZILjava/lang/String;ZLjava/lang/String;ZZZZLjava/lang/String;Ljava/lang/String;)V
<17000>     !j9method 0x00000000002B5780   org/eclipse/openj9/criu/CRIUSupport.checkpointJVM()V
<17000>     !j9method 0x00000000001EDD68   org/openj9/criu/CRIUTestUtils.checkPointJVM(Lorg/eclipse/openj9/criu/CRIUSupport;Ljava/nio/file/Path;Z)V
<17000>     !j9method 0x00000000001ED368   org/openj9/criu/DeadlockTest.methodTypeDeadlockTest()V
<17000>     !j9method 0x00000000001ED2E8   org/openj9/criu/DeadlockTest.main([Ljava/lang/String;)V

> !stack 0x0024e800
<24e800>                            Generic special frame
<24e800>    !j9method 0x00000000000B22B8   java/util/Arrays.copyOfRange([BII)[B
<24e800>    !j9method 0x00000000000D4F48   java/lang/StringUTF16.newString([BII)Ljava/lang/String;
<24e800>    !j9method 0x00000000000D4DC8   java/lang/StringUTF16.toLowerCase(Ljava/lang/String;[BLjava/util/Locale;)Ljava/lang/String;
<24e800>    !j9method 0x0000000000067048   java/lang/String.toLowerCase(Ljava/util/Locale;)Ljava/lang/String;
<24e800>    !j9method 0x00000000002DE138   java/security/Provider.addEngine(Ljava/lang/String;ZLjava/lang/String;)V
<24e800>    !j9method 0x00000000002DE178   java/security/Provider.<clinit>()V
<24e800>                            JNI call-in frame
<24e800>                            Generic special frame
<24e800>    !j9method 0x00000000002DB6B8   sun/security/jca/ProviderList.<clinit>()V
<24e800>                            JNI call-in frame
<24e800>                            Generic special frame
<24e800>    !j9method 0x00000000002DA268   sun/security/jca/Providers.<clinit>()V
<24e800>                            JNI call-in frame
<24e800>                            Generic special frame
<24e800>    !j9method 0x00000000002D1C70   java/security/Security.getProviders()[Ljava/security/Provider;
<24e800>    !j9method 0x00000000002D73E0   openj9/internal/criu/security/CRIUConfigurator.setCRIUSecMode(Ljava/util/Properties;)V
<24e800>    !j9method 0x00000000002D1B30   java/security/Security.initialize()V
<24e800>    !j9method 0x00000000002D1F88   java/security/Security$1.run()Ljava/lang/Void;
<24e800>    !j9method 0x00000000002D1FA8   java/security/Security$1.run()Ljava/lang/Object;
<24e800>    !j9method 0x00000000000CCBD0   java/security/AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;
<24e800>    !j9method 0x00000000002D1F10   java/security/Security.<clinit>()V
<24e800>                            JNI call-in frame
<24e800>                            Generic special frame
<24e800>    !j9method 0x00000000002D0BB8   sun/security/util/SecurityProperties.getOverridableProperty(Ljava/lang/String;)Ljava/lang/String;
<24e800>    !j9method 0x00000000002D0B98   sun/security/util/SecurityProperties.privilegedGetOverridable(Ljava/lang/String;)Ljava/lang/String;
<24e800>    !j9method 0x00000000002D0348   sun/security/util/FilePermCompat.<clinit>()V
<24e800>                            JNI call-in frame
<24e800>                            Generic special frame
<24e800>    !j9method 0x00000000002B4E78   java/io/FilePermission.init(I)V
<24e800>    !j9method 0x00000000002B4E98   java/io/FilePermission.<init>(Ljava/lang/String;Ljava/lang/String;)V
<24e800>    !j9method 0x00000000002B1EA0   sun/net/www/protocol/file/FileURLConnection.getPermission()Ljava/security/Permission;
<24e800>    !j9method 0x00000000002B05E8   sun/net/www/protocol/jar/JarFileFactory.getPermission(Ljava/util/jar/JarFile;)Ljava/security/Permission;
<24e800>    !j9method 0x00000000002B05A8   sun/net/www/protocol/jar/JarFileFactory.getCachedJarFile(Ljava/net/URL;)Ljava/util/jar/JarFile;
<24e800>    !j9method 0x00000000002B0568   sun/net/www/protocol/jar/JarFileFactory.get(Ljava/net/URL;Z)Ljava/util/jar/JarFile;
<24e800>    !j9method 0x00000000002AF948   sun/net/www/protocol/jar/JarURLConnection.connect()V
<24e800>    !j9method 0x00000000002AF968   sun/net/www/protocol/jar/JarURLConnection.getInputStream()Ljava/io/InputStream;
<24e800>    !j9method 0x00000000000FAD98   java/net/URL.openStream()Ljava/io/InputStream;
<24e800>    !j9method 0x000000000007F908   java/lang/ClassLoader.getResourceAsStream(Ljava/lang/String;)Ljava/io/InputStream;
<24e800>    !j9method 0x00000000001ED388   org/openj9/criu/DeadlockTest.getClassBytesFromResource(Ljava/lang/Class;)[B
<24e800>    !j9method 0x00000000001ED3C8   org/openj9/criu/DeadlockTest.lambda$methodTypeDeadlockTest$5(Lorg/openj9/criu/DeadlockTest$TestResult;)V
<24e800>    !j9method 0x00000000001AA798   org/openj9/criu/DeadlockTest$$Lambda$10/0x0000000000000000.run()V
<24e800>    !j9method 0x000000000007CD00   java/lang/Thread.run()V
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fe38cc07700 (LWP 2695753))]
#0  0x00007fe38c5bd45c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007fe38c5bd45c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fe38a368af7 in monitor_wait_original (interruptible=0, nanos=-2077365144, millis=140615151906024, monitor=0x7fe3842dec68, self=0x7fe38401c3c0)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/omr/thread/common/omrthread.c:4686
#2  monitor_wait (interruptible=0, nanos=-2077365144, millis=140615151906024, monitor=0x7fe3842dec68) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/omr/thread/common/omrthread.c:4531
#3  omrthread_monitor_wait_timed (monitor=monitor@entry=0x7fe3842dec68, millis=millis@entry=0, nanos=nanos@entry=0) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/omr/thread/common/omrthread.c:4486
#4  0x00007fe38a889ca0 in timeCompensationHelper (vmThread=vmThread@entry=0x17000, threadHelperType=2 '\002', monitor=monitor@entry=0x7fe3842dec68, millis=millis@entry=0, nanos=nanos@entry=0)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/threadhelp.cpp:439
#5  0x00007fe38a889fd4 in monitorWaitImpl (vmThread=vmThread@entry=0x17000, object=0x0, object@entry=0xfffabd08, millis=millis@entry=0, nanos=nanos@entry=0, interruptable=interruptable@entry=0)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/threadhelp.cpp:102
#6  0x00007fe38a82be2c in classInitStateMachine (currentThread=currentThread@entry=0x17000, clazz=<optimized out>, desiredState=desiredState@entry=J9_CLASS_INIT_INITIALIZED)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:824
#7  0x00007fe38a82ca86 in classInitStateMachine (currentThread=currentThread@entry=0x17000, clazz=<optimized out>, desiredState=desiredState@entry=J9_CLASS_INIT_INITIALIZED)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:702
#8  0x00007fe38a82cd5a in initializeClass (currentThread=currentThread@entry=0x17000, clazz=<optimized out>)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:341
#9  0x00007fe38a87cba1 in resolveStaticMethodRefInto (vmStruct=vmStruct@entry=0x17000, ramCP=ramCP@entry=0x2da4c0, cpIndex=cpIndex@entry=8, resolveFlags=resolveFlags@entry=32, ramCPEntry=ramCPEntry@entry=0x17100)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/resolvesupport.cpp:608
#10 0x00007fe38a87cd32 in resolveStaticMethodRef (vmStruct=0x17000, ramCP=0x2da4c0, cpIndex=8, resolveFlags=32) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/resolvesupport.cpp:642
#11 0x00007fe38a8ae5ad in VM_BytecodeInterpreterCompressed::initialStaticMethod (_pc=<optimized out>, _sp=<optimized out>, this=<optimized out>)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:1976
#12 VM_BytecodeInterpreterCompressed::run (this=0x7fe38401c858, this@entry=0x7fe38cc05260, vmThread=0xfffffffffffffe00)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:10459
#13 0x00007fe38a89daa9 in bytecodeLoopCompressed (currentThread=<optimized out>) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.inc:112
#14 0x00007fe38a96be22 in c_cInterpreter () at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/build/linux-x86_64-normal-server-release/vm/runtime/vm/xcinterp.s:158
#15 0x00007fe38a824fdf in runStaticMethod (currentThread=0x7fe38a8700ba <javaLookupMethodImpl+954>, currentThread@entry=0x17000, 
    className=className@entry=0x7fe38a9fa402 <j9InternalCheckpointHookAPI_name+2> "org/eclipse/openj9/criu/J9InternalCheckpointHookAPI", selector=selector@entry=0x7fe38cc055c0, argCount=argCount@entry=0, arguments=arguments@entry=0x0)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/callin.cpp:721
#16 0x00007fe38a89c478 in jvmCheckpointHooks (currentThread=0x17000) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/CRIUHelpers.cpp:69
#17 0x00007fe358e3b3c8 in Java_org_eclipse_openj9_criu_CRIUSupport_checkpointJVMImpl (env=0x17000, unused=<optimized out>, imagesDir=<optimized out>, leaveRunning=<optimized out>, shellJob=<optimized out>, extUnixSupport=<optimized out>, logLevel=0, 
    logFile=0x0, fileLocks=1 '\001', workDir=0x0, tcpEstablished=0 '\000', autoDedup=0 '\000', trackMemory=0 '\000', unprivileged=0 '\000', optionsFile=0x0, environmentFile=0x0)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/criusupport/criusupport.cpp:828

(gdb) thread 31
[Switching to thread 31 (Thread 0x7fe359290700 (LWP 2695787))]
#0  0x00007fe38c5bd45c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007fe38c5bd45c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fe38a3649c7 in monitor_wait_original (interruptible=0, nanos=0, millis=0, monitor=0x7fe3000027a8, self=0x7fe384150058)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/omr/thread/common/omrthread.c:4686
#2  monitor_wait (interruptible=0, nanos=0, millis=0, monitor=0x7fe3000027a8) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/omr/thread/common/omrthread.c:4531
#3  omrthread_monitor_wait (monitor=0x7fe3000027a8) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/omr/thread/common/omrthread.c:4401
#4  0x00007fe38a88e41c in internalAcquireVMAccessNoMutexWithMask (vmThread=vmThread@entry=0x24e800, haltMask=haltMask@entry=2203653)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/VMAccess.cpp:386
#5  0x00007fe38a88e72a in internalAcquireVMAccessNoMutex (vmThread=vmThread@entry=0x24e800) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/VMAccess.cpp:407
#6  0x00007fe38a81fc90 in javaCheckAsyncMessages (currentThread=0x24e800, throwExceptions=1) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/AsyncMessageHandler.cpp:79
#7  0x00007fe38a8b22e0 in VM_BytecodeInterpreterCompressed::stackOverflow (_pc=<optimized out>, _sp=<optimized out>, this=<optimized out>)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:1781
#8  VM_BytecodeInterpreterCompressed::run (this=0x7fe3841504f0, this@entry=0x7fe35928acb0, vmThread=0xfffffffffffffe00)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:10868
#9  0x00007fe38a89daa9 in bytecodeLoopCompressed (currentThread=<optimized out>) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.inc:112
#10 0x00007fe38a96be22 in c_cInterpreter () at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/build/linux-x86_64-normal-server-release/vm/runtime/vm/xcinterp.s:158
#11 0x00007fe38a822adc in sendClinit (currentThread=0x7fe38a8700ba <javaLookupMethodImpl+954>, currentThread@entry=0x24e800, clazz=clazz@entry=0x2de400)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/callin.cpp:446
#12 0x00007fe38a82b20b in initializeImpl (currentThread=currentThread@entry=0x24e800, clazz=clazz@entry=0x2de400)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:120
#13 0x00007fe38a82b621 in classInitStateMachine (currentThread=currentThread@entry=0x24e800, clazz=<optimized out>, desiredState=desiredState@entry=J9_CLASS_INIT_INITIALIZED)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:765
#14 0x00007fe38a82ca86 in classInitStateMachine (currentThread=currentThread@entry=0x24e800, clazz=<optimized out>, clazz@entry=0x2df100, desiredState=desiredState@entry=J9_CLASS_INIT_INITIALIZED)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:702
#15 0x00007fe38a82cd5a in initializeClass (currentThread=currentThread@entry=0x24e800, clazz=clazz@entry=0x2df100)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:341
#16 0x00007fe38a87c53a in resolveClassRef (vmStruct=0x24e800, ramCP=0x2daa10, cpIndex=<optimized out>, resolveFlags=65)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/resolvesupport.cpp:445
#17 0x00007fe38a8b2071 in VM_BytecodeInterpreterCompressed::newLogic (newObject=<optimized out>, _pc=<optimized out>, _sp=<optimized out>, this=<optimized out>)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:8123
#18 VM_BytecodeInterpreterCompressed::newdup (_pc=<optimized out>, _sp=<optimized out>, this=<optimized out>)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:8156
#19 VM_BytecodeInterpreterCompressed::run (this=0x7fe3841504f0, this@entry=0x7fe35928b5c0, vmThread=0xfffffffffffffe00)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:11520
#20 0x00007fe38a89daa9 in bytecodeLoopCompressed (currentThread=<optimized out>) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.inc:112
#21 0x00007fe38a96be22 in c_cInterpreter () at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/build/linux-x86_64-normal-server-release/vm/runtime/vm/xcinterp.s:158
#22 0x00007fe38a822adc in sendClinit (currentThread=0x7fe38a8700ba <javaLookupMethodImpl+954>, currentThread@entry=0x24e800, clazz=clazz@entry=0x2db800)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/callin.cpp:446
#23 0x00007fe38a82b20b in initializeImpl (currentThread=currentThread@entry=0x24e800, clazz=clazz@entry=0x2db800)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:120
#24 0x00007fe38a82b621 in classInitStateMachine (currentThread=currentThread@entry=0x24e800, clazz=<optimized out>, clazz@entry=0x2db800, desiredState=desiredState@entry=J9_CLASS_INIT_INITIALIZED)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:765
#25 0x00007fe38a82cd5a in initializeClass (currentThread=currentThread@entry=0x24e800, clazz=clazz@entry=0x2db800)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:341
#26 0x00007fe38a87d01a in resolveStaticFieldRefInto (vmStruct=vmStruct@entry=0x24e800, method=0x0, ramCP=ramCP@entry=0x2d9ab0, cpIndex=cpIndex@entry=46, resolveFlags=resolveFlags@entry=32, resolvedField=0x0, ramCPEntry=0x24e900)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/resolvesupport.cpp:757
#27 0x00007fe38a87d4d3 in resolveStaticFieldRef (vmStruct=0x24e800, method=<optimized out>, ramCP=0x2d9ab0, cpIndex=46, resolveFlags=32, resolvedField=<optimized out>)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/resolvesupport.cpp:877
#28 0x00007fe38a8bc384 in VM_BytecodeInterpreterCompressed::getstatic (_pc=<optimized out>, _sp=<optimized out>, this=<optimized out>)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:7060
#29 VM_BytecodeInterpreterCompressed::run (this=0x7fe3841504f0, this@entry=0x7fe35928bea0, vmThread=0xfffffffffffffe00)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:11490
#30 0x00007fe38a89daa9 in bytecodeLoopCompressed (currentThread=<optimized out>) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.inc:112
#31 0x00007fe38a96be22 in c_cInterpreter () at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/build/linux-x86_64-normal-server-release/vm/runtime/vm/xcinterp.s:158
#32 0x00007fe38a822adc in sendClinit (currentThread=0x7fe38a8700ba <javaLookupMethodImpl+954>, currentThread@entry=0x24e800, clazz=clazz@entry=0x2d9f00)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/callin.cpp:446
#33 0x00007fe38a82b20b in initializeImpl (currentThread=currentThread@entry=0x24e800, clazz=clazz@entry=0x2d9f00)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:120
#34 0x00007fe38a82b621 in classInitStateMachine (currentThread=currentThread@entry=0x24e800, clazz=<optimized out>, desiredState=desiredState@entry=J9_CLASS_INIT_INITIALIZED)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:765
#35 0x00007fe38a82cd5a in initializeClass (currentThread=currentThread@entry=0x24e800, clazz=<optimized out>)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:341
#36 0x00007fe38a87cba1 in resolveStaticMethodRefInto (vmStruct=vmStruct@entry=0x24e800, ramCP=ramCP@entry=0x2d0c20, cpIndex=cpIndex@entry=110, resolveFlags=resolveFlags@entry=32, ramCPEntry=ramCPEntry@entry=0x24e900)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/resolvesupport.cpp:608
#37 0x00007fe38a87cd32 in resolveStaticMethodRef (vmStruct=0x24e800, ramCP=0x2d0c20, cpIndex=110, resolveFlags=32)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/resolvesupport.cpp:642
#38 0x00007fe38a8ae5ad in VM_BytecodeInterpreterCompressed::initialStaticMethod (_pc=<optimized out>, _sp=<optimized out>, this=<optimized out>)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:1976
#39 VM_BytecodeInterpreterCompressed::run (this=0x7fe3841504f0, this@entry=0x7fe35928c750, vmThread=0xfffffffffffffe00)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:10459
#40 0x00007fe38a89daa9 in bytecodeLoopCompressed (currentThread=<optimized out>) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.inc:112
#41 0x00007fe38a96be22 in c_cInterpreter () at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/build/linux-x86_64-normal-server-release/vm/runtime/vm/xcinterp.s:158
#42 0x00007fe38a822adc in sendClinit (currentThread=0x86c56000, currentThread@entry=0x24e800, clazz=clazz@entry=0x2d2000)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/callin.cpp:446
#43 0x00007fe38a82b20b in initializeImpl (currentThread=currentThread@entry=0x24e800, clazz=clazz@entry=0x2d2000)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:120
#44 0x00007fe38a82b621 in classInitStateMachine (currentThread=currentThread@entry=0x24e800, clazz=<optimized out>, desiredState=desiredState@entry=J9_CLASS_INIT_INITIALIZED)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:765
#45 0x00007fe38a82cd5a in initializeClass (currentThread=currentThread@entry=0x24e800, clazz=<optimized out>)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:341
#46 0x00007fe38a87cba1 in resolveStaticMethodRefInto (vmStruct=vmStruct@entry=0x24e800, ramCP=ramCP@entry=0x2d09c0, cpIndex=cpIndex@entry=9, resolveFlags=resolveFlags@entry=32, ramCPEntry=ramCPEntry@entry=0x24e900)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/resolvesupport.cpp:608
#47 0x00007fe38a87cd32 in resolveStaticMethodRef (vmStruct=0x24e800, ramCP=0x2d09c0, cpIndex=9, resolveFlags=32)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/resolvesupport.cpp:642
#48 0x00007fe38a8ae5ad in VM_BytecodeInterpreterCompressed::initialStaticMethod (_pc=<optimized out>, _sp=<optimized out>, this=<optimized out>)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:1976
#49 VM_BytecodeInterpreterCompressed::run (this=0x7fe3841504f0, this@entry=0x7fe35928d000, vmThread=0xfffffffffffffe00)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:10459
#50 0x00007fe38a89daa9 in bytecodeLoopCompressed (currentThread=<optimized out>) at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.inc:112
#51 0x00007fe38a96be22 in c_cInterpreter () at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/build/linux-x86_64-normal-server-release/vm/runtime/vm/xcinterp.s:158
#52 0x00007fe38a822adc in sendClinit (currentThread=0x7fe38a8700ba <javaLookupMethodImpl+954>, currentThread@entry=0x24e800, clazz=clazz@entry=0x2d0400)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/callin.cpp:446
#53 0x00007fe38a82b20b in initializeImpl (currentThread=currentThread@entry=0x24e800, clazz=clazz@entry=0x2d0400)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:120
#54 0x00007fe38a82b621 in classInitStateMachine (currentThread=currentThread@entry=0x24e800, clazz=<optimized out>, clazz@entry=0x2d0400, desiredState=desiredState@entry=J9_CLASS_INIT_INITIALIZED)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:765
#55 0x00007fe38a82cd5a in initializeClass (currentThread=currentThread@entry=0x24e800, clazz=clazz@entry=0x2d0400)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/ClassInitialization.cpp:341
#56 0x00007fe38a87d01a in resolveStaticFieldRefInto (vmStruct=vmStruct@entry=0x24e800, method=0x0, ramCP=ramCP@entry=0x2b4710, cpIndex=cpIndex@entry=35, resolveFlags=resolveFlags@entry=32, resolvedField=0x0, ramCPEntry=0x24e900)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/resolvesupport.cpp:757
#57 0x00007fe38a87d4d3 in resolveStaticFieldRef (vmStruct=0x24e800, method=<optimized out>, ramCP=0x2b4710, cpIndex=35, resolveFlags=32, resolvedField=<optimized out>)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/resolvesupport.cpp:877
#58 0x00007fe38a8bc384 in VM_BytecodeInterpreterCompressed::getstatic (_pc=<optimized out>, _sp=<optimized out>, this=<optimized out>)
    at /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-linux-x64-openj9-IBM/workspace/build/src/openj9/runtime/vm/BytecodeInterpreter.hpp:7060

main thread is waiting for notification https://github.com/eclipse-openj9/openj9/blob/8f80c760e721f3f55092f67fe296c6d24886e06d/runtime/vm/ClassInitialization.cpp#L824 while Thread-1 was halted at single-threaded mode.

FYI @tajila

pshipton commented 1 year ago

This was just closed, but it's still seen.

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.functional_aarch64_linux_OMR_testList_0/368 cmdLineTester_criu_nonPortableRestore_2

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk11_j9_sanity.functional_aarch64_linux_OMR_testList_0/368/functional_test_output.tar.gz

Testing: Create and Restore Criu Checkpoint Image once - MethodTypeDeadlockTest
Test start time: 2023/06/22 05:40:08 Eastern Standard Time
Running command: bash /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_aarch64_linux_OMR_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_aarch64_linux_OMR_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_aarch64_linux_OMR_testList_0/openjdkbinary/j2sdk-image/bin/java " -Xjit:count=0 -XX:+CRIURestoreNonPortableMode  -XX:+ThrowOnDelayedCheckpointOperation -Xtrace:print=j9criu.17 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED" org.openj9.criu.DeadlockTest MethodTypeDeadlockTest 1
Time spent starting: 16 milliseconds
***[TEST INFO 2023/06/22 05:45:08] ProcessKiller detected a timeout after 300000 milliseconds!***
INFO: Cannot find '/usr/bin/gdb' using 'gdb' from the path.
***[TEST INFO 2023/06/22 05:45:08] executing gdb -batch -x /tmp/debugger6986529856874546744.txt bash 1619355***
java.io.IOException: Cannot run program "gdb": error=2, No such file or directory
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1128)
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1071)
    at java.base/java.lang.Runtime.exec(Runtime.java:592)
    at java.base/java.lang.Runtime.exec(Runtime.java:451)
    at Test$ProcessKiller.captureCoreForProcess(Test.java:697)
    at Test$ProcessKiller.captureCoreForProcess(Test.java:646)
    at Test$ProcessKiller.run(Test.java:596)
Caused by: java.io.IOException: error=2, No such file or directory
    at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:340)
    at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:271)
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1107)
    ... 6 more
***[TEST INFO 2023/06/22 05:45:08] executing kill -ABRT 1619355***
***[TEST INFO 2023/06/22 05:45:08] kill -ABRT signal sent***
***[TEST INFO 2023/06/22 05:45:08] ABRT completed***
Output from test:
 [OUT] start running script
 [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
 [OUT] export LD_BIND_NOT=on
 [OUT] /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_aarch64_linux_OMR_testList_0/openjdkbinary/j2sdk-image/bin/java -XX:+EnableCRIUSupport  -Xjit:count=0 -XX:+CRIURestoreNonPortableMode  -XX:+ThrowOnDelayedCheckpointOperation -Xtrace:print=j9criu.17 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED -cp /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_aarch64_linux_OMR_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criu.jar org.openj9.criu.DeadlockTest MethodTypeDeadlockTest 1
Time spent executing: 300079 milliseconds
Test result: FAILED
Output from test:
pshipton commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_0/542 cmdLineTester_criu_nonPortableRestore_0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_0/542/functional_test_output.tar.gz

Testing: Create and Restore Criu Checkpoint Image once - MethodTypeDeadlockTest
Test start time: 2023/06/22 04:29:08 Atlantic Standard Time
Running command: bash /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_0/openjdkbinary/j2sdk-image/bin/java " -Xjit -XX:+CRIURestoreNonPortableMode  -XX:+ThrowOnDelayedCheckpointOperation -Xtrace:print=j9criu.17 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED" org.openj9.criu.DeadlockTest MethodTypeDeadlockTest 1
Time spent starting: 6 milliseconds
Time spent executing: 3062 milliseconds
Test result: FAILED
Output from test:
 [OUT] start running script
 [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
 [OUT] export LD_BIND_NOT=on
 [OUT] /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_0/openjdkbinary/j2sdk-image/bin/java -XX:+EnableCRIUSupport  -Xjit -XX:+CRIURestoreNonPortableMode  -XX:+ThrowOnDelayedCheckpointOperation -Xtrace:print=j9criu.17 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED -cp /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criu.jar org.openj9.criu.DeadlockTest MethodTypeDeadlockTest 1
 [OUT] Pre-checkpoint
 [OUT] Performing CRIUSupport.checkpointJVM(), current thread name: main, Thu Jun 22 04:29:09 ADT 2023, System.currentTimeMillis(): 1687418949444, System.nanoTime(): 4209146697471705
 [OUT] 07:29:09.557*0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.558 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.561 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.562 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.568 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.569 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.571 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.572 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.573 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.575 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.576 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.579 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.580 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.582 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.583 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.585 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.586 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.588 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.589 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.591 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.592 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.594 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.606 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.607 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.609 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.610 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.612 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.613 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.615 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.616 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.618 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.619 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.621 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.622 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.624 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.626 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.627 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.629 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.630 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.632 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FC4C8 due to delay code 140126050239992
 [OUT] 07:29:09.633 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.635 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.636 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.638 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.639 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.640 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.642 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.643 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.646 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000310028 due to delay code 140126050239992
 [OUT] 07:29:09.647 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.648 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.650 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.651 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.654 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000311EA8 due to delay code 140126050239992
 [OUT] 07:29:09.656 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.657 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.659 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.660 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.666 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.668 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.669 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.671 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.672 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.674 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.675 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.677 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.678 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.680 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.681 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.683 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.684 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.686 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.687 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.688 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.690 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.691 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.693 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.694 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.696 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.697 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.699 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.700 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.702 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.703 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.705 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.707 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.708 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.709 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.711 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.712 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.714 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.715 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.717 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.718 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.719 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.721 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.723 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.724 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.725 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.727 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.728 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] 07:29:09.730 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000291D40 due to delay code 140126050239992
 [OUT] org.eclipse.openj9.criu.JVMCheckpointException: The JVM attempted to checkpoint but was unable to due to code being executed in @NotCheckpointSafe frames. Retry attempts: 100
 [OUT]  at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.checkpointJVMImpl(Native Method)
 [OUT]  at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.checkpointJVM(CRIUSupport.java:658)
 [OUT]  at org.openj9.criu.CRIUTestUtils.checkPointJVM(CRIUTestUtils.java:77)
 [OUT]  at org.openj9.criu.DeadlockTest.methodTypeDeadlockTest(DeadlockTest.java:205)
 [OUT]  at org.openj9.criu.DeadlockTest.main(DeadlockTest.java:58)
 [OUT] TEST FAILED
 [OUT] Error (criu/util.c:479): Can't read link of fd -404: No such file or directory
 [OUT] Error (criu/protobuf.c:75): Unexpected EOF on (null)
 [OUT] Removed test output files
 [OUT] finished script
>> Success condition was not found: [Output match: TEST PASSED]
>> Failure condition was found: [Output match: TEST FAILED]
>> Required condition was found: [Output match: Pre-checkpoint]
>> Success condition was found: [Output match: Checkpoint blocked because thread]
>> Required condition was not found: [Output match: Killed]
>> Failure condition was not found: [Output match: CRIU is not enabled]
>> Failure condition was not found: [Output match: Operation not permitted]
>> Success condition was not found: [Output match: Thread pid mismatch]
>> Success condition was not found: [Output match: do not match expected]
>> Success condition was not found: [Output match: Unable to create a thread:]
>> Failure condition was found: [Output match: TEST FAILED]
>> Failure condition was not found: [Output match: Could not dump the JVM processes, err=-70]
pshipton commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_1/542 cmdLineTester_criu_nonPortableRestore_4

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_1/542/functional_test_output.tar.gz

Testing: Create and Restore Criu Checkpoint Image once - MethodTypeDeadlockTest
Test start time: 2023/06/22 04:31:22 Atlantic Standard Time
Running command: bash /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_1/openjdkbinary/j2sdk-image/bin/java " -Xgcpolicy:optavgpause  -XX:+ThrowOnDelayedCheckpointOperation -Xtrace:print=j9criu.17 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED" org.openj9.criu.DeadlockTest MethodTypeDeadlockTest 1
Time spent starting: 2 milliseconds
Time spent executing: 3033 milliseconds
Test result: FAILED
Output from test:
 [OUT] start running script
 [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
 [OUT] export LD_BIND_NOT=on
 [OUT] /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_1/openjdkbinary/j2sdk-image/bin/java -XX:+EnableCRIUSupport  -Xgcpolicy:optavgpause  -XX:+ThrowOnDelayedCheckpointOperation -Xtrace:print=j9criu.17 --add-opens java.base/jdk.internal.misc=ALL-UNNAMED  --add-exports java.base/openj9.internal.criu=ALL-UNNAMED -cp /home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_Nightly_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criu.jar org.openj9.criu.DeadlockTest MethodTypeDeadlockTest 1
 [OUT] Pre-checkpoint
 [OUT] Performing CRIUSupport.checkpointJVM(), current thread name: main, Thu Jun 22 04:31:23 ADT 2023, System.currentTimeMillis(): 1687419083216, System.nanoTime(): 4209304529479551
 [OUT] 07:31:23.299*0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.301 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.303 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.306 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.311 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FE830 due to delay code 139796426528248
 [OUT] 07:31:23.313 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FFB88 due to delay code 139796426528248
 [OUT] 07:31:23.314 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002FE830 due to delay code 139796426528248
 [OUT] 07:31:23.315 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.317 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000301490 due to delay code 139796426528248
 [OUT] 07:31:23.319 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.321 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F9B38 due to delay code 139796426528248
 [OUT] 07:31:23.323 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000003036C0 due to delay code 139796426528248
 [OUT] 07:31:23.324 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.326 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.327 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.329 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.330 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.332 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.333 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.335 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.337 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.338 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.340 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.341 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.342 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.344 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.345 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.347 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.348 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.349 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.351 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.353 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.354 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.356 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.358 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.359 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.360 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.362 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.363 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.364 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.366 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.368 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.369 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.370 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.372 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.373 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.375 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.376 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.378 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.379 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.381 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.382 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.384 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.385 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.387 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.389 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.390 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.392 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.394 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=00000000002F94C8 due to delay code 139796426528248
 [OUT] 07:31:23.395 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.397 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.399 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.401 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.403 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.404 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.405 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.407 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.408 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.410 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.412 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000307F28 due to delay code 139796426528248
 [OUT] 07:31:23.413 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.415 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.416 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.418 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.419 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.422 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=0000000000309E28 due to delay code 139796426528248
 [OUT] 07:31:23.424 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.425 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.427 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.428 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.430 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.431 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.433 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.434 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.436 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.437 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.438 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.440 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.441 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.442 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.444 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.446 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.447 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.448 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.450 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.451 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.453 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.454 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.456 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.457 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.458 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] 07:31:23.460 0x17000          j9criu.17       - Checkpoint blocked because thread=0000000000217000 is in method=000000000028B490 due to delay code 139796426528248
 [OUT] org.eclipse.openj9.criu.JVMCheckpointException: The JVM attempted to checkpoint but was unable to due to code being executed in @NotCheckpointSafe frames. Retry attempts: 100
 [OUT]  at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.checkpointJVMImpl(Native Method)
 [OUT]  at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.checkpointJVM(CRIUSupport.java:658)
 [OUT]  at org.openj9.criu.CRIUTestUtils.checkPointJVM(CRIUTestUtils.java:77)
 [OUT]  at org.openj9.criu.DeadlockTest.methodTypeDeadlockTest(DeadlockTest.java:205)
 [OUT]  at org.openj9.criu.DeadlockTest.main(DeadlockTest.java:58)
 [OUT] TEST FAILED
 [OUT] Error (criu/util.c:479): Can't read link of fd -404: No such file or directory
 [OUT] Error (criu/protobuf.c:75): Unexpected EOF on (null)
 [OUT] Removed test output files
 [OUT] finished script
>> Success condition was not found: [Output match: TEST PASSED]
>> Failure condition was found: [Output match: TEST FAILED]
>> Required condition was found: [Output match: Pre-checkpoint]
>> Success condition was found: [Output match: Checkpoint blocked because thread]
>> Required condition was not found: [Output match: Killed]
>> Failure condition was not found: [Output match: CRIU is not enabled]
>> Failure condition was not found: [Output match: Operation not permitted]
>> Success condition was not found: [Output match: Thread pid mismatch]
>> Success condition was not found: [Output match: do not match expected]
>> Success condition was not found: [Output match: Unable to create a thread:]
>> Failure condition was found: [Output match: TEST FAILED]
>> Failure condition was not found: [Output match: Could not dump the JVM processes, err=-70]
pshipton commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.functional_x86-64_linux_aot_Personal/170/tapResults/ cmdLineTester_criu_nonPortableRestore_4 cmdLineTester_criu_nonPortableRestore_6

tajila commented 10 months ago

@singh264 Please take a look at this.

First step is to see if the issue can still be reproduced.

singh264 commented 10 months ago

@tajila The issue can still be reproduced for

tajila commented 10 months ago

Do you have the stack trace of the thread delaying the checkpoint?

singh264 commented 10 months ago

I do not have the stack trace of the thread delaying the checkpoint. I was unable to generate a system core file when the test failed as I did not include -Xdump:java+system:events=systhrow,filter=org/eclipse/openj9/criu/JVMCheckpointException when I ran the tests with make _cmdLineTester_criu_nonPortableRestore_0. To obtain a system core file and inspect the stack trace of the thread delaying the checkpoint, I am rerunning the test with make _cmdLineTester_criu_nonPortableRestore_0 EXTRA_OPTIONS=--Xdump:java+system:events=throw,filter=org/eclipse/openj9/criu/JVMCheckpointException.

tajila commented 10 months ago

The javacore

[OUT] JVMDUMP010I Java dump written to /root/openj9_issues_15806/openj9-openjdk-jdk11/openj9/test/TKG/output_16994153537388/cmdLineTester_criu_nonPortableRestore_5/javacore.20231107.195104.1697590.0001.txt

Also has Java stack traces

singh264 commented 10 months ago

Do you have the stack trace of the thread delaying the checkpoint?

During the test failure below, Thread-2 seems to be delaying the checkpoint as it is completing the method java/security/Security.<clinit>()V.

Testing: Create and Restore Criu Checkpoint Image once - MethodTypeDeadlockTest
...
 [OUT] 22:09:36.696 0x1aa00          j9criu.17       - Checkpoint blocked because thread=00000000001BBF00 is in method=000000000029F830 due to delay code 2
...
 [OUT] org.eclipse.openj9.criu.JVMCheckpointException: The JVM attempted to checkpoint but was unable to due to code being executed in @NotCheckpointSafe frames. Retry attempts: 100
 [OUT]  at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.checkpointJVMImpl(Native Method)
 [OUT]  at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.checkpointJVM(CRIUSupport.java:811)
 [OUT]  at org.openj9.criu.CRIUTestUtils.checkPointJVM(CRIUTestUtils.java:77)
 [OUT]  at org.openj9.criu.DeadlockTest.methodTypeDeadlockTest(DeadlockTest.java:205)
 [OUT]  at org.openj9.criu.DeadlockTest.main(DeadlockTest.java:58)
...
> !threads
    !stack 0x0001aa00   !j9vmthread 0x0001aa00  !j9thread 0x7fedf0007660    tid 0x57177 (356727) // (main)
    !stack 0x0001e600   !j9vmthread 0x0001e600  !j9thread 0x7fedf00d8360    tid 0x57179 (356729) // (JIT Compilation Thread-000)
    !stack 0x00022300   !j9vmthread 0x00022300  !j9thread 0x7fedf00d88d8    tid 0x5717a (356730) // (JIT Compilation Thread-001 Suspended)
    !stack 0x00025f00   !j9vmthread 0x00025f00  !j9thread 0x7fedf00dc490    tid 0x5717b (356731) // (JIT Compilation Thread-002 Suspended)
    !stack 0x00029c00   !j9vmthread 0x00029c00  !j9thread 0x7fedf00dca08    tid 0x5717c (356732) // (JIT Compilation Thread-003 Suspended)
    !stack 0x0002d800   !j9vmthread 0x0002d800  !j9thread 0x7fedf00dd9a0    tid 0x5717d (356733) // (JIT Compilation Thread-004 Suspended)
    !stack 0x00031500   !j9vmthread 0x00031500  !j9thread 0x7fedf00ddf18    tid 0x5717e (356734) // (JIT Compilation Thread-005 Suspended)
    !stack 0x00035100   !j9vmthread 0x00035100  !j9thread 0x7fedf00deeb0    tid 0x5717f (356735) // (JIT Compilation Thread-006 Suspended)
    !stack 0x00038e00   !j9vmthread 0x00038e00  !j9thread 0x7fedf00df428    tid 0x57180 (356736) // (JIT Compilation Thread-007 Suspended)
    !stack 0x0003ca00   !j9vmthread 0x0003ca00  !j9thread 0x7fedf00e03c0    tid 0x57181 (356737) // (JIT Compilation Thread-008 Suspended)
    !stack 0x00040700   !j9vmthread 0x00040700  !j9thread 0x7fedf00e0938    tid 0x57182 (356738) // (JIT Compilation Thread-009 Suspended)
    !stack 0x00044300   !j9vmthread 0x00044300  !j9thread 0x7fedf00e18d0    tid 0x57183 (356739) // (JIT Compilation Thread-010 Suspended)
    !stack 0x00048000   !j9vmthread 0x00048000  !j9thread 0x7fedf00e1e48    tid 0x57184 (356740) // (JIT Compilation Thread-011 Suspended)
    !stack 0x0004bc00   !j9vmthread 0x0004bc00  !j9thread 0x7fedf00e2de0    tid 0x57185 (356741) // (JIT Compilation Thread-012 Suspended)
    !stack 0x0004f900   !j9vmthread 0x0004f900  !j9thread 0x7fedf00e3358    tid 0x57186 (356742) // (JIT Compilation Thread-013 Suspended)
    !stack 0x00053500   !j9vmthread 0x00053500  !j9thread 0x7fedf00e42f0    tid 0x57187 (356743) // (JIT Compilation Thread-014 Suspended)
    !stack 0x00057200   !j9vmthread 0x00057200  !j9thread 0x7fedf00e4868    tid 0x57188 (356744) // (JIT Diagnostic Compilation Thread-015 Suspended)
    !stack 0x0005ae00   !j9vmthread 0x0005ae00  !j9thread 0x7fedf01392f0    tid 0x57189 (356745) // (JIT-SamplerThread)
    !stack 0x0005ea00   !j9vmthread 0x0005ea00  !j9thread 0x7fedf0139868    tid 0x5718a (356746) // (IProfiler)
    !stack 0x0015a800   !j9vmthread 0x0015a800  !j9thread 0x7fedf0165d50    tid 0x5718b (356747) // (Common-Cleaner)
    !stack 0x0018f500   !j9vmthread 0x0018f500  !j9thread 0x7fedf0222ff0    tid 0x5718d (356749) // (Concurrent Mark Helper)
    !stack 0x00193200   !j9vmthread 0x00193200  !j9thread 0x7fedf0223568    tid 0x5718e (356750) // (GC Worker)
    !stack 0x00196e00   !j9vmthread 0x00196e00  !j9thread 0x7fedf0224280    tid 0x5718f (356751) // (GC Worker)
    !stack 0x0019aa00   !j9vmthread 0x0019aa00  !j9thread 0x7fedf02247f8    tid 0x57190 (356752) // (GC Worker)
    !stack 0x0019e700   !j9vmthread 0x0019e700  !j9thread 0x7fedf0225510    tid 0x57191 (356753) // (GC Worker)
    !stack 0x001a2300   !j9vmthread 0x001a2300  !j9thread 0x7fedf0225a88    tid 0x57192 (356754) // (GC Worker)
    !stack 0x001a6000   !j9vmthread 0x001a6000  !j9thread 0x7fedf02267a0    tid 0x57193 (356755) // (GC Worker)
    !stack 0x001a9c00   !j9vmthread 0x001a9c00  !j9thread 0x7fedf0226d18    tid 0x57194 (356756) // (GC Worker)
    !stack 0x00206800   !j9vmthread 0x00206800  !j9thread 0x7fedf0229468    tid 0x57196 (356758) // (Attach API wait loop)
    !stack 0x001bd900   !j9vmthread 0x001bd900  !j9thread 0x7fedf0228ef0    tid 0x57197 (356759) // (Thread-0)
    !stack 0x001bb300   !j9vmthread 0x001bb300  !j9thread 0x7fedf0255f90    tid 0x57198 (356760) // (Thread-1)
    !stack 0x001bbf00   !j9vmthread 0x001bbf00  !j9thread 0x7fedf0256508    tid 0x57199 (356761) // (Thread-2)
    !stack 0x001bcb00   !j9vmthread 0x001bcb00  !j9thread 0x7fedf025f5c0    tid 0x5719a (356762) // (Thread-3)
    !stack 0x0021e700   !j9vmthread 0x0021e700  !j9thread 0x7fedf025fb38    tid 0x5719b (356763) // (Thread-4)
    !stack 0x00222300   !j9vmthread 0x00222300  !j9thread 0x7fedf0268e10    tid 0x5719c (356764) // (Thread-5)
    !stack 0x00225f00   !j9vmthread 0x00225f00  !j9thread 0x7fedf0269388    tid 0x5719d (356765) // (Thread-6)
    !stack 0x00229c00   !j9vmthread 0x00229c00  !j9thread 0x7fedf0275280    tid 0x5719e (356766) // (Thread-7)
    !stack 0x0022d800   !j9vmthread 0x0022d800  !j9thread 0x7fedf02757f8    tid 0x5719f (356767) // (Thread-8)
    !stack 0x00231500   !j9vmthread 0x00231500  !j9thread 0x7fedf027ead0    tid 0x571a0 (356768) // (Thread-9)
    !stack 0x00219b00   !j9vmthread 0x00219b00  !j9thread 0x7fedf027f048    tid 0x571a1 (356769) // (Finalizer thread)
> !stack 0x0001aa00 <----- main
<1aa00>     !j9method 0x00000000002A1028   org/eclipse/openj9/criu/CRIUSupport.checkpointJVMImpl(Ljava/lang/String;ZZZILjava/lang/String;ZLjava/lang/String;ZZZZLjava/lang/String;Ljava/lang/String;)V
<1aa00>     !j9method 0x00000000002A1428   org/eclipse/openj9/criu/CRIUSupport.checkpointJVM()V
<1aa00>     !j9method 0x00000000001F8268   org/openj9/criu/CRIUTestUtils.checkPointJVM(Lorg/eclipse/openj9/criu/CRIUSupport;Ljava/nio/file/Path;Z)V
<1aa00>     !j9method 0x00000000001F7058   org/openj9/criu/DeadlockTest.methodTypeDeadlockTest()V
<1aa00>     !j9method 0x00000000001F6FF8   org/openj9/criu/DeadlockTest.main([Ljava/lang/String;)V
<1aa00>                             JNI call-in frame
<1aa00>                             Native method frame
> !stack 0x001bd900 <----- Thread-0 to Thread-1, Thread-3 to Thread-9
<1bd900>    !j9method 0x0000000000291C68   sun/net/www/protocol/jar/JarFileFactory.get(Ljava/net/URL;Z)Ljava/util/jar/JarFile;
<1bd900>    !j9method 0x0000000000291048   sun/net/www/protocol/jar/JarURLConnection.connect()V
<1bd900>    !j9method 0x0000000000291068   sun/net/www/protocol/jar/JarURLConnection.getInputStream()Ljava/io/InputStream;
<1bd900>    !j9method 0x00000000000FE298   java/net/URL.openStream()Ljava/io/InputStream;
<1bd900>    !j9method 0x0000000000083408   java/lang/ClassLoader.getResourceAsStream(Ljava/lang/String;)Ljava/io/InputStream;
<1bd900>    !j9method 0x00000000001F7078   org/openj9/criu/DeadlockTest.getClassBytesFromResource(Ljava/lang/Class;)[B
<1bd900>    !j9method 0x00000000001F7138   org/openj9/criu/DeadlockTest.lambda$methodTypeDeadlockTest$4(Lorg/openj9/criu/TestResult;)V
<1bd900>    !j9method 0x00000000001BB098   org/openj9/criu/DeadlockTest$$Lambda$5/0x0000000000000000.run()V
<1bd900>    !j9method 0x0000000000080780   java/lang/Thread.run()V
<1bd900>                            JNI call-in frame
<1bd900>                            Native method frame
> !stack 0x001bbf00 <----- Thread-2
<1bbf00>    !j9method 0x00000000002A7F38   sun/security/jca/ProviderConfig.getProvider()Ljava/security/Provider;
<1bbf00>    !j9method 0x00000000002A7178   sun/security/jca/ProviderList.getProvider(Ljava/lang/String;)Ljava/security/Provider;
<1bbf00>    !j9method 0x00000000002A7058   sun/security/jca/ProviderList.remove(Lsun/security/jca/ProviderList;Ljava/lang/String;)Lsun/security/jca/ProviderList;
<1bbf00>    !j9method 0x000000000029F570   java/security/Security.removeProvider(Ljava/lang/String;)V
<1bbf00>    !j9method 0x00000000002A33E0   openj9/internal/criu/security/CRIUConfigurator.setCRIUSecMode(Ljava/util/Properties;)V
<1bbf00>    !j9method 0x000000000029F450   java/security/Security.initialize()V
<1bbf00>    !j9method 0x000000000029FD68   java/security/Security$1.run()Ljava/lang/Void;
<1bbf00>    !j9method 0x000000000029FD88   java/security/Security$1.run()Ljava/lang/Object;
<1bbf00>    !j9method 0x00000000000D07D0   java/security/AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;
<1bbf00>    !j9method 0x000000000029F830   java/security/Security.<clinit>()V
<1bbf00>                            JNI call-in frame
<1bbf00>                            Generic special frame
<1bbf00>    !j9method 0x000000000029E1B8   sun/security/util/SecurityProperties.getOverridableProperty(Ljava/lang/String;)Ljava/lang/String;
<1bbf00>    !j9method 0x000000000029E198   sun/security/util/SecurityProperties.privilegedGetOverridable(Ljava/lang/String;)Ljava/lang/String;
<1bbf00>    !j9method 0x000000000029D968   sun/security/util/FilePermCompat.<clinit>()V
<1bbf00>                            JNI call-in frame
<1bbf00>                            Generic special frame
<1bbf00>    !j9method 0x000000000029C7B8   java/io/FilePermission.init(I)V
<1bbf00>    !j9method 0x000000000029C7D8   java/io/FilePermission.<init>(Ljava/lang/String;Ljava/lang/String;)V
<1bbf00>    !j9method 0x000000000025F298   sun/net/www/protocol/file/FileURLConnection.getPermission()Ljava/security/Permission;
<1bbf00>    !j9method 0x0000000000291CE8   sun/net/www/protocol/jar/JarFileFactory.getPermission(Ljava/util/jar/JarFile;)Ljava/security/Permission;
<1bbf00>    !j9method 0x0000000000291CA8   sun/net/www/protocol/jar/JarFileFactory.getCachedJarFile(Ljava/net/URL;)Ljava/util/jar/JarFile;
<1bbf00>    !j9method 0x0000000000291C68   sun/net/www/protocol/jar/JarFileFactory.get(Ljava/net/URL;Z)Ljava/util/jar/JarFile;
<1bbf00>    !j9method 0x0000000000291048   sun/net/www/protocol/jar/JarURLConnection.connect()V
<1bbf00>    !j9method 0x0000000000291068   sun/net/www/protocol/jar/JarURLConnection.getInputStream()Ljava/io/InputStream;
<1bbf00>    !j9method 0x00000000000FE298   java/net/URL.openStream()Ljava/io/InputStream;
<1bbf00>    !j9method 0x0000000000083408   java/lang/ClassLoader.getResourceAsStream(Ljava/lang/String;)Ljava/io/InputStream;
<1bbf00>    !j9method 0x00000000001F7078   org/openj9/criu/DeadlockTest.getClassBytesFromResource(Ljava/lang/Class;)[B
<1bbf00>    !j9method 0x00000000001F7138   org/openj9/criu/DeadlockTest.lambda$methodTypeDeadlockTest$4(Lorg/openj9/criu/TestResult;)V
<1bbf00>    !j9method 0x00000000001BB098   org/openj9/criu/DeadlockTest$$Lambda$5/0x0000000000000000.run()V
<1bbf00>    !j9method 0x0000000000080780   java/lang/Thread.run()V
<1bbf00>                            JNI call-in frame
<1bbf00>                            Native method frame
tajila commented 10 months ago

Is Thread-2 blocked on anything? use !stackslots to determine which bytecode its running and whether it is blocked.

singh264 commented 10 months ago

Is Thread-2 blocked on anything? use !stackslotsto determine which bytecode its running and whether it is blocked.

The bytecodes running in Thread-2 seems to indicate that it is not blocked.

> !stackslots 0x001bbf00
...
<1bbf00> Bytecode frame: bp = 0x0000000000289F10, sp = 0x0000000000289EF8, pc = 0x00007FED79804930, cp = 0x00000000002A7620, arg0EA = 0x0000000000289F30, flags = 0x0000000000000000
<1bbf00>    Method: sun/security/jca/ProviderConfig.getProvider()Ljava/security/Provider; !j9method 0x00000000002A7F38
<1bbf00>    Bytecode index = 4 
<1bbf00>    Sync object for synchronized method
<1bbf00>        Sync O-Slot[0x0000000000289F18] = 0x00000007061F99E0
<1bbf00>    Using local mapper
<1bbf00>    Locals starting at 0x0000000000289F30 for 0x0000000000000003 slots
<1bbf00>        O-Slot: a0[0x0000000000289F30] = 0x00000007061F99E0
<1bbf00>        I-Slot: t1[0x0000000000289F28] = 0x00000007061259E8
<1bbf00>        I-Slot: t2[0x0000000000289F20] = 0x0000000000000000
<1bbf00>    Pending stack starting at 0x0000000000289EF8 for UDATA(0x0000000000000001) slots
<1bbf00>        O-Slot: p0[0x0000000000289EF8] = 0x00000007061FAD68
...
!bytecodes 0x00000000002A7F38
...
    0 aload0getfield 
    1 getfield 26 sun/security/jca/ProviderConfig.provName Ljava/lang/String;
    4 invokestatic 45 openj9/internal/security/RestrictedSecurity.isProviderAllowed(Ljava/lang/String;)Z <-----
    7 ifne 12
   10 aconstnull 
   11 syncReturn1 
...
singh264 commented 10 months ago

@tajila Could experimenting with further increasing the time that the checkpoint thread waits for the other threads that are initializing a class before taking a checkpoint be a good approach to solve the problem?

It seems like increasing the checkpoint thread's wait time to 1 ms in https://github.com/eclipse-openj9/openj9/pull/17449 was not enough to allow the other threads to complete initializing the class before the checkpoint thread threw the JVMCheckpointException.

tajila commented 10 months ago

It seems like increasing the checkpoint thread's wait time to 1 ms in https://github.com/eclipse-openj9/openj9/pull/17449 was not enough to allow the other threads to complete initializing the class before the checkpoint thread threw the JVMCheckpointException.

Give it a try

tajila commented 9 months ago

Closing this as we have address all known issues