eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

DirectByteBufferLoadTest_0 fails (extended.system-JDK11-aix_ppc-64_cmprssptrs) #3933

Open smlambert opened 5 years ago

smlambert commented 5 years ago

This occurred during the last system test job before the AIX machine's tmp dir filled up and ran out of disk space (reported in #3924). There were other failures in that test job, but those were due to the machine not having enough space to proceed with further testing.

Have retrieved core files and can share with whoever needs them.

Snippet from job id 93 test output:

DBLT 00:18:54.506 - First failure detected by thread: load-4. Running test: JUnit[net.adoptopenjdk.test.nio2.path.PathDirectoryStreamTest]. Creating java dumps.
DBLT stderr The assert subroutine failed: 0, file ../../gc_glue_java/CompactSchemeFixupObject.cpp, line 96
DBLT stderr JVMDUMP039I Processing dump event "abort", detail "" at 2018/12/05 00:18:54 - please wait.
DBLT stderr JVMDUMP032I JVM requested System dump using '/home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/core.20181205.001854.17367256.0002.dmp' in response to an event
STF 00:18:54.538 - Found dump at: /home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/scripts/testKitGen/../../../TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/core.20181205.001854.17367256.0002.dmp
DBLT stderr core file generated - /home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/scripts/testKitGen/../../../TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/core.20181205.001854.17367256.0002.dmp
DBLT stderr JVMDUMP010I System dump written to /home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/core.20181205.001854.17367256.0002.dmp
DBLT stderr JVMDUMP032I JVM requested Java dump using '/home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/javacore.20181205.001854.17367256.0003.txt' in response to an event
STF 00:19:06.350 - Found dump at: /home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/scripts/testKitGen/../../../TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/javacore.20181205.001854.17367256.0003.txt
DBLT stderr javacore file generated - /home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/scripts/testKitGen/../../../TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/javacore.20181205.001854.17367256.0003.txt
DBLT stderr JVMDUMP010I Java dump written to /home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/javacore.20181205.001854.17367256.0003.txt
DBLT stderr JVMDUMP032I JVM requested Snap dump using '/home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/Snap.20181205.001854.17367256.0004.trc' in response to an event
DBLT stderr JVMDUMP010I Snap dump written to /home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/Snap.20181205.001854.17367256.0004.trc
DBLT stderr JVMDUMP007I JVM Requesting JIT dump using '/home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/jitdump.20181205.001854.17367256.0005.dmp'
DBLT stderr JVMDUMP010I JIT dump written to /home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/jitdump.20181205.001854.17367256.0005.dmp
DBLT stderr JVMDUMP013I Processed dump event "abort", detail "".
STF 00:19:07.152 - Found dump at: /home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/scripts/testKitGen/../../../TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/Snap.20181205.001854.17367256.0004.trc
DBLT stderr Snap file generated - /home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/scripts/testKitGen/../../../TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/results/Snap.20181205.001854.17367256.0004.trc
STF 00:19:07.296 - FAILED Process DBLT ended with exit code (1) and not the expected exit code/s (0)
STF 00:19:07.296 - Monitoring Report Summary:
STF 00:19:07.296 - o Process DBLT has crashed unexpectedly
STF 00:19:07.296 - Killing processes: DBLT
STF 00:19:07.296 - o Process DBLT is not running
FAILED at step 1 (Run DirectByteBuffer load test). Expected return value=0 Actual=1 at /home/u0020236/workspace/Test-extended.system-JDK11-aix_ppc-64_cmprssptrs/openjdk-tests/TestConfig/scripts/testKitGen/../../../TestConfig/test_output_15439907297540/DirectByteBufferLoadTest_0/20181205-001850-DirectByteBufferLoadTest/execute.pl line 92.
STF 00:19:07.398 - FAILED execute script failed. Expected return value=0 Actual=1
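
For triage, the two useful facts in this output are the assert location and the list of dumps that were written. A minimal sketch of pulling them out with regular expressions follows; the function name `summarize` and the shortened `results/` path prefix are hypothetical, and the patterns are only modelled on the message shapes seen in this snippet, not on any documented log format.

```python
import re

# Sample messages modelled on the log above. The "results/" prefix is a
# shortened, hypothetical stand-in for the full workspace path.
LOG = (
    "DBLT stderr The assert subroutine failed: 0, file "
    "../../gc_glue_java/CompactSchemeFixupObject.cpp, line 96\n"
    "DBLT stderr JVMDUMP010I System dump written to "
    "results/core.20181205.001854.17367256.0002.dmp\n"
    "DBLT stderr JVMDUMP010I Java dump written to "
    "results/javacore.20181205.001854.17367256.0003.txt\n"
    "DBLT stderr JVMDUMP010I Snap dump written to "
    "results/Snap.20181205.001854.17367256.0004.trc\n"
)

# Patterns inferred from the messages in this snippet only.
ASSERT_RE = re.compile(r"The assert subroutine failed: \S+, file (\S+), line (\d+)")
DUMP_RE = re.compile(r"JVMDUMP010I (\w+) dump written to (\S+)")


def summarize(log):
    """Return ((file, line) of the failed assert or None, {dump kind: path})."""
    match = ASSERT_RE.search(log)
    assert_site = (match.group(1), int(match.group(2))) if match else None
    dumps = {kind: path for kind, path in DUMP_RE.findall(log)}
    return assert_site, dumps


site, dumps = summarize(LOG)
print(site)
for kind, path in sorted(dumps.items()):
    print(kind, path)
```
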

smlambert commented 5 years ago

javacore.20181205.001854.17367256.0003.txt

pshipton commented 5 years ago

Start with GC, since the log shows: The assert subroutine failed: 0, file ../../gc_glue_java/CompactSchemeFixupObject.cpp, line 96. @dmitripivkine

dmitripivkine commented 5 years ago

This assertion means non-object memory is being treated as an object. There are usually two possible causes: a stale reference or a corrupted heap.

dmitripivkine commented 5 years ago

Do we have a system core for this failure?

smlambert commented 5 years ago

Associated core, jitdump and trc files temporarily housed here: https://drive.google.com/open?id=1bCoK8n40oReRG74060iAgCbF-v8-Mc7e

Please message me directly if you have issues getting the files.

dmitripivkine commented 5 years ago

Thank you. I have downloaded the files. However, there is a problem: DDR is not supported on AIX, so there is not much I can do in a native debugger. I believe the only way to investigate this problem is to try to reproduce it in an environment where DDR is available. The closest candidates would be OpenJ9 for pLinux and IBM J9 for AIX (but there is no Java 11 build).

dmitripivkine commented 5 years ago

From this core:

0x00000000ffe1cad0:  beb275b1 00000000 30253800 00000000
0x00000000ffe1cae0:  ffe06ed8 ffe06ed8 304b77ff efb22000
0x00000000ffe1caf0:  ffefb070 ffefb080 ffefb130 ffefb210
0x00000000ffe1cb00:  00000000 00000000 30128700 00000001

It looks like the crash occurred on an attempt to treat 0xffe1caf0 as the beginning of an object, but it is not one. The object seems to actually start at 0xffe1cae8; however, its flags byte value 0xff is illegal.
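
The reading of the dump above can be checked mechanically. The sketch below is only an illustration, under two assumptions not stated in the thread: each printed group is one 32-bit word at consecutive 4-byte addresses, and the flags byte is the low 8 bits of the header word. The helper names `parse_dump` and `flags_byte` are hypothetical.

```python
# Hex dump lines copied from the comment above (address, then four words).
DUMP = """\
0x00000000ffe1cad0:  beb275b1 00000000 30253800 00000000
0x00000000ffe1cae0:  ffe06ed8 ffe06ed8 304b77ff efb22000
0x00000000ffe1caf0:  ffefb070 ffefb080 ffefb130 ffefb210
0x00000000ffe1cb00:  00000000 00000000 30128700 00000001
"""

WORD_SIZE = 4  # assumption: each printed group is one 32-bit word


def parse_dump(text):
    """Map the address of every word in the dump to its 32-bit value."""
    words = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        addr_text, _, rest = line.partition(":")
        base = int(addr_text, 16)
        for i, group in enumerate(rest.split()):
            words[base + i * WORD_SIZE] = int(group, 16)
    return words


def flags_byte(word):
    """Assumption: the flags occupy the low 8 bits of the header word."""
    return word & 0xFF


words = parse_dump(DUMP)
header = words[0xFFE1CAE8]  # suspected real start of the object
print(hex(header), hex(flags_byte(header)))  # prints: 0x304b77ff 0xff
```

Under those assumptions, the word at 0xffe1cae8 is 0x304b77ff and its low byte is 0xff, which matches the illegal flags value described above.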

dmitripivkine commented 5 years ago

Ran 120 test iterations on pLinux: no reproductions (all passed).

smlambert commented 5 years ago

Excluded for now on AIX: https://github.com/AdoptOpenJDK/openjdk-tests/pull/730

dmitripivkine commented 5 years ago

@smlambert DDR for AIX should be available now, so we can investigate a system core. Would you please launch a grinder for this test in an attempt to reproduce it?

smlambert commented 5 years ago

20x grinder: https://ci.eclipse.org/openj9/view/Test/job/Test-Grinder/149/

pshipton commented 5 years ago

@smlambert I don't think the grinder worked; it didn't run anything.

smlambert commented 5 years ago

Ha! I guess I need to point to a branch where the test is re-enabled... https://ci.eclipse.org/openj9/view/Test/job/Test-Grinder/163/

smlambert commented 5 years ago

No failures in 20x Grinder/163, or in 50x Grinder/164. Thought it occurred more regularly than that... will keep trying.
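
As a rough way to judge what 70 clean runs say about how often the failure occurs, the sketch below computes the chance of seeing no failures for a few per-run failure rates. The rates are hypothetical, and the calculation assumes runs are independent with a constant failure rate, which may not hold for an intermittent GC problem.

```python
def prob_no_failures(failure_rate, runs):
    """Chance of zero failures in `runs` independent runs at `failure_rate`."""
    return (1.0 - failure_rate) ** runs


GRINDER_RUNS = 20 + 50  # Grinder/163 plus Grinder/164

# Hypothetical per-run failure rates, chosen only for illustration.
for rate in (0.01, 0.05, 0.10):
    chance = prob_no_failures(rate, GRINDER_RUNS)
    print(f"rate={rate:.2f}: P(no failures in {GRINDER_RUNS} runs) = {chance:.4f}")
```

With a 1% per-run failure rate, 70 clean runs would still happen roughly half the time, so a rare failure cannot be ruled out by these two grinders alone.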