eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.28k stars 721 forks source link

jit_jitt_array_compress_3 test failure with JITServer #14286

Closed mpirvu closed 2 years ago

mpirvu commented 2 years ago

Recently, jit_jitt_array_compress_3 started to fail on all platforms when JITServer is used Example of a failure: https://hyc-runtimes-jenkins.swg-devops.com/job/Test_openjdk8_j9_sanity.functional_x86-64_linux_jit_Personal/838/tapResults/

===============================================
Running test jit_jitt_array_compress_3 ...
===============================================
jit_jitt_array_compress_3 Start Time: Fri Jan 14 05:04:08 2022 Epoch Time (ms): 1642154648508
variation: -Xjit:disableAsyncCompilation,count=1 -XXgc:forcedShiftingCompressionAmount=4 -Xmx512m
JVM_OPTIONS: -XX:+UseJITServer -Xjit:disableAsyncCompilation,count=1 -XXgc:forcedShiftingCompressionAmount=4 -Xmx512m 
Unhandled exception
Type=Segmentation error vmState=0x00000000
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
Handler1=00007F490189F8F0 Handler2=00007F49015FDB30 InaccessibleAddress=0000000007200724
RDI=00007F48A946E189 RSI=00000000000E4AE0 RAX=0000000007200724 RBX=0000000000016F00
RCX=0000000000000004 RDX=0000000000000034 R8=7FFFFFFFFFFFFFFF R9=00007F48FC012D10
R10=00007F48FC04CE08 R11=00007F4903669040 R12=00000000000E4AD8 R13=000000000000FFFF
R14=00007F48A946E188 R15=00007F49032C86B0
RIP=00007F49018F1B37 GS=0000 FS=0000 RSP=00007F49032C8340
EFlags=0000000000010206 CS=0033 RBP=0000000000000000 ERR=0000000000000004
TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000007200724
xmm0 0000000000000005 (f: 5.000000, d: 2.470328e-323)
xmm1 000000003e3ab283 (f: 1044034176.000000, d: 5.158214e-315)
xmm2 00007f49032c86e0 (f: 53249760.000000, d: 6.914526e-310)
xmm3 00000000beabc041 (f: 3198926848.000000, d: 1.580480e-314)
xmm4 000000003e2a0f24 (f: 1042943808.000000, d: 5.152827e-315)
xmm5 000000003950f4bf (f: 961606848.000000, d: 4.750969e-315)
xmm6 0000000000000006 (f: 6.000000, d: 2.964394e-323)
xmm7 b81c120080b62b00 (f: 2159422208.000000, d: -2.062281e-38)
xmm8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm9 000000003e17cee7 (f: 1041747712.000000, d: 5.146917e-315)
xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm11 0000ff0000000000 (f: 0.000000, d: 1.385239e-309)
xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
Module=/home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_x86-64_linux_jit_Personal_testList_0/openjdkbinary/j2sdk-image/jre/lib/amd64/default/libj9vm29.so
Module_base_address=00007F4901861000
Target=2_90_20220114_932 (Linux 3.10.0-327.el7.x86_64)
CPU=amd64 (4 logical CPUs) (0x1e4447000 RAM)
----------- Stack Backtrace -----------
(0x00007F49018F1B37 [libj9vm29.so+0x90b37])
(0x00007F49018E94A5 [libj9vm29.so+0x884a5])
(0x00007F4901992E92 [libj9vm29.so+0x131e92])
---------------------------------------
JVMDUMP039I Processing dump event "gpf", detail "" at 2022/01/14 05:04:14 - please wait.
JVMDUMP032I JVM requested System dump using '/home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_x86-64_linux_jit_Personal_testList_0/aqa-tests/TKG/output_16421511853263/jit_jitt_array_compress_3/core.20220114.050414.23559.0001.dmp' in response to an event
JVMPORT030W /proc/sys/kernel/core_pattern setting "|/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h" specifies that the core dump is to be piped to an external program.  Attempting to rename either core or core.23588.

JVMDUMP010I System dump written to /home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_x86-64_linux_jit_Personal_testList_0/aqa-tests/TKG/output_16421511853263/jit_jitt_array_compress_3/core.20220114.050414.23559.0001.dmp
JVMDUMP032I JVM requested Java dump using '/home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_x86-64_linux_jit_Personal_testList_0/aqa-tests/TKG/output_16421511853263/jit_jitt_array_compress_3/javacore.20220114.050414.23559.0002.txt' in response to an event
JVMDUMP010I Java dump written to /home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_x86-64_linux_jit_Personal_testList_0/aqa-tests/TKG/output_16421511853263/jit_jitt_array_compress_3/javacore.20220114.050414.23559.0002.txt
JVMDUMP032I JVM requested Snap dump using '/home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_x86-64_linux_jit_Personal_testList_0/aqa-tests/TKG/output_16421511853263/jit_jitt_array_compress_3/Snap.20220114.050414.23559.0003.trc' in response to an event
JVMDUMP010I Snap dump written to /home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_x86-64_linux_jit_Personal_testList_0/aqa-tests/TKG/output_16421511853263/jit_jitt_array_compress_3/Snap.20220114.050414.23559.0003.trc
JVMDUMP032I JVM requested JIT dump using '/home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_x86-64_linux_jit_Personal_testList_0/aqa-tests/TKG/output_16421511853263/jit_jitt_array_compress_3/jitdump.20220114.050414.23559.0004.dmp' in response to an event
JVMDUMP051I JIT dump occurred in 'main' thread 0x0000000000016F00
JVMDUMP010I JIT dump written to /home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_x86-64_linux_jit_Personal_testList_0/aqa-tests/TKG/output_16421511853263/jit_jitt_array_compress_3/jitdump.20220114.050414.23559.0004.dmp
JVMDUMP013I Processed dump event "gpf", detail "".

jit_jitt_array_compress_3_FAILED

Optional info

mpirvu commented 2 years ago

I have reproduced the crash locally and determined that the crash is due to the -XXgc:forcedShiftingCompressionAmount=4 presence on the command line at the client. If this option is removed, the test passes.

mpirvu commented 2 years ago

I have limited the problem to just one method. I can reproduce the issue with "-Xshareclasses:none -Xjit:disableAsyncCompilation,limit={sun/net/www/ParseUtil.encodePath*},initialOptLevel=cold,disableInlining,traceFull,log=log.txt -XXgc:forcedShiftingCompressionAmount=4 -Xmx512m -XX:+UseJITServer"

mpirvu commented 2 years ago

I used compilation logs and enableJITServerFollowRemoteCompileWithLocalCompile option to perform two back to back compilations with logs, one remote and one local. I compared the two logs and post optimizations tress looks the same. I see a difference in newarray allocation inlining: local:

 [0x7f457ff7f820]   lea GPR_0047, dword ptr [2*GPR_0044+0x10]       # LEA4RegMem, SymRef [#404 +16]
 [0x7f457ff7f8b0]   Label L0052:            # label # (Start of internal control flow)
 [0x7f457ff821f0]   cmp GPR_0047, 0x3fffffff    # CMP8RegImm4
 [0x7f457ff82280]   jae Outlined Label L0054            # JAE4
 [0x7f457ff823b0]   mov GPR_0048, qword ptr [GPR_0000+0x60]     # L8RegMem, SymRef [#405 +96]
 [0x7f457ff82440]   mov GPR_0045, GPR_0047      # MOV4RegReg
 [0x7f457ff824d0]   cmp GPR_0045, 0x00000001    # CMP4RegImm4
 [0x7f457ff82560]   adc GPR_0045, 0x00000000    # ADC4RegImm4
 [0x7f457ff825f0]   shl GPR_0045, 0x01  # SHL8RegImm1
 [0x7f457ff82680]   add GPR_0045, 0x0000001f    # ADD8RegImm4
 [0x7f457ff82710]   and GPR_0045, 0xfffffffffffffff0    # AND8RegImm4

remote:

 [0x7f62c423f820]   lea GPR_0080, dword ptr [2*GPR_0060+0x10]       # LEA4RegMem, SymRef [#404 +16]
 [0x7f62c423f8b0]   Label L0080:            # label # (Start of internal control flow)
 [0x7f62c42421f0]   cmp GPR_0080, 0x3fffffff    # CMP8RegImm4
 [0x7f62c4242280]   jae Outlined Label L0082            # JAE4
 [0x7f62c42423b0]   mov GPR_0081, qword ptr [GPR_0000+0x60]     # L8RegMem, SymRef [#405 +96]
 [0x7f62c4242440]   mov GPR_0064, GPR_0080      # MOV4RegReg
 [0x7f62c42424d0]   cmp GPR_0064, 0x00000001    # CMP4RegImm4
 [0x7f62c4242560]   adc GPR_0064, 0x00000000    # ADC4RegImm4
 [0x7f62c42425f0]   shl GPR_0064, 0x01  # SHL8RegImm1
 [0x7f62c4242680]   add GPR_0064, 0x00000017    # ADD8RegImm4
 [0x7f62c4242710]   and GPR_0064, 0xfffffffffffffff8    # AND8RegImm4

I traced this code to static void genHeapAlloc2() from J9TreeEvaluator.cpp. It appears that 0x17 from [0x7f62c4242680] add GPR_0064, 0x00000017 # ADD8RegImm4 is generated from allocationSizeOrDataOffset+disp32 and disp32 is from round-1 and round is from round = (elementSize >= TR::Compiler->om.objectAlignmentInBytes())? 0 : TR::Compiler->om.objectAlignmentInBytes(); Thus, the remote compiler sees TR::Compiler->om.objectAlignmentInBytes() being 8 while the local compiler sees 16.

Interestingly enough, if I print _vmInfo->_objectAlignmentInBytes at the server, I get 16, so at this point I don't know how the server ends up with 8.

mpirvu commented 2 years ago

It turns out that the codegen uses TR::Compiler->om.getObjectAlignmentInBytes() which is a new implementation from https://github.com/eclipse-openj9/openj9/pull/14110 and is not overridden for JITServer. Thus, the server gets the wrong value of the object alignment.

mpirvu commented 2 years ago

The correct implementation for getObjectAlignmentInBytes() would include a test for JITServer, something along the lines:

int32_t
J9::ObjectModel::getObjectAlignmentInBytes()
{
#if defined(J9VM_OPT_JITSERVER)
   if (auto stream = TR::CompilationInfo::getStream())
      {
      auto *vmInfo = TR::compInfoPT->getClientData()->getOrCacheVMInfo(stream);
      return vmInfo->_objectAlignmentInBytes;
      }
#endif /* defined(J9VM_OPT_JITSERVER) */
   return _objectAlignmentInBytes;
}
mpirvu commented 2 years ago

@cjjdespres Could you please provide a fix for this issue? Thanks