eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

HCRLateAttachWorkload_0 crash vmState=0x0002000f #17133

Open pshipton opened 1 year ago

pshipton commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Test_openjdk8_j9_extended.system_x86-64_windows_Nightly_testList_1/494

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk8_j9_extended.system_x86-64_windows_Nightly_testList_1/494/system_test_output.tar.gz

22:56:14  LT  22:56:13.439 - Completed 3.4%. Number of tests started=513
22:56:31  STF 22:56:30.348 - Found dump at: C:\Users\jenkins\workspace\Test_openjdk8_j9_extended.system_x86-64_windows_Nightly_testList_1\aqa-tests\TKG\output_16809209393969\HCRLateAttachWorkload_0\20230407-225543-HCRLateAttachWorkload\results\core.20230407.225630.1604.0001.dmp
22:56:31  LT  stderr Unhandled exception
22:56:31  LT  stderr Type=Segmentation error vmState=0x0002000f
22:56:31  LT  stderr Windows_ExceptionCode=c0000005 J9Generic_Signal=00000004 ExceptionAddress=00007FFFA77423F3 ContextFlags=0010005f
22:56:31  LT  stderr Handler1=00007FFFAA9AEAF0 Handler2=00007FFFADD1AA50 InaccessibleReadAddress=0000000000000000
22:56:31  LT  stderr RDI=0000000000000000 RSI=0000000000000000 RAX=0000000000000000 RBX=0000027210EAA318
22:56:31  LT  stderr RCX=0000000099669966 RDX=0000000000000000 R8=00000000004E3210 R9=00000000004E03D0
22:56:31  LT  stderr R10=000000D6A53FEF70 R11=000000D6A53FED10 R12=0000000000000000 R13=000002720F763FF0
22:56:31  LT  stderr R14=000000D6A53FF300 R15=0000000000000000
22:56:31  LT  stderr RIP=00007FFFA77423F3 RSP=000000D6A53FECA0 RBP=000000D6A53FED10 EFLAGS=0000000000010246
22:56:31  LT  stderr FS=0053 ES=002B DS=002B
22:56:31  LT  stderr XMM0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM6 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr XMM15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:31  LT  stderr Module=C:\Users\jenkins\workspace\Test_openjdk8_j9_extended.system_x86-64_windows_Nightly_testList_1\openjdkbinary\j2sdk-image\jre\bin\default\j9gc29.dll
22:56:31  LT  stderr Module_base_address=00007FFFA7610000 Offset_in_DLL=00000000001323f3
22:56:31  LT  stderr Target=2_90_20230407_527 (Windows Server 2019 10.0 build 17763)
22:56:31  LT  stderr CPU=amd64 (4 logical CPUs) (0x3fff77000 RAM)
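
As a quick sanity check on the log above, the reported Offset_in_DLL should equal ExceptionAddress minus Module_base_address. A minimal Python sketch, with the three values copied from the log; the variable names are illustrative only:

```python
# Values copied from the crash log above.
exception_address = 0x00007FFFA77423F3    # ExceptionAddress / RIP
module_base_address = 0x00007FFFA7610000  # Module_base_address of j9gc29.dll
reported_offset = 0x00000000001323F3      # Offset_in_DLL as printed in the log

# The faulting instruction's offset inside the module.
offset_in_dll = exception_address - module_base_address

print(hex(offset_in_dll), offset_in_dll == reported_offset)
```

The same subtraction can be used to map the faulting address onto symbols for j9gc29.dll when the matching debug information is available.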

pshipton commented 1 year ago

@dmitripivkine fyi

pshipton commented 1 year ago

The native stack is:

ntdll!NtWaitForSingleObject+0x14
KERNELBASE!WaitForSingleObjectEx+0x93
j9prt29!omrdump_create+0x300
j9dmp29!doSystemDump+0xa3
j9dmp29!protectedDumpFunction+0x15
j9prt29!runInTryExcept+0x16
j9prt29!omrsig_protect+0x210
j9dmp29!runDumpAgent+0x2f1
j9dmp29!triggerDumpAgents+0x53d
j9vm29!generateDiagnosticFiles+0x1ef
j9prt29!runInTryExcept+0x16
j9prt29!omrsig_protect+0x210
j9vm29!vmSignalHandler+0x1d2
j9vm29!structuredSignalHandlerVM+0x41
j9prt29!mainVectoredExceptionHandler+0x154
ntdll!RtlInitializeCriticalSectionAndSpinCount+0x1c6
ntdll!RtlWalkFrameChain+0x1119
ntdll!KiUserExceptionDispatcher+0x2e

dmitripivkine commented 1 year ago

GC Check discovers the problem:

Checking THREAD STACKS...  <gc check (1): from debugger: THREAD STACKS: slot 36f900(4e3370) -> 4e3210: class pointer is null>

> !stackslots 0x36f900
<36f900> *** BEGIN STACK WALK, flags = 00400001 walkThread = 0x000000000036F900 ***
<36f900>    ITERATE_O_SLOTS
<36f900>    RECORD_BYTECODE_PC_OFFSET
<36f900> Initial values: walkSP = 0x00000000004E32B8, PC = 0x0000000000000006, literals = 0x0000000000000000, A0 = 0x00000000004E3378, j2iFrame = 0x0000000000000000, ELS = 0x000000D6A60FFB50, decomp = 0x0000000000000000
<36f900> JIT JNI call-out frame: bp = 0x00000000004E32D8, sp = 0x00000000004E32B8, pc = 0x0000000000000006, cp = 0x000000000004F390, arg0EA = 0x00000000004E3378, flags = 0x0000000020000000
<36f900>    Method: java/lang/Thread.sleepImpl(JI)V !j9method 0x00000000000505B0
<36f900> JIT inline frame: bp = 0x00000000004E3328, pc = 0x00007FFF9408EBB8, unwindSP = 0x00000000004E32E0, cp = 0x000000000004F390, arg0EA = 0x0000000000000000, jitInfo = 0x00000272249780A8
<36f900>    Method: java/lang/Thread.sleep(JI)V !j9method 0x0000000000050590
<36f900>    Bytecode index = 2, inlineDepth = 2, PC offset = 0x00007FFF9408E6ED
<36f900> JIT inline frame: bp = 0x00000000004E3328, pc = 0x00007FFF9408EBB8, unwindSP = 0x00000000004E32E0, cp = 0x000000000004F390, arg0EA = 0x0000000000000000, jitInfo = 0x00000272249780A8
<36f900>    Method: java/lang/Thread.sleep(J)V !j9method 0x0000000000050570
<36f900>    Bytecode index = 2, inlineDepth = 1, PC offset = 0x00007FFF9408E6ED
<36f900> JIT frame: bp = 0x00000000004E3328, pc = 0x00007FFF9408EBB8, unwindSP = 0x00000000004E32E0, cp = 0x00000000001E53C0, arg0EA = 0x00000000004E3340, jitInfo = 0x00000272249780A8
<36f900>    Method: net/adoptopenjdk/test/hcrAgent/agent/TransformerMakerThread.sleepNow(J)V !j9method 0x00000000001E4F88
<36f900>    Bytecode index = 2, inlineDepth = 0, PC offset = 0x00000000000000D8
<36f900>    stackMap=0x00000272249781DB, slots=I16(0x0003) parmBaseOffset=I16(0x0020), parmSlots=U16(0x0000), localBaseOffset=I16(0xFFF0)
<36f900>    Described JIT temps starting at 0x00000000004E3318 for IDATA(0x0000000000000002) slots
<36f900>        O-Slot: : t1[0x00000000004E3318] = 0x0000000000000000
<36f900>        O-Slot: : t0[0x00000000004E3320] = 0x0000000000000000
<36f900>    JIT-RegisterMap = UDATA(0x0000000000000000)
<36f900>    JIT-Frame-RegisterMap[0x00000000004E3308] = UDATA(0x00000007FF3BE3C0) (jit_rbx)
<36f900>    JIT-Frame-RegisterMap[0x00000000004E3310] = UDATA(0x0000000000000027) (jit_r9)
<36f900> I2J values: PC = 0x0000027224CD3517, A0 = 0x00000000004E3378, walkSP = 0x00000000004E3350, literals = 0x00000000001E4F68, JIT PC = 0x00007FFFA8048BB0, pcAddress = 0x000000D6A60FFB78, decomp = 0x0000000000000000
<36f900> Bytecode frame: bp = 0x00000000004E3360, sp = 0x00000000004E3350, pc = 0x0000027224CD3517, cp = 0x00000000001E53C0, arg0EA = 0x00000000004E3378, flags = 0x0000000000000000
<36f900>    Method: net/adoptopenjdk/test/hcrAgent/agent/TransformerMakerThread.run()V !j9method 0x00000000001E4F68
<36f900>    Bytecode index = 123
<36f900>    Using local mapper
<36f900>    Locals starting at 0x00000000004E3378 for 0x0000000000000003 slots
<36f900>        O-Slot: a0[0x00000000004E3378] = 0x00000007FE513640
<36f900>        O-Slot: t1[0x00000000004E3370] = 0x00000000004E3210 <-- problematic O-slot
<36f900>        I-Slot: t2[0x00000000004E3368] = 0x0000000000000000
<36f900> JNI call-in frame: bp = 0x00000000004E33A0, sp = 0x00000000004E3380, pc = 0x00007FFFAAA85A50, cp = 0x0000000000000000, arg0EA = 0x00000000004E33A0, flags = 0x0000000000000000
<36f900>    New ELS = 0x0000000000000000
<36f900> JNI native method frame: bp = 0x00000000004E33C8, sp = 0x00000000004E33A8, pc = 0x0000000000000007, cp = 0x0000000000000000, arg0EA = 0x00000000004E33C8, flags = 0x0000000000000000
<36f900> <end of stack>
<36f900> *** END STACK WALK (rc = NONE) ***

0x004E3200 :  0000000020000001 0000000700020ae0 [ ... ............ ]
0x004E3210 :  0000000000000000 0000000000980000 [ ................ ] <--- not an object
0x004E3220 :  0000000000000000 00007fff942c07ad [ ..........,..... ]

There is a bad O-slot in the bytecode frame for net/adoptopenjdk/test/hcrAgent/agent/TransformerMakerThread.run()V !j9method 0x00000000001E4F68.
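
The "class pointer is null" finding can be reproduced from the three dump lines above. The following Python sketch is only an illustration: it assumes the first word at an object's address holds its class pointer, and `parse_dump` is a hypothetical helper, not part of any OpenJ9 tool.

```python
# Three lines copied from the memory dump above (ASCII column dropped).
dump = """\
0x004E3200 :  0000000020000001 0000000700020ae0
0x004E3210 :  0000000000000000 0000000000980000
0x004E3220 :  0000000000000000 00007fff942c07ad
"""

def parse_dump(text):
    """Map each 8-byte-aligned address to the 64-bit word shown for it."""
    words = {}
    for line in text.splitlines():
        addr_text, _, rest = line.partition(":")
        base = int(addr_text.strip(), 16)
        for i, word in enumerate(rest.split()):
            words[base + 8 * i] = int(word, 16)
    return words

words = parse_dump(dump)

slot_value = 0x4E3210            # value held in O-slot t1 at 0x4E3370
header_word = words[slot_value]  # assumed to be where the class pointer lives
class_pointer_is_null = header_word == 0

print(hex(slot_value), "header word:", hex(header_word),
      "null class pointer:", class_pointer_is_null)
```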

The problematic address 0x4e3210 lies within the range of this thread's Java stack, so it might be a stack-allocated object:

> !j9javastack 0x00000000004E03D0
J9JavaStack at 0x4e03d0 {
  Fields for J9JavaStack:
    0x0: U64* end = !j9x 0x00000000004E33D0 <--- starts at 0x4E2BD0, so 0x4e3210 is in range
    0x8: U64 size = 0x0000000000000800 (2048)
    0x10: class J9JavaStack* previous = !j9javastack 0x0000000000000000
    0x18: U64 firstReferenceFrame = 0x0000000000000000 (0)
}

However, the top frame's sp = 0x4E32B8 is above the problematic address 0x4e3210, so the value points below the part of the stack that is actually in use.
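
The two range checks above can be written out as a short Python sketch. It is not taken from the VM: it assumes, as the annotation on the `end` field does, that `size` is a byte count, and that the stack grows toward lower addresses so the in-use region is [sp, end). All variable names are illustrative.

```python
# Values copied from the !j9javastack and !stackslots output above.
stack_end = 0x4E33D0   # J9JavaStack.end
stack_size = 0x800     # J9JavaStack.size, treated as bytes (assumption)
top_sp = 0x4E32B8      # sp of the top (JIT JNI call-out) frame
bad_value = 0x4E3210   # contents of the problematic O-slot

stack_start = stack_end - stack_size

# Inside the memory reserved for this thread's Java stack?
in_allocated_range = stack_start <= bad_value < stack_end

# Inside the part of the stack currently occupied by frames?
in_live_range = top_sp <= bad_value < stack_end

print("stack start:", hex(stack_start))
print("in allocated range:", in_allocated_range)
print("in live range:", in_live_range)
```

Under these assumptions the address falls inside the allocated stack but 0xA8 bytes below the top frame's sp, which matches the observation that it cannot be a live stack-allocated object.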

dmitripivkine commented 1 year ago

@tajila FYI