Open pshipton opened 1 week ago
@dmitripivkine
Issue Number: 20559 Status: Open Recommended Components: comp:gc, comp:vm, comp:test Recommended Assignees: pshipton, jasonfengj9, llxia
This thread got request to execute SystemGC and trying to acquire Exclusive Access:
!stack 0x1002505b100 !j9vmthread 0x1002505b100 !j9thread 0x1002504b888 tid 0x980291 (9962129) // (load-4)
!j9vmthread 0x1002505b100 publicFlags=1020 privateFlags=0 inNative=0 // load-4
16:10:21.075646309 *0x1002505B100 j9vm.368 Event Posted Halt request to 1 threads.
16:10:21.075647719 0x1002505B100 j9vm.369 Event Waiting for 1 thread halt responses.
17:15:05.042548194 *0x100224E2A00 j9vm.243 Exit <J9SignalAsyncEvent returning 0
17:15:05.042548489 0x100224E2A00 j9vm.242 Entry >J9SignalAsyncEvent targetThread=0x10025099a00 handlerKey=2
17:15:05.042548731 0x100224E2A00 j9vm.243 Exit <J9SignalAsyncEvent returning 0
However there is another thread has set Halt flag (0x01) but still keeping VM Access (0x20) and prevents taking Exclusive:
!stack 0x10025099a00 !j9vmthread 0x10025099a00 !j9thread 0x100250898c8 tid 0x32402cf (52691663) // (load-10)
!j9vmthread 0x10025099a00 publicFlags=21 privateFlags=0 inNative=0 // load-10
Looking to this threads is dbx:
$t47 run running 52691663 u no sys
(dbx) thread current 47
(dbx) where
---------
$r0:0x090000000481148c $stkp:0x00000100259009b0 $toc:0x08001000a01ca050
$r3:0x0000010025099a00 $r4:0x000001001218ffcc $r5:0x00000100121e2ea8
$r6:0x0000000000000000 $r7:0x0000000000000000 $r8:0x0a000001d90a2778
$r9:0x00000000004d2a10 $r10:0x0900000004461ca4 $r11:0x0000010022061aa8
$r12:0x000001001218ffbc $r13:0x000001002590a800 $r14:0x0000010025aa87b8
$r15:0x0000010025099a00 $r16:0x0000010021fcb658 $r17:0xffffffffffffffff
$r18:0x0000010025900bf0 $r19:0x420000842509b678 $r20:0x09000000056b62bc
$r21:0x00000100224113e0 $r22:0x0000010025900c00 $r23:0x0000000000000001
$r24:0x09000000056c0168 $r25:0x09001000a0760b90 $r26:0x0a000000004d2a10
$r27:0x0a000001d90a2748 $r28:0x0a000001d90a2748 $r29:0x0a000000004d2a10
$r30:0x0a0000000026e860 $r31:0x0a000001d90a2778
$iar:0x090000000448d100 $msr:0x800000000000d032 $cr:0x0000000044000280
$link:0x0900000004461d60 $ctr:0x0900000004461ca4 $xer:0x0000000020000010
Condition status = 0:g 1:g 5:e 6:l
[unset $noflregs to view floating point registers]
[unset $novregs to view vector registers]
[unset $novsregs to view vector scalar registers]
0 old_slow_jitRetranslateMethod(0x10025099a00) at 0x90000000448d100
So, looks like thread in old_slow_jitRetranslateMethod()
did not drop VM Access and caused the hang.
@hzongaro FYI
I have not looked at the code, but the usual cause of issues like this is a monitor - specifically that VM access should be treated as a monitor enter for ordering purposes.
https://openj9-jenkins.osuosl.org/job/Test_openjdk8_j9_special.system_ppc64_aix_Personal_testList_2/134 - p8-java1-ibm07 LambdaLoadTest_special_J9_5m_8
-Xgcpolicy:gencon -Xshareclasses -Xjit -Xnocompressedrefs
https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk8_j9_special.system_ppc64_aix_Personal_testList_2/134/system_test_output.tar.gz
**FAILED** Process LT has timed out
There are multiple load-* threads waiting in Runtime.gc().