Open 1a2s3d4f1 opened 3 years ago
@dmitripivkine @0xdaryl fyi.
Another balanced gc problem: Java program:HMCL Set -Xmx1024M -Xgcpolicy:balanced gc log:verbosegc.20210206.185105.2000.zip
Set -Xmx1024M -Xgcpolicy:balanced -Xgc:regionSize=16M gc log:verbosegc.20210206.185035.4012.zip
@BeverlyXu : could you work with @1a2s3d4f1 and @dmitripivkine to provide JIT expertise to triage this? The JIT piggybacks on the AIOB mechanism when dealing with discontiguous arrays so there may be a JIT issue here.
@1a2s3d4f1 Thank you for reporting this issue.
We need to figure out which method/object is having an array out of bound issue. Could you help take a core dump using something like this on the top method from the backtrace:
-Xdump:system+java:events=throw,filter=java/lang/ArrayIndexOutOfBoundsException#grondag/canvas/terrain/occlusion/geometry/Area.keyToIndex
If you have more than one exception, you could add this to only generate one core ,range=1..1
.
More on the format to generate dump on exceptions, please see here: https://www.eclipse.org/openj9/tools/xdump_option_builder.html
I am not sure how GC might be involved here. Looks like there are two threads with similar stacks and exceptions. From Snap traces:
03:42:40.148188314 *0x40A3E00 j9vm.294 Entry >setCurrentException index=22 constructorIndex=0 detailMessage=0x0
03:42:40.148196604 0x40A3E00 j9vm.10 Entry >internalSendExceptionConstructor
03:42:40.148216344 0x40A3E00 j9vm.11 Exit <internalSendExceptionConstructor
03:42:40.148230161 0x40A3E00 j9vm.301 Exit <setCurrentException
03:42:40.153794709 *0x36AEB00 j9vm.294 Entry >setCurrentException index=22 constructorIndex=0 detailMessage=0x0
03:42:40.153802605 0x36AEB00 j9vm.10 Entry >internalSendExceptionConstructor
03:42:40.153819976 0x36AEB00 j9vm.11 Exit <internalSendExceptionConstructor
03:42:40.153831819 0x36AEB00 j9vm.301 Exit <setCurrentException
Both threads have similar java stacks, so there is top of one of them:
> !stackslots 0x040a3e00
<40a3e00> *** BEGIN STACK WALK, flags = 00400001 walkThread = 0x00000000040A3E00 ***
<40a3e00> ITERATE_O_SLOTS
<40a3e00> RECORD_BYTECODE_PC_OFFSET
<40a3e00> Initial values: walkSP = 0x00000000047C77F8, PC = 0x0000000000000005, literals = 0x0000000000000000, A0 = 0x00000000047C7818, j2iFrame = 0x00000000047C7958, ELS = 0x000000002FA4F760, decomp = 0x0000000000000000
<40a3e00> JIT resolve frame: bp = 0x00000000047C7818, sp = 0x00000000047C77F8, pc = 0x0000000000000005, cp = 0x0000000000000000, arg0EA = 0x00000000047C7818, flags = 0x0000000000080000
<40a3e00> Object push (savedJITException)
<40a3e00> O-Slot[0x00000000047C77F8] = 0x0000000000000000
<40a3e00> Generic resolve
<40a3e00> unwindSP initialized to 0x00000000047C7820
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F530] = U64(0x00000000FFFFFFF0) (jit_rax)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F538] = U64(0x0000000000000002) (jit_rbx)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F540] = U64(0x00000000043E0E90) (jit_rcx)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F548] = U64(0x000000002FA4F470) (jit_rdx)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F550] = U64(0x0000000000000000) (jit_rdi)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F558] = U64(0x00000000FFFFFFF0) (jit_rsi)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F560] = U64(0x00000000029EF600) (jit_rbp)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F568] = U64(0x0000000000004010) (jit_rsp)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F570] = U64(0x0000000000000001) (jit_r8)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F578] = U64(0x0000000000000005) (jit_r9)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F580] = U64(0x0000000000000000) (jit_r10)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F588] = U64(0x0000000000000001) (jit_r11)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F590] = U64(0x0000000000000004) (jit_r12)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F598] = U64(0x0000000000000008) (jit_r13)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F5A0] = U64(0x0000000000000006) (jit_r14)
<40a3e00> JIT-Resolve-RegisterMap[0x000000002FA4F5A8] = U64(0x0000000000000000) (jit_r15)
<40a3e00> JIT frame: bp = 0x00000000047C7828, pc = 0x000007FECD549031, unwindSP = 0x00000000047C7820, cp = 0x00000000043E0ED0, arg0EA = 0x00000000047C7830, jitInfo = 0x000000004AFF53E8
<40a3e00> Method: grondag/canvas/terrain/occlusion/geometry/Area.keyToIndex(I)I !j9method 0x00000000043E12B8
<40a3e00> Bytecode index = 4, inlineDepth = 0, PC offset = 0x0000000000000019
<40a3e00> stackMap=0x000000004AFF54E1, slots=I16(0x0001) parmBaseOffset=I16(0x0010), parmSlots=U16(0x0000), localBaseOffset=I16(0x0000)
<40a3e00> Described JIT temps starting at 0x00000000047C7828 for IDATA(0x0000000000000000) slots
<40a3e00> JIT-RegisterMap = UDATA(0x0000000000000000)
<40a3e00> JIT-Frame-RegisterMap[0x000000002FA4F538] = U64(0x0000000000000002) (jit_rbx)
<40a3e00> JIT-Frame-RegisterMap[0x000000002FA4F578] = U64(0x0000000000000005) (jit_r9)
<40a3e00> JIT-Frame-RegisterMap[0x000000002FA4F580] = U64(0x0000000000000000) (jit_r10)
<40a3e00> JIT-Frame-RegisterMap[0x000000002FA4F588] = U64(0x0000000000000001) (jit_r11)
<40a3e00> JIT-Frame-RegisterMap[0x000000002FA4F590] = U64(0x0000000000000004) (jit_r12)
<40a3e00> JIT-Frame-RegisterMap[0x000000002FA4F598] = U64(0x0000000000000008) (jit_r13)
<40a3e00> JIT-Frame-RegisterMap[0x000000002FA4F5A0] = U64(0x0000000000000006) (jit_r14)
<40a3e00> JIT-Frame-RegisterMap[0x000000002FA4F5A8] = U64(0x0000000000000000) (jit_r15)
<40a3e00> I2J values: PC = 0x0000000036CDB462, A0 = 0x00000000047C7888, walkSP = 0x00000000047C7838, literals = 0x00000000043E0E90, JIT PC = 0x000007FEE0A93C60, pcAddress = 0x000000002FA4F788, decomp = 0x0000000000000000
<40a3e00> Bytecode frame: bp = 0x00000000047C7848, sp = 0x00000000047C7838, pc = 0x0000000036CDB462, cp = 0x00000000043E0BD0, arg0EA = 0x00000000047C7888, flags = 0x0000000000000000
<40a3e00> Method: grondag/canvas/terrain/occlusion/geometry/AreaFinder.find([JILit/unimi/dsi/fastutil/ints/IntConsumer;)V !j9method 0x00000000043E0E90
<40a3e00> Bytecode index = 62
<40a3e00> Using local mapper
<40a3e00> Locals starting at 0x00000000047C7888 for 0x0000000000000008 slots
<40a3e00> I-Slot: a0[0x00000000047C7888] = 0x00000000D9CF6858
<40a3e00> I-Slot: a1[0x00000000047C7880] = 0x00000000D9CF6868
<40a3e00> I-Slot: a2[0x00000000047C7878] = 0x0000000000000024
<40a3e00> O-Slot: a3[0x00000000047C7870] = 0x00000000D8AA4220
<40a3e00> O-Slot: t4[0x00000000047C7868] = 0x00000000D8AA4E20
<40a3e00> I-Slot: t5[0x00000000047C7860] = 0x000007FE0000001A
<40a3e00> I-Slot: t6[0x00000000047C7858] = 0x000007FEFFFFFFF0
<40a3e00> I-Slot: t7[0x00000000047C7850] = 0x000007FE0000479D
<40a3e00> Bytecode frame: bp = 0x00000000047C78A0, sp = 0x00000000047C7890, pc = 0x0000000036CDA332, cp = 0x00000000043DFEB0, arg0EA = 0x00000000047C78D0, flags = 0x0000000000000000
<40a3e00> Method: grondag/canvas/terrain/occlusion/geometry/BoxFinder.markBoxSlices()V !j9method 0x00000000043E04B8
<40a3e00> Bytecode index = 54
<40a3e00> Using local mapper
<40a3e00> Locals starting at 0x00000000047C78D0 for 0x0000000000000006 slots
<40a3e00> O-Slot: a0[0x00000000047C78D0] = 0x00000000D9CF3750
<40a3e00> O-Slot: t1[0x00000000047C78C8] = 0x00000000D9CF6868
<40a3e00> O-Slot: t2[0x00000000047C78C0] = 0x00000000D9CF6858
<40a3e00> I-Slot: t3[0x00000000047C78B8] = 0x00000000D8A80000
<40a3e00> I-Slot: t4[0x00000000047C78B0] = 0x0000000000000024
<40a3e00> I-Slot: t5[0x00000000047C78A8] = 0x0000000000000009
......
@BeverlyXu @tajila I am not sure what I can check here from GC point of view - everything looks clean. Can somebody investigate what is bad index is, what is array this index for etc. ? There are a few discontiguous arrays in the system but I can not see connection from this stack. I have files downloaded, ask me where. Please let me know if you need help from GC side.
@BeverlyXu : are there any insights you can share from a compiled code perspective on this problem? Perhaps start by finding the object that triggered the AIOB and the site where the AIOB was thrown so we can determine if this is a problem with the spine check at that point.
I'm working on this issue right now. Will post any findings soon.
Moving out to 0.27.
@jdmpapin : could you pick up the investigation on this for the 0.28 release please?
https://mega.nz/folder/jAsDBK4Y#V2Oq4-BGF2PEyjrinElSpQ I Uploaded new dump files(jit dump and core dump).
Java version:
openjdk version "11.0.13" 2021-10-19
IBM Semeru Runtime Open Edition 11.0.13.0 (build 11.0.13+8)
Eclipse OpenJ9 VM 11.0.13.0 (build openj9-0.29.0, JRE 11 Windows 7 amd64-64-Bit Compressed References 20211022_218 (JIT enabled, AOT enabled)
OpenJ9 - e1e72c497
OMR - 299b6a2d2
JCL - 2d83aa3b76 based on jdk-11.0.13+8)
I prepare a Linux virtual machine. Enable Double Mapping can not fix this AIBO problem(Double Mapping can fix HMCL rendering problem). This problem can happend with metronome gc.
Java -version output
Summary of problem
Canvas is a minecraft render mod. The chunk render was broke and crashed when using balanced gc(-Xgcpolicy:balanced)
The chunk render is normal when I using gencon gc. If set -Xmx2048M or set -Xgc:regionSize=1M ,The chunk can render. Maybe it is arraylets problem.
Diagnostic files
crash report
Area.java#L132 AreaFinder.java#L187
gc log(default region size): verbosegc.20210205.211006.2804.zip gc log(1MB region size): verbosegc.20210205.211552.3260.zip
In game screenshot: