Closed pshipton closed 8 hours ago
@hzongaro pls take a look.
Issue Number: 20257 Status: Open Recommended Components: comp:vm, comp:gc, comp:test Recommended Assignees: pshipton, fengxue-is, gacholio
@a7ehuo, may I ask you to take a look at this? I wasn't able to build KCA, and I'm having difficulty doing much with the core file that was produced, so I'm not sure whether this is happening on the AOT load of Object.hashCode
or a compilation of it that was started after the AOT load:
05:56:46.425107386 0x1B700 j9jit.2 Event (cold) Compiling java/lang/Object.hashCode()I
05:56:46.425237837 0x1B700 j9jit.45 Event + (AOT load) java/lang/Object.hashCode()I @ 0xffff7b8077f8-0xffff7b807904/0x0-0xffff7b807904 time=191us 0x31368 0xffff797bf288 Q_SZ=4 bcsz=5 mem=0KB
05:56:46.501843072 0x1B700 omrport.256 Exception * omrfile_open returns failure, filename = /home/jenkins/workspace/Test_openjdk11_j9_sanity.system_aarch64_linux_Release_testList_1/jdkbinary/j2sdk-image/lib/java_en_US.properties, os errno = 2, portable errno = -108
05:56:46.501850134 0x1B700 omrport.256 Exception * omrfile_open returns failure, filename = /home/jenkins/workspace/Test_openjdk11_j9_sanity.system_aarch64_linux_Release_testList_1/jdkbinary/j2sdk-image/lib/java_en.properties, os errno = 2, portable errno = -108
05:56:46.504066334 0x1B700 j9dmp.9 Event Preparing for dump, filename=/home/jenkins/workspace/Test_openjdk11_j9_sanity.system_aarch64_linux_Release_testList_1/aqa-tests/TKG/output_17275821255677/PatModImg_PlatMod_0/20240929-055635-PatModImgTest/results/core.20240929.055646.3880611.0001.dmp
One thing I can see is that it looks like the crash is happening on an inlined call to getClassDepth
from jitMethodTranslated
. It looks like it might be trying to get a subclass from java/lang/Object
.
(gdb) info line *0x0000FFFF9E4E0A8C
Line 329 of "/home/jenkins/workspace/Build_JDK11_aarch64_linux_Release/openj9/runtime/oti/VMHelpers.hpp"
starts at address 0xffff9e4e0a8c <jitMethodTranslated(J9VMThread*, J9Method*, void*)+140>
and ends at 0xffff9e4e0a90 <jitMethodTranslated(J9VMThread*, J9Method*, void*)+144>.
(gdb) x/8i 0x0000FFFF9E4E0A8C
0xffff9e4e0a8c: ldrh w0, [x1, #24] <<< Crash happens here with x1=0
0xffff9e4e0a90: cmp x5, x0
0xffff9e4e0a94: b.cs 0xffff9e4e0a24 // b.hs, b.nlast
I also cannot get the backtrace from the core. So far I got more or less what Henry has found.
(gdb) bt
#0 0x0000ffff9f4cb308 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
I believe it crashes in the second call to VM_VMHelpers::getClassDepth
in jitMethodTranslated
during the do-while loop condition check because it tries to dereference classDepthAndFlags
from a NULL currentClass
.
0x0000ffff9e4e0a70 <+112>: tbz w0, #9, 0xffff9e4e0a78 <jitMethodTranslated(J9VMThread*, J9Method*, void*)+120> // CodertVMHelpers.cpp:205
0x0000ffff9e4e0a74 <+116>: ldr x1, [x22, #944] // CodertVMHelpers.cpp:206
0x0000ffff9e4e0a78 <+120>: ldurh w2, [x20, #-2]
0x0000ffff9e4e0a7c <+124>: ldrh w5, [x1, #24] // VM_VMHelpers::getClassDepth
0x0000ffff9e4e0a80 <+128>: add x20, x20, x2 // CodertVMHelpers.cpp:209: jitToJitStartAddress
0x0000ffff9e4e0a84 <+132>: b 0xffff9e4e0a98 <jitMethodTranslated(J9VMThread*, J9Method*, void*)+152> // CodertVMHelpers.cpp:209: jitToJitStartAddress
0x0000ffff9e4e0a88 <+136>: ldr x1, [x1, #176] // CodertVMHelpers.cpp:229: currentClass = currentClass->subclassTraversalLink;
0x0000ffff9e4e0a8c <+140>: ldrh w0, [x1, #24] //<--- crash VM_VMHelpers::getClassDepth
0x0000ffff9e4e0a90 <+144>: cmp x5, x0 // CodertVMHelpers.cpp:230: while (VM_VMHelpers::getClassDepth(currentClass) > initialClassDepth)
x1
is currentClass
: R1=0000000000000000
x19
is method
: R19=0000000000031368
. method
is java/lang/Object.hashCode()I
. The currentClass
starts as 0x31100
(J9_CLASS_FROM_METHOD(method)
). It then iterates through currentClass = currentClass->subclassTraversalLink
. I calculated currentClass
for the first few iterations and subclassTraversalLink
is not NULL so far.
> !j9method 0x31368
J9Method at 0x31368 {
Fields for J9Method:
0x0: U8* bytecodes = !j9x 0x0000FFFF412592AC // "*�"
0x8: struct J9ConstantPool* constantPool = !j9constantpool 0x0000000000031490 (flags = 0x4)
0x10: void* methodRunAddress = !j9x 0x0000000000000018
0x18: volatile void* extra = !j9x 0x0000FFFF7B8077F8
}
Signature: java/lang/Object.hashCode()I !bytecodes 0x0000000000031368
ROM Method: !j9rommethod 0x0000FFFF41259298
Next Method: !j9method 0x0000000000031388
> !j9constantpool 0x0000000000031490
J9ConstantPool at 0x31490 {
Fields for J9ConstantPool:
0x0: struct J9Class* ramClass = !j9class 0x0000000000031100 // java/lang/Object
0x8: struct J9ROMConstantPoolItem* romConstantPool = !j9romconstantpoolitem 0x0000FFFF412590B8
}
> !j9class 0x0000000000031100
J9Class at 0x31100 {
Fields for J9Class:
0x0: UDATA eyecatcher = 0x0000000099669966 (2573637990)
0x8: struct J9ROMClass* romClass = !j9romclass 0x0000FFFF41259000
0x10: void** superclasses = !j9x 0x0000000000031608
0x18: UDATA classDepthAndFlags = 0x00000000001E0000 (1966080)
...
0xb0: struct J9Class* subclassTraversalLink = !j9class 0x00000000000CF900 // jdk/internal/loader/URLClassPath
...
}
Class name: java/lang/Object
To view static fields, use !j9statics 0x0000000000031100
To view instance shape, use !j9classshape 0x0000000000031100
> !j9class 0x00000000000CF900
J9Class at 0xcf900 {
Fields for J9Class:
0x0: UDATA eyecatcher = 0x0000000099669966 (2573637990)
0x8: struct J9ROMClass* romClass = !j9romclass 0x0000FFFF41347A10
0x10: void** superclasses = !j9x 0x00000000000CF818
0x18: UDATA classDepthAndFlags = 0x00000000020E0001 (34471937)
...
0xb0: struct J9Class* subclassTraversalLink = !j9class 0x00000000000CCF00 // java/lang/ClassLoader$LazyInitLock
...
> !j9class 0x00000000000CCF00
J9Class at 0xccf00 {
Fields for J9Class:
0x0: UDATA eyecatcher = 0x0000000099669966 (2573637990)
0x8: struct J9ROMClass* romClass = !j9romclass 0x0000FFFF41346AD8
0x10: void** superclasses = !j9x 0x00000000000CCC88
0x18: UDATA classDepthAndFlags = 0x00000000000E0001 (917505)
...
0xb0: struct J9Class* subclassTraversalLink = !j9class 0x00000000000CCD00 // [Ljava/util/Hashtable$Entry;
...
Grinder (ID 43741) run 150x PatModImg_PlatMod_0
/PatModImg_PlatMod_1
: All passed
FYI the original failure was found on the 0.48 build.
0.48 build Grinder (ID 43777) run 150x PatModImg_PlatMod_0/PatModImg_PlatMod_1
: All passed
Eclipse OpenJ9 VM (build v0.48.0-release-6b84dcc9064, JRE 11 Linux aarch64-64-Bit Compressed References 20241002_1120 (JIT enabled, AOT enabled)
OpenJ9 - 6b84dcc9064
OMR - cfe6789bfa8
JCL - 9d684ea755d based on jdk-11.0.25+7)
I'll remove the milestone.
Should we assign to 0.49 until we are sure it's no longer an issue?
If it's seen enough I will add it back to a milestone. Until it's seen again it doesn't seem there is anything to do.
Another one, different platform. I'll put it back in 0.48 in case we can progress it from this.
https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_ppc64le_linux_Nightly_testList_0/216/ - cent9-ppc64le-6 jdk_lang_0 java/lang/invoke/MethodHandleProxies/MethodHandlesProxiesTest.java
18:20:36 Unhandled exception
18:20:36 Type=Segmentation error vmState=0x0005ffff
18:20:36 J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
18:20:36 Handler1=00007FFF89242630 Handler2=00007FFF89568820
18:20:36 R0=00007FFF8315F68C R1=00007FFF8810A540 R2=00007FFF83FAEA00 R3=0000000000000000
18:20:36 R4=00000000000ADD70 R5=00007FFF6906268C R6=0000000000000000 R7=00000000000ADAC0
18:20:36 R8=00000000000ADCB8 R9=0000000000000000 R10=0000000000000001 R11=00007FFF83FD6790
18:20:36 R12=00007FFF832AC880 R13=00007FFF881168E0 R14=00007FFF89BFD268 R15=0000000000110000
18:20:36 R16=00007FFF8810F160 R17=00007FFF83FD6158 R18=0000000000000001 R19=00007FFF83FD7810
18:20:36 R20=00007FFF8810A9C0 R21=0000000000000000 R22=00007FFF840F5710 R23=0000000000000000
18:20:36 R24=00007FFF83FD6158 R25=0000000000097D00 R26=00007FFF840EF380 R27=00007FFF8429C4F0
18:20:36 R28=00007FFF84035430 R29=0000000000097D00 R30=00007FFF69062688 R31=00000000000CA020
18:20:36 NIP=00007FFF83C702B4 MSR=800000000280D033 ORIG_GPR3=00007FFF83C702C8 CTR=0000000000000000
18:20:36 LINK=00007FFF83C70254 XER=0000000020040000 CCR=0000000026888402 SOFTE=0000000000000001
18:20:36 TRAP=0000000000000300 DAR=0000000000000018 dsisr=0000000040000000 RESULT=0000000000000000
18:20:36 FPR0=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR1=3fe8000000000000 (f: 0.000000, d: 7.500000e-01)
18:20:36 FPR2=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR3=0000002e00000021 (f: 33.000000, d: 9.761181e-313)
18:20:36 FPR4=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR5=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR6=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR7=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR8=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR9=0000000500000004 (f: 4.000000, d: 1.060998e-313)
18:20:36 FPR10=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR11=41cdcd6500000000 (f: 0.000000, d: 1.000000e+09)
18:20:36 FPR12=3f50624dd2f1a9fc (f: 3539053056.000000, d: 1.000000e-03)
18:20:36 FPR13=3980002839ce0028 (f: 969801792.000000, d: 9.861140e-32)
18:20:36 FPR14=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR15=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR16=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR17=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR18=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR19=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR20=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR21=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR22=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR23=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR24=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR25=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR26=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR27=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR28=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR29=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR30=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 FPR31=0000000000000000 (f: 0.000000, d: 0.000000e+00)
18:20:36 Module=/home/jenkins/workspace/Test_openjdk21_j9_sanity.openjdk_ppc64le_linux_Nightly_testList_0/jdkbinary/j2sdk-image/lib/default/libj9jit29.so
18:20:36 Module_base_address=00007FFF83000000
18:20:36
18:20:36 Method_being_compiled=java/util/Collection.stream()Ljava/util/stream/Stream;
18:20:36 Target=2_90_20241003_237 (Linux 5.14.0-503.el9.ppc64le)
18:20:36 CPU=ppc64le (4 logical CPUs) (0x1dade0000 RAM)
18:20:36 ----------- Stack Backtrace -----------
18:20:36 jitMethodTranslated+0xb4 (0x00007FFF83C702B4 [libj9jit29.so+0xc702b4])
18:20:36 _ZN2TR15CompilationInfo14compilationEndEP10J9VMThreadRNS_24IlGeneratorMethodDetailsEP11J9JITConfigPvS7_P11TR_FrontEndP21TR_MethodToBeCompiledPNS_11CompilationE+0x38c (0x00007FFF8315F68C [libj9jit29.so+0x15f68c])
18:20:36 _ZN2TR28CompilationInfoPerThreadBase20postCompilationTasksEP10J9VMThreadP21TR_MethodToBeCompiledP8J9MethodPKvP19J9JITExceptionTablebbP20TR_RelocationRuntime+0x360 (0x00007FFF83171FB0 [libj9jit29.so+0x171fb0])
18:20:36 _ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadP21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x3b0 (0x00007FFF83174430 [libj9jit29.so+0x174430])
18:20:36 _ZN2TR24CompilationInfoPerThread12processEntryER21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x170 (0x00007FFF831749F0 [libj9jit29.so+0x1749f0])
18:20:36 _ZN2TR24CompilationInfoPerThread14processEntriesEv+0x410 (0x00007FFF831731E0 [libj9jit29.so+0x1731e0])
18:20:36 _ZN2TR24CompilationInfoPerThread3runEv+0xa8 (0x00007FFF831737F8 [libj9jit29.so+0x1737f8])
18:20:36 _Z30protectedCompilationThreadProcP13J9PortLibraryPN2TR24CompilationInfoPerThreadE+0xa0 (0x00007FFF831738B0 [libj9jit29.so+0x1738b0])
18:20:36 omrsig_protect+0x3e4 (0x00007FFF89569CD4 [libj9prt29.so+0x39cd4])
18:20:36 _Z21compilationThreadProcPv+0x1a8 (0x00007FFF83173E18 [libj9jit29.so+0x173e18])
18:20:36 thread_wrapper+0x190 (0x00007FFF8999CBC0 [libj9thr29.so+0xcbc0])
18:20:36 start_thread+0x170 (0x00007FFF89CA2C30 [libc.so.6+0xa2c30])
18:20:36 clone+0xa0 (0x00007FFF89D4C6C8 [libc.so.6+0x14c6c8])
18:20:36 ---------------------------------------
I looked at the core from ppc64le. There is no backtrace [1].
This crash looks same as the previous one. It crashes because it dereferences a NULL currentClass
(R9=0000000000000000
) that is retrieved from subclassTraversalLink
[2].
https://github.com/eclipse-openj9/openj9/blob/1153568b78f6fa2ebbb7c929259172c0813c5bba/runtime/codert_vm/CodertVMHelpers.cpp#L229-L230
Same as previous crash, jitMethodTranslated
is called at the end of an AOT load method. However this time, it is a different method java/util/Collection.stream
:
+ (AOT load) java/util/Collection.stream()Ljava/util/stream/Stream; @ 0x7fff69062688-0x7fff6906284c/0x0-0x7fff6906284c time=21us 0xca020 0x7fff62913e78 Q_SZ=91 bcsz=11 mem=0KB
(gdb) list *0x00007FFF83C702B4
0x7fff83c702b4 is in jitMethodTranslated(J9VMThread*, J9Method*, void*) (/home/jenkins/workspace/Build_JDK21_ppc64le_linux_Nightly/openj9/runtime/oti/VMHelpers.hpp:329).
324 in /home/jenkins/workspace/Build_JDK21_ppc64le_linux_Nightly/openj9/runtime/oti/VMHelpers.hpp
(gdb) list *0x00007FFF8315F68C
0x7fff8315f68c is in TR::CompilationInfo::compilationEnd(J9VMThread*, TR::IlGeneratorMethodDetails&, J9JITConfig*, void*, void*, TR_FrontEnd*, TR_MethodToBeCompiled*, TR::Compilation*) (/home/jenkins/workspace/Build_JDK21_ppc64le_linux_Nightly/openj9/runtime/compiler/control/CompilationThread.cpp:10586).
warning: Source file is more recent than executable.
10581 entry->_compInfoPT->setMetadata(NULL);
10582 startPC = oldStartPC;
10583 }
10584 else
10585 {
10586 jitMethodTranslated(vmThread, method, startPC);
10587 #if defined(J9VM_OPT_CRIU_SUPPORT)
10588 if (jitConfig->javaVM->internalVMFunctions->isCheckpointAllowed(jitConfig->javaVM)
...
Based on the logic inside jitMethodTranslated
, I calculated the first six iterations of the loop. subclassTraversalLink
is all good so far. I suspect either classDepthAndFlags
or subclassTraversalLink
or both are not set up correctly somewhere along the class chain and it continues to iterates through the subclasses when it is not supposed to.
With my limited knowledge, I don't think the JIT modifies subclassTraversalLink
or classDepthAndFlags
of a J9Class. @TobiAjila I'm wondering if the VM team could help take a look at it further. Or if there is anything I could check to further the investigation. It'd be much appreciated. Thank you!
iteration 0 [3]
currentClass = 0x000adb00 // java/lang/Object
initialClassDepth = 0
currentClass
initially is 0x00000000000CA100
(java/util/Collection
). Since java/util/Collection
is an interface, currentClass
is reassigned as 0x000adb00
(java/lang/Object
)
iteration 1 [4]
currentClass = 0x0000000000243F00 // sun/nio/fs/AbstractBasicFileAttributeView
iteration 2 [5]
currentClass = 0x0000000000244600 // sun/nio/fs/UnixFileAttribjitStartAddressuteViews$Basic
iteration 3 [6]
currentClass = 0x0000000000244600 // sun/nio/fs/UnixFileAttributeViews$Basic
iteration 4 [7]
currentClass = 0x0000000000243600 // sun/nio/fs/UnixFileAttributeViews
iteration 5 [8]
currentClass = 0x0000000000243200 // java/nio/file/attribute/UserDefinedFileAttributeView
[1]
(gdb) bt
#0 0x00007fff89ca5668 in ?? ()
#1 0x0000000000000000 in ?? ()
[2]
0x00007fff83c70278 <+120>: beq 0x7fff83c70350 <jitMethodTranslated(J9VMThread*, J9Method*, void*)+336>
0x00007fff83c7027c <+124>: ld r9,8(r31) // 204: currentClass = J9_CLASS_FROM_METHOD(method);
0x00007fff83c70280 <+128>: rldicr r9,r9,0,59 // 204: currentClass = J9_CLASS_FROM_METHOD(method)
0x00007fff83c70284 <+132>: ld r9,0(r9) // 204: currentClass = J9_CLASS_FROM_METHOD(method)
0x00007fff83c70288 <+136>: ld r10,8(r9) // 205: if (J9ROMCLASS_IS_INTERFACE(currentClass->romClass))
0x00007fff83c7028c <+140>: lwz r10,16(r10) // 205: if (J9ROMCLASS_IS_INTERFACE(currentClass->romClass))
0x00007fff83c70290 <+144>: andi. r10,r10,512 // 205: if (J9ROMCLASS_IS_INTERFACE(currentClass->romClass))
0x00007fff83c70294 <+148>: beq 0x7fff83c7029c <jitMethodTranslated(J9VMThread*, J9Method*, void*)+156> // 205: if (J9ROMCLASS_IS_INTERFACE(currentClass->romClass))
0x00007fff83c70298 <+152>: ld r9,944(r28) // 206: currentClass = J9VMJAVALANGOBJECT_OR_NULL(vm);
0x00007fff83c7029c <+156>: lhz r10,-2(r30) // getClassDepth
0x00007fff83c702a0 <+160>: lhz r3,24(r9) // getClassDepth
0x00007fff83c702a4 <+164>: add r5,r30,r10 // jitToJitStartAddress
0x00007fff83c702a8 <+168>: b 0x7fff83c702c0 <jitMethodTranslated(J9VMThread*, J9Method*, void*)+192> // jitToJitStartAddress
0x00007fff83c702ac <+172>: ori r2,r2,0 // jitToJitStartAddress
0x00007fff83c702b0 <+176>: ld r9,176(r9) // 229: currentClass = currentClass->subclassTraversalLink;
0x00007fff83c702b4 <+180>: lhz r10,24(r9) //<-- crash
0x00007fff83c702b8 <+184>: cmpld r3,r10
[3]
> !j9method 0x00000000000CA020 |grep constantPool
0x8: struct J9ConstantPool* constantPool = !j9constantpool 0x00000000000CA2A0 (flags = 0x4)
> !j9constantpool 0x00000000000CA2A0 | grep ramClass
0x0: struct J9Class* ramClass = !j9class 0x00000000000CA100 // java/util/Collection
> !j9class 0x00000000000CA100 | grep romClass
0x8: struct J9ROMClass* romClass = !j9romclass 0x00007FFF312948F8
> !j9romclass 0x00007FFF312948F8 | grep modifiers
0x10: U32 modifiers = 0x00000601 (1537)
currentClass = J9VMJAVALANGOBJECT_OR_NULL(vm);
!j9javavm 0x00007FFF84035430
0x00007FFF84035430 + 944
(gdb) x 0x7FFF840357E0
0x7fff840357e0: 0x000adb00
currentClass = 0x000adb00
> !j9class 0x000adb00 | grep classDepthAndFlags
0x18: UDATA classDepthAndFlags = 0x00000000001E0000 (1966080)
[4]
> !j9class 0x000adb00 | grep subclassTraversalLink
0xb0: struct J9Class* subclassTraversalLink = !j9class 0x0000000000243F00 // sun/nio/fs/AbstractBasicFileAttributeView
> !j9class 0x0000000000243F00 | grep classDepthAndFlags
0x18: UDATA classDepthAndFlags = 0x00000000001E0001 (1966081)
[5]
> !j9class 0x0000000000243F00 | grep subclassTraversalLink
0xb0: struct J9Class* subclassTraversalLink = !j9class 0x0000000000244600 // sun/nio/fs/UnixFileAttributeViews$Basic
> !j9class 0x0000000000244600 | grep classDepthAndFlags
0x18: UDATA classDepthAndFlags = 0x00000000020E0002 (34471938)
[6]
> !j9class 0x0000000000244600 | grep subclassTraversalLink
0xb0: struct J9Class* subclassTraversalLink = !j9class 0x0000000000243900 // sun/nio/fs/DynamicFileAttributeView
> !j9class 0x0000000000243900 | grep classDepthAndFlags
0x18: UDATA classDepthAndFlags = 0x00000000001E0001 (1966081)
[7]
> !j9class 0x0000000000243900 | grep subclassTraversalLink
0xb0: struct J9Class* subclassTraversalLink = !j9class 0x0000000000243600 // sun/nio/fs/UnixFileAttributeViews
> !j9class 0x0000000000243600 | grep classDepthAndFlags
0x18: UDATA classDepthAndFlags = 0x00000000000E0001 (917505)
[8]
> !j9class 0x0000000000243600 | grep subclassTraversalLink
0xb0: struct J9Class* subclassTraversalLink = !j9class 0x0000000000243200 // java/nio/file/attribute/UserDefinedFileAttributeView
> !j9class 0x0000000000243200 | grep classDepthAndFlags
0x18: UDATA classDepthAndFlags = 0x00000000000E0001 (917505)
The only thing that comes to mind is there could be an issue if a class was being loaded while that method is called.
FYI @gacholio
If the thread does not have exclusive, there certainly could be new classes loaded during the loop.
If the thread does not have exclusive ...
Just trying to understand, do you mean something like acquireVMAccessNoSuspend before calling jitMethodTranslated
?
inline void acquireVMAccessNoSuspend(J9VMThread *vmThread)
{
vmThread->javaVM->internalVMFunctions->internalAcquireVMAccessWithMask(vmThread, J9_PUBLIC_FLAGS_HALT_THREAD_ANY_NO_JAVA_SUSPEND);
}
We may need to add a mutex for the traversal link uses. Exclusive and VM access act like a mutex, but it would be pretty heavy-handed to use that in this case.
Looks like updates are protected with classTableMutex
so the readers should use that as well.
jitMethodTranslated is called in two places in TR::CompilationInfo::compilationEnd. Its implementation and where they are called have not been changed for a few years.
@mpirvu I'm wondering if you see any concerns of calling classTableMutex
inside jitMethodTranslated
at the end of compilation?
(1) https://github.com/eclipse-openj9/openj9/blob/8357d17869fedf6b29018582bd88fe7b0c63ad31/runtime/compiler/control/CompilationThread.cpp#L10468 (2) https://github.com/eclipse-openj9/openj9/blob/8357d17869fedf6b29018582bd88fe7b0c63ad31/runtime/compiler/control/CompilationThread.cpp#L10586
Marius is on vacation. Please ignore my last comment.
@gacholio I have a change that wraps the loop inside jitMethodTranslated
with classTableMutex
.
https://github.com/eclipse-openj9/openj9/compare/master...a7ehuo:openj9:acquire-classTableMutex
Does it look correct? I don't think the code before the loop needs acquiring the mutex.
I looked at how other parts of VM traverse subclass link such as in verifyJ9ClassSubclassHierarchy or initializeClassLinks. It doesn't look obvious to me that they acquire classTableMutex. Another question is why it would be required in jitMethodTranslated?
It should probably be acquired in all of the reader cases you've found.
After taking a further look at verifyJ9ClassSubclassHierarchy and initializeClassLinks. I think these two should be fine that they don't need to acquire classTableMutex
. The change in jitMethodTranslated
is all needed for now and I'll run some regression test on the change.
(1) verifyJ9ClassSubclassHierarchy
verifyJ9ClassSubclassHierarchy
is triggered through runAllVMChecks
which states "Caller must either have exclusive VM access or must be executing on behalf of a thread that does"
initializeClassLinks
is called inside internalCreateRAMClassDone
which already acquires classTableMutex
. And there is another version of no mutex internalCreateRAMClassDoneNoMutex
.
Your change looks correct to me.
Issue Number: 20257 Status: Closed Actual Components: comp:jit, test failure, segfault Actual Assignees: No one :( PR Assignees: No one :(
Issue Number: 20257 Status: Closed Actual Components: comp:jit, test failure, segfault Actual Assignees: No one :( PR Assignees: No one :(
Re the failure https://github.com/eclipse-openj9/openj9/issues/20257#issuecomment-2393867132 100x grinder across 5 machines passed.
https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.system_aarch64_linux_Release_testList_1/110 PatModImg_PlatMod_0
vmState [0x5ffff]: {J9VMSTATE_JIT} {Illegal optimization number}
https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk11_j9_sanity.system_aarch64_linux_Release_testList_1/110/system_test_output.tar.gz