Closed pshipton closed 10 months ago
@knn-k fyi
The location of SEGV is in SH_AttachedDataManager::SH_AttachedDataResourceDescriptor()
, trying to store updateAtOffset
to memory.
0xffff933eae60: stp x29, x30, [sp, #-32]!
0xffff933eae64: adrp x4, 0xffff9344f000
0xffff933eae68: mov x0, x1
0xffff933eae6c: mov x29, sp
0xffff933eae70: ldr x4, [x4, #3144]
0xffff933eae74: str x19, [sp, #16]
0xffff933eae78: add x19, x1, #0x8 // x1: cacheItem (arg), x19: adwInCache
0xffff933eae7c: str w2, [x19, #16] // SEGV here, x19=0xFFFF31E0D928, InaccessibleAddress=0xFFFF31E0D938, w2: updateAtOffset (arg)
0xffff933eae80: ldr x1, [x4]
0xffff933eae84: and x1, x1, #0xfffffffffffffffd
0xffff933eae88: cmp x1, #0x1
0xffff933eae8c: b.eq 0xffff933eaec0 // b.none
The address is in the middle of the memory-mapped SCC:
(gdb) info proc mappings
...
0xffff21400000 0xffff34000000 0x12c00000 0x0 /home/jenkins/.cache/javasharedresources/C290M21F2A64P_sharedcc_jenkins_G43L00
...
(gdb) x/20x 0x0FFFF31E0D920
0xffff31e0d920: 0x00000024 0x0007000b 0x00000000 0x005b95b0
0xffff31e0d930: 0x00000004 0x00010002 0xffffffff 0x07d00800 // Trying to overwrite 0xffffffff at 0xffff31e0d938
0xffff31e0d940: 0x00000000 0x00000028 0x0000004c 0x0007000b
0xffff31e0d950: 0x00000000 0x00360dd0 0x00000030 0x00000001
0xffff31e0d960: 0xffffffff 0x00601dba 0x0100180c 0x000005cf
From the javacore file:
2SCLTEXTRCS ROMClass start address = 0x0000FFFF21460000
2SCLTEXTRCE ROMClass end address = 0x0000FFFF23C61858
2SCLTEXTMSA Metadata start address = 0x0000FFFF31AAB0B0
2SCLTEXTCEA Cache end address = 0x0000FFFF34000000
No failures in two Grinder jobs for jdk_lang_1 (20 runs in total): https://openj9-jenkins.osuosl.org/job/Grinder/3078/ https://openj9-jenkins.osuosl.org/job/Grinder/3080/
The VMSTATE is 0x8000c, which means J9VMSTATE_ATTACHEDDATA_UPDATE
.
The Snap file shows the following trace:
23:24:37.730749589 0x0000ffff4c002400 j9shr.646 Entry >M reset: Resetting Manager of TYPE_ATTACHED_DATA
23:24:37.730750188 0x0000ffff4c002400 j9shr.637 Entry >M tearDownHashTable: Entering for Manager of TYPE
_ATTACHED_DATA
23:24:37.730822428 0x0000ffff4c002400 j9shr.638 Exit <M tearDownHashTable: Exit
23:24:37.730822508 0x0000ffff4c002400 j9shr.635 Entry >M initializeHashTable: Entering for Manager of TY
PE_ATTACHED_DATA
23:24:37.730822748 0x0000ffff4c002400 j9shr.562 Entry >RRM localHashTableCreate: Creating new hashtable
with 63014 initial entries
23:24:37.731030467 0x0000ffff4c002400 j9shr.563 Exit <RRM localHashTableCreate: Created hashtable at ad
dress 0xffff941518e0
23:24:37.731030667 0x0000ffff4c002400 j9shr.636 Exit <M initializeHashTable: Exiting with 0
23:24:37.731031267 0x0000ffff4c002400 j9shr.647 Exit <M reset: Done resetting Manager. Returning 0
23:24:37.731031787 0x0000ffff4c002400 j9shr.1021 Entry >CC reset: Entering
23:24:37.731032627 0x0000ffff4c002400 j9shr.1069 Event CC findStart: Setting _scan to 0xffff3280fffc
23:24:37.731034547 0x0000ffff4c002400 j9shr.1206 Entry >CC protectMetadataArea: Entering
23:24:37.731035267 0x0000ffff4c002400 j9shr.1207 Exit <CC protectMetadataArea: Exiting with rc=0
23:24:37.731035667 0x0000ffff4c002400 j9shr.1022 Exit <CC reset: Exiting
23:24:37.757386286 0x0000ffff4c002400 j9jit.12 Event JIT: Fatal Crash in the JIT while compiling java/
util/regex/Pattern$$Lambda/0x0000000000000000.is(I)Z
23:24:37.757768284 0x0000ffff4c002400 omrport.426 Entry >omrsig_protect: Entered, fn=0xffff99462d50, fn_ar
g=0xffff58de8f88, handler=0xffff99462670, handler_arg=0xffff58de8f60, flags=0x1f9
In the SH_CacheMap::updateROMClassResource()
, the shared cache is locked and the metadata area is unprotected (marked as writable) first to prepare for the following write operation. But I see:
23:24:37.731034547 0x0000ffff4c002400 j9shr.1206 Entry >CC protectMetadataArea: Entering
23:24:37.731035267 0x0000ffff4c002400 j9shr.1207 Exit <CC protectMetadataArea: Exiting with rc=0
The metadata area has been re-protected (marked as readonly), which leads to the crash on the following write.
The above 2 trace points are from: https://github.com/eclipse-openj9/openj9/blob/b014f61aacfa6ee1e32c947f8d16768556e8043c/runtime/shared_common/CompositeCache.cpp#L479
The cache should not be unlocked in this case.
https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_aarch64_linux_Nightly_testList_1/105/ jdk_lang_1 java/lang/constant/CondyDescTest.java
https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk21_j9_sanity.openjdk_aarch64_linux_Nightly_testList_1/105/openjdk_test_output.tar.gz core, javacore in jdk_lang_1\work\java\lang\constant\methodTypeDesc\ResolveConstantDesc caused by gpf.