Closed pshipton closed 1 year ago
https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/26/ jdk_lang_1 java/lang/runtime/ReferencedKeyTest.java
23:58:10 java.lang.RuntimeException: missing key
23:58:10 at java.base/java.lang.runtime.ReferencedKeyTest.assertTrue(ReferencedKeyTest.java:52)
23:58:10 at java.base/java.lang.runtime.ReferencedKeyTest.methods(ReferencedKeyTest.java:66)
23:58:10 at java.base/java.lang.runtime.ReferencedKeyTest.mapTest(ReferencedKeyTest.java:62)
23:58:10 at java.base/java.lang.runtime.ReferencedKeyTest.main(ReferencedKeyTest.java:44)
5 x 100x grinder (stops on failure) https://openj9-jenkins.osuosl.org/job/Grinder/2646/ Failed in every child. # iterations to failure: 21, 39, 83, 10, 20
again with -Xint https://openj9-jenkins.osuosl.org/job/Grinder/2650/ Failed in every child. #iterations to failure: 9, 21, 31, 21, 13
again on xmac with -Xint https://openj9-jenkins.osuosl.org/job/Grinder/2652/ - passed other linux platforms with -Xint https://openj9-jenkins.osuosl.org/job/Grinder/2654/ - passed alinux, plinux, xlinux (ignoring infra failure)
@TobiAjila fyi
Try to get javacore, system core on failure.
-Xdump:java+system:events=throw,filter=java/lang/RuntimeException,request=exclusive+prepwalk+preempt
1 machine https://openj9-jenkins.osuosl.org/job/Grinder/2659/ - passed
3 machines https://openj9-jenkins.osuosl.org/job/Grinder/2660/ - reproduced the failure in https://openj9-jenkins.osuosl.org/job/Grinder_iteration_2/298/ iteration 89
Artifacts with javacore, core https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Grinder_iteration_2/298/openjdk_test_output.tar.gz
The RefenceKeyMap has twice as many entries as it should have.
> !j9object 0x000000002C6513C0
!J9Object 0x000000002C6513C0 {
struct J9Class* clazz = !j9class 0x24E800 // java/lang/runtime/ReferencedKeyMap
Object flags = 0x00000000;
I lockword = 0x00000000 (offset = 0) (java/lang/Object) <hidden>
Z isSoft = 0x00000000 (offset = 12) (java/lang/runtime/ReferencedKeyMap)
Ljava/util/Map; map = !fj9object 0x2c6513d8 (offset = 4) (java/lang/runtime/ReferencedKeyMap)
Ljava/lang/ref/ReferenceQueue; stale = !fj9object 0x2c651400 (offset = 8) (java/lang/runtime/ReferencedKeyMap)
}
> !fj9object 0x2c6513d8
!J9Object 0x000000002C6513D8 {
struct J9Class* clazz = !j9class 0xCF000 // java/util/HashMap
Object flags = 0x00000000;
I lockword = 0x00000000 (offset = 0) (java/lang/Object) <hidden>
Ljava/util/Set; keySet = !fj9object 0x0 (offset = 4) (java/util/AbstractMap)
Ljava/util/Collection; values = !fj9object 0x0 (offset = 8) (java/util/AbstractMap)
[Ljava/util/HashMap$Node; table = !fj9object 0x2c650ac0 (offset = 12) (java/util/HashMap)
Ljava/util/Set; entrySet = !fj9object 0x0 (offset = 16) (java/util/HashMap)
I size = 0x00000034 (offset = 20) (java/util/HashMap)
I modCount = 0x00000034 (offset = 24) (java/util/HashMap) //52 but should be 26
I threshold = 0x00000060 (offset = 28) (java/util/HashMap)
F loadFactor = 0x3F400000 (offset = 32) (java/util/HashMap)
}
The API relies on the ReferenceQueue; stale
ref queue to clean up stale entries. The queue is empty but somehow there are still stale entries.
public void removeStaleReferences() {
while (true) {
Object key = stale.poll();
if (key == null) {
break;
}
map.remove(key);
}
}
In a successful run 52 items must be in the refQueue (stale)
The test relies on
System.gc();
sleep();
To force a GC and add objects to refqueues, which is not guaranteed on J9.
@dmitripivkine are objects added to associated reference queues on every GC?
yes, but Local GC might address references from Nursery only.
Okay so we would need
System.gc();
System.gc();
to trigger a global GC?
yes, I think so.
As this is apparently a test problem (limitation of OpenJ9?) rather than a functional problem, removing the blocker label.
This seems to be a similar problem to https://github.com/eclipse-openj9/openj9/issues/17723
I noticed
> !j9javavm 0x000003FF9801EB20 | grep processReference
0x16e8: struct J9ThreadMonitor* processReferenceMonitor = !j9threadmonitor 0x000003FF98009CE8
0x16f0: UDATA processReferenceActive = 0x0000000000000000 (0)
Probably explains why adding more GCs doesnt solve it
@dmitripivkine Actually the vm->processReferenceActive
represents the number of references active being processed. It doesnt represent the references pending processing.
I think we need finalizeListManager->getReferenceCount()
to understand this situation better
> !gc_finalizelistmanager 0x000003FF9807E410
GC_FinalizeListManager at 0x3ff9807e410 {
Fields for MM_Base:
Fields for MM_BaseVirtual:
0x8: const U8* _typeId = !j9x 0x000003FF9DA422CC // "GC_FinalizeListManager"
Fields for GC_FinalizeListManager:
0x10: class MM_GCExtensions* _extensions = !mm_gcextensions 0x000003FF98036B80
0x18: struct J9ThreadMonitor* _mutex = !j9threadmonitor 0x000003FF98055AC8
0x20: struct J9Object* _systemFinalizableObjects = !j9object 0x0000000000000000
0x28: UDATA _systemFinalizableObjectCount = 0x0000000000000000 (0)
0x30: struct J9Object* _defaultFinalizableObjects = !j9object 0x0000000000000000
0x38: UDATA _defaultFinalizableObjectCount = 0x0000000000000000 (0)
0x40: struct J9Object* _referenceObjects = !j9object 0x000000002C651340 // jdk/internal/ref/CleanerImpl$PhantomCleanableRef
0x48: UDATA _referenceObjectCount = 0x00000000000000A6 (166) //<--this is the real value
0x50: struct J9ClassLoader* _classLoaders = !j9classloader 0x0000000000000000
0x58: UDATA _classLoaderCount = 0x0000000000000000 (0)
}
After taking a brief look at static int J9THREAD_PROC FinalizeMainThread(void *javaVM)
it seems like we only consume one job at a time. I think we need a way to force the finalizer thread to empty the queue (or a certain amount).
After taking a brief look at
static int J9THREAD_PROC FinalizeMainThread(void *javaVM)
it seems like we only consume one job at a time. I think we need a way to force the finalizer thread to empty the queue (or a certain amount).
There is runFinalization()
implementation expected to run Finalization non-stop until queue is empty: https://github.com/eclipse-openj9/openj9/blob/6256ba4ad8e9ea5087fe1c18093b6f0fa3a16ad6/runtime/gc_base/FinalizerSupport.cpp#L899
Is it kind of service you are looking?
Is it kind of service you are looking?
Yes, it fixed the issue I described above. Ill see if it can be used here, https://github.com/eclipse-openj9/openj9/issues/17723.
If we dont see this issue again in the next couple of days we can close it now that the System.gc() changes are in.
The System.gc() changes are https://github.com/eclipse-openj9/openj9/pull/18044
Not seen lately, closing for now.
https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Nightly_testList_0/24/ jdk_lang_0, jdk_lang_j9_0 java/lang/runtime/ReferencedKeyTest.java