Closed JasonFengJ9 closed 1 year ago
50x internal grinder - 3/100 failed
test/jdk/java/lang/ClassLoader/loadLibraryUnload/LoadLibraryUnload.java
[1]
// release strong refs
clazz = null;
threads = null;
exceptions.clear();
ForceGC gc = new ForceGC();
This test expects GC unloads the class and associated native library after clear strong refs. I think OpenJ9 GC has different behaviours unless @dmitripivkine says otherwise.
Trying to reproduce with GC verbose log (got different issues but not this one) Meanwhile two comments:
-Xalwaysclassgc
. There is no guarantee Global GC is going to perform class unloading. An alternative solution might be run two SystemGCs back to back. Every second SystemGC is treated as "aggressive" and performed class unloading 2/300 failed w/ two System.gc()
back to back
09:28:32 Native library loaded.
09:28:32 Native library loaded from Class1.
09:28:32 Native library loaded from Class1.
09:28:32 ForceGC condition met after System.gc() 2 times
09:28:32 Native library loaded in 1 threads
09:28:32 STDERR:
09:28:32 JVMJNCK001I JNI check utility installed. Use -Xcheck:jni:help for usage
09:28:32 java.lang.RuntimeException: Failed to unload native library: expected true, was false
09:28:32 at jdk.test.lib.Asserts.fail(Asserts.java:594)
09:28:32 at jdk.test.lib.Asserts.assertTrue(Asserts.java:486)
09:28:32 at LoadLibraryUnloadTest.main(LoadLibraryUnloadTest.java:102)
09:28:32 at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
09:28:32 at java.base/java.lang.reflect.Method.invoke(Method.java:578)
09:28:32 at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
09:28:32 at java.base/java.lang.Thread.run(Thread.java:1573)
09:28:32
09:28:32 JavaTest Message: Test threw exception: java.lang.RuntimeException: Failed to unload native library: expected true, was false
FYI @dmitripivkine
@JasonFengJ9 Is it possible to generate system core on this exception? It might help to figure out what is going on
@dmitripivkine the diagnostic files are in result
I don't see any evidence System GC has been called, even once. There is no reference in the snap traces or Java core
There is last (local} GC:
3STHSTTYPE 14:56:50:067560021 GMT j9mm.64 - LocalGC start: globalcount=0 scavengecount=6 weakrefs=0 soft=0 phantom=0 finalizers=0
As you see zero Global GCs, zero unprocessed References
I think the extra command line option didn't make it into the actual running process caused RuntimeException
, have to add it into the test code.
@dmitripivkine hope this result is better.
@JasonFengJ9 Do you know which set of files I should look at? There are many cores/javacores in this result file. I have looked to couple of them, both have class unloading performed
Do you know which set of files I should look at?
No, not sure which core file(s) containing p.Class1
unloading logs.
From core generated on test failure ./aqa-tests/TKG/output_16699217315972/jdk_custom_0/work/classes/java/lang/ClassLoader/loadLibraryUnload/LoadLibraryUnloadTest.d/core.20221201.190902.1210802.0001.dmp
Classloader LoadLibraryUnload$TestLoader
is not unloaded:
+-------------------------------------------------------------------------
| ClassLoader | SharedLibraries | ClassHashTable | jniIDs Pool |
| | Pool | | |
| |-----------------+-----------------+-----------------|
| Address | used | capacity | used | capacity | used | capacity |
+-------------------+-----------------+-----------------+-----------------
0x000003FF7C0A4A68 4 35 880 960 248 256 // jdk/internal/loader/ClassLoaders$BootClassLoader
0x000003FF7C0A4B70 0 0 2 17 0 0 // jdk/internal/loader/ClassLoaders$PlatformClassLoader
0x000003FF7C0A4C78 0 0 0 17 0 0 // java/lang/InternalAnonymousClassLoader
0x000003FF7C0A4D80 0 0 68 73 12 16 // jdk/internal/loader/ClassLoaders$AppClassLoader
0x000003FF7C0A5098 0 0 9 17 2 16 // LoadLibraryUnload$TestLoader <----
This !j9classloader 0x3FF7C0A5098
can not be unloaded because of alive !j9object 0x8B5225D0
instance of !j9class 0x18CCA00 // p/Class1
:
> !j9object 0x000000008B5225D0
!J9Object 0x000000008B5225D0 {
struct J9Class* clazz = !j9class 0x18CCA00 // p/Class1
Object flags = 0x00000020;
I lockword = 0x00000000 (offset = 0) (java/lang/Object) <hidden>
}
The root keeps object alive is Ljava/lang/Thread
object:
java/lang/Thread@0x000000008B521C18 (from "holder")
java/lang/Thread$FieldHolder@0x000000008B5221C0 (from "task")
LoadLibraryUnload$LoadLibraryFromClass@0x000000008B5224A0 (from "object")
p/Class1@0x000000008B5225D0
> !j9object 0x000000008B521C18
!J9Object 0x000000008B521C18 {
struct J9Class* clazz = !j9class 0x179F500 // java/lang/Thread
Object flags = 0x0000002A;
I lockword = 0x00000000 (offset = 0) (java/lang/Object) <hidden>
J eetop = 0x0000000000000000 (offset = 4) (java/lang/Thread)
J tid = 0x000000000000001A (offset = 12) (java/lang/Thread)
Ljava/lang/String; name = !fj9object 0x8b5221a8 (offset = 36) (java/lang/Thread)
Z interrupted = 0x00000000 (offset = 92) (java/lang/Thread)
Ljava/lang/ClassLoader; contextClassLoader = !fj9object 0xdfb0058 (offset = 40) (java/lang/Thread)
Ljava/security/AccessControlContext; inheritedAccessControlContext = !fj9object 0x0 (offset = 44) (java/lang/Thread)
Ljava/lang/Thread$FieldHolder; holder = !fj9object 0x8b5221c0 (offset = 48) (java/lang/Thread) <-----
Ljava/lang/ThreadLocal$ThreadLocalMap; threadLocals = !fj9object 0x0 (offset = 52) (java/lang/Thread)
Ljava/lang/ThreadLocal$ThreadLocalMap; inheritableThreadLocals = !fj9object 0x0 (offset = 56) (java/lang/Thread)
Ljava/lang/Object; extentLocalBindings = !fj9object 0x0 (offset = 60) (java/lang/Thread)
Ljava/lang/Object; interruptLock = !fj9object 0x8b5221e8 (offset = 64) (java/lang/Thread)
Ljava/lang/Object; parkBlocker = !fj9object 0x0 (offset = 68) (java/lang/Thread)
Lsun/nio/ch/Interruptible; nioBlocker = !fj9object 0x0 (offset = 72) (java/lang/Thread)
Ljdk/internal/vm/Continuation; cont = !fj9object 0x0 (offset = 76) (java/lang/Thread)
Ljava/lang/Thread$UncaughtExceptionHandler; uncaughtExceptionHandler = !fj9object 0x0 (offset = 80) (java/lang/Thread)
J threadLocalRandomSeed = 0x0000000000000000 (offset = 20) (java/lang/Thread)
I threadLocalRandomProbe = 0x00000000 (offset = 96) (java/lang/Thread)
I threadLocalRandomSecondarySeed = 0x00000000 (offset = 100) (java/lang/Thread)
Ljdk/internal/vm/ThreadContainer; container = !fj9object 0x0 (offset = 84) (java/lang/Thread)
Ljdk/internal/vm/StackableScope; headStackableScopes = !fj9object 0x0 (offset = 88) (java/lang/Thread)
Z started = 0x00000001 (offset = 104) (java/lang/Thread)
Z stopCalled = 0x00000000 (offset = 108) (java/lang/Thread)
J tls = 0x0000000000000000 (offset = 28) (java/lang/Thread) <hidden>
}
> !fj9object 0x8b5221c0
!J9Object 0x000000008B5221C0 {
struct J9Class* clazz = !j9class 0x17B7200 // java/lang/Thread$FieldHolder
Object flags = 0x00000020;
I lockword = 0x00000000 (offset = 0) (java/lang/Object) <hidden>
Ljava/lang/ThreadGroup; group = !fj9object 0xdf9d6f8 (offset = 12) (java/lang/Thread$FieldHolder)
Ljava/lang/Runnable; task = !fj9object 0x8b5224a0 (offset = 16) (java/lang/Thread$FieldHolder) <-----
J stackSize = 0x0000000000000000 (offset = 4) (java/lang/Thread$FieldHolder)
I priority = 0x00000005 (offset = 20) (java/lang/Thread$FieldHolder)
Z daemon = 0x00000000 (offset = 24) (java/lang/Thread$FieldHolder)
I threadStatus = 0x00000000 (offset = 28) (java/lang/Thread$FieldHolder)
Z stillborn = 0x00000000 (offset = 32) (java/lang/Thread$FieldHolder)
}
> !fj9object 0x8b5224a0
!J9Object 0x000000008B5224A0 {
struct J9Class* clazz = !j9class 0x18DEE00 // LoadLibraryUnload$LoadLibraryFromClass
Object flags = 0x00000020;
I lockword = 0x00000000 (offset = 0) (java/lang/Object) <hidden>
Ljava/lang/Object; object = !fj9object 0x8b5225d0 (offset = 4) (LoadLibraryUnload$LoadLibraryFromClass) <----
Ljava/lang/reflect/Method; method = !fj9object 0x8b5225e0 (offset = 8) (LoadLibraryUnload$LoadLibraryFromClass)
}
@JasonFengJ9 Would you please investigate why thread object keeps object in question alive?
The thread appears alive at https://github.com/eclipse-openj9/openj9/issues/16337#issuecomment-1339990330 0x000000008B521C18
has Ljava/lang/String; name = !fj9object 0x8b5221a8 (offset = 36) (java/lang/Thread)
!fj9object 0x8b5221a8
J9VMJavaLangString at 0x000000008B5221A8 {
struct J9Class* clazz = !j9class 0x1774E00 // java/lang/String
Object flags = 0x00000020;
[B value = !fj9object 0x8b522480 (offset = 0) (java/lang/String)
B coder = 0x00000001 (offset = 4) (java/lang/String)
I hash = 0x00000000 (offset = 8) (java/lang/String)
Z hashIsZero = 0x00000000 (offset = 12) (java/lang/String)
"Thread-7". <---- thread name
Current threads:
!stack 0x0174fd00 !j9vmthread 0x0174fd00 !j9thread 0x3ff7c01ead0 tid 0x1279b5 (1210805) // (JIT Compilation Thread-000)
!stack 0x01752300 !j9vmthread 0x01752300 !j9thread 0x3ff7c0e3140 tid 0x1279b6 (1210806) // (JIT Compilation Thread-001 Suspended)
!stack 0x01754900 !j9vmthread 0x01754900 !j9thread 0x3ff7c0e3638 tid 0x1279b7 (1210807) // (JIT Compilation Thread-002 Suspended)
!stack 0x01756f00 !j9vmthread 0x01756f00 !j9thread 0x3ff7c0e3b30 tid 0x1279b8 (1210808) // (JIT Compilation Thread-003 Suspended)
!stack 0x01759500 !j9vmthread 0x01759500 !j9thread 0x3ff7c0e7520 tid 0x1279b9 (1210809) // (JIT Compilation Thread-004 Suspended)
!stack 0x0175bb00 !j9vmthread 0x0175bb00 !j9thread 0x3ff7c0e7a18 tid 0x1279ba (1210810) // (JIT Compilation Thread-005 Suspended)
!stack 0x0175e100 !j9vmthread 0x0175e100 !j9thread 0x3ff7c0e7f10 tid 0x1279bb (1210811) // (JIT Compilation Thread-006 Suspended)
!stack 0x01760600 !j9vmthread 0x01760600 !j9thread 0x3ff7c0e8cf0 tid 0x1279bc (1210812) // (JIT Diagnostic Compilation Thread-007 Suspended)
!stack 0x01762c00 !j9vmthread 0x01762c00 !j9thread 0x3ff7c0e91e8 tid 0x1279be (1210814) // (JIT-SamplerThread)
!stack 0x01765200 !j9vmthread 0x01765200 !j9thread 0x3ff7c0e96e0 tid 0x1279bf (1210815) // (IProfiler)
!stack 0x0186e200 !j9vmthread 0x0186e200 !j9thread 0x3ff7c17b840 tid 0x1279c0 (1210816) // (Common-Cleaner)
!stack 0x01888800 !j9vmthread 0x01888800 !j9thread 0x3ff7c17c230 tid 0x1279c2 (1210818) // (Concurrent Mark Helper)
!stack 0x0188ae00 !j9vmthread 0x0188ae00 !j9thread 0x3ff7c2296e0 tid 0x1279c3 (1210819) // (GC Worker)
!stack 0x0188d400 !j9vmthread 0x0188d400 !j9thread 0x3ff7c229bd8 tid 0x1279c4 (1210820) // (GC Worker)
!stack 0x0188fa00 !j9vmthread 0x0188fa00 !j9thread 0x3ff7c22a0d0 tid 0x1279c5 (1210821) // (GC Worker)
!stack 0x018ccf00 !j9vmthread 0x018ccf00 !j9thread 0x3ff7c164c30 tid 0x1279ca (1210826) // (Finalizer thread)
!stack 0x01961400 !j9vmthread 0x01961400 !j9thread 0x3ff7c2901f8 tid 0x1279d5 (1210837) // (Cleaner-0)
!stack 0x01951600 !j9vmthread 0x01951600 !j9thread 0x3ff7c01e0e0 tid 0x1279b3 (1210803) // (DestroyJavaVM helper thread)
I think Thread-7
was one of threads created [1] and added to threads
[2] which was set to null
[3]. There is no strong root for those threads afterwards, and the test expects they and the resources associated are GCed, i.e, unloading the native libraries.
@dmitripivkine this seems a GC behaviour test, do we expect those thread objects[1] are GCed?
[1] https://github.com/ibmruntimes/openj9-openjdk-jdk19/blob/84975bef19dceb77173965f3778dea1d98dbe731/test/jdk/java/lang/ClassLoader/loadLibraryUnload/LoadLibraryUnload.java#L114-L115 [2] https://github.com/ibmruntimes/openj9-openjdk-jdk19/blob/84975bef19dceb77173965f3778dea1d98dbe731/test/jdk/java/lang/ClassLoader/loadLibraryUnload/LoadLibraryUnload.java#L108 [3] https://github.com/ibmruntimes/openj9-openjdk-jdk19/blob/84975bef19dceb77173965f3778dea1d98dbe731/test/jdk/java/lang/ClassLoader/loadLibraryUnload/LoadLibraryUnload.java#L153
Yes, if Java thread is not attached to JVM any more and there is no references from other objects which might keep thread object alive thread object should be collected (and should not prevent this class loader unloading eventually)
If I assume problematic thread name is Thread-7
for any run and looking for activity for this thread in Snap traces:
Thread-7
I was looking to this problem for a few days, there is my conclusion:
Thread-7
in this case) thread object FieldHolder
keeps class instance alive. There is output for other threads. However reviewing number of successful cases Snap Traces output does not look consistent. I can not tell was this thread not active or it's trace information lost somehow.vmstop
event. System core has been generated for each iteration of the test.vmstop
seems is too late. I think core generation at Global GC might provide better view to the problem. However it means even larger number of the cores will be generated. GC verbose log can be useful too. If we have tracepoints to clarify timing for threads activity an explicit activation of them might be useful.@tajila @JasonFengJ9 FYI
@babsingh please investigate this
600x grinder (LoadLibraryUnloadTest): https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/30392.
The failure is not reproducible. An earlier PR probably resolved this failure. Thus, closing this issue. The issue can be reopened if the failure reoccurs.
Appeared at yesterday JDK19 s390 internal build(rhel7s390x-3-10
)
openjdk version "19.0.1-beta" 2022-10-18
IBM Semeru Runtime Open Edition 19.0.1+10-202301180213 (build 19.0.1-beta+10-202301180213)
Eclipse OpenJ9 VM 19.0.1+10-202301180213 (build master-04ee39006, JRE 19 Linux s390x-64-Bit Compressed References 20230117_24 (JIT enabled, AOT enabled)
OpenJ9 - 04ee39006
OMR - 333650df1
JCL - 40656c4641 based on jdk-19.0.1+10)
[2023-01-18T07:58:07.340Z] variation: -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage Mode650
[2023-01-18T07:58:07.340Z] JVM_OPTIONS: -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage -XX:-UseCompressedOops
[2023-01-18T07:59:56.068Z] TEST: java/lang/ClassLoader/loadLibraryUnload/LoadLibraryUnloadTest.java
[2023-01-18T07:59:56.069Z] STDERR:
[2023-01-18T07:59:56.069Z] JVMJNCK001I JNI check utility installed. Use -Xcheck:jni:help for usage
[2023-01-18T07:59:56.069Z] java.lang.RuntimeException: Failed to unload native library: expected true, was false
[2023-01-18T07:59:56.069Z] at jdk.test.lib.Asserts.fail(Asserts.java:594)
[2023-01-18T07:59:56.069Z] at jdk.test.lib.Asserts.assertTrue(Asserts.java:486)
[2023-01-18T07:59:56.069Z] at LoadLibraryUnloadTest.main(LoadLibraryUnloadTest.java:102)
[2023-01-18T07:59:56.069Z] at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
[2023-01-18T07:59:56.069Z] at java.base/java.lang.reflect.Method.invoke(Method.java:578)
[2023-01-18T07:59:56.069Z] at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
[2023-01-18T07:59:56.069Z] at java.base/java.lang.Thread.run(Thread.java:1573)
[2023-01-18T08:11:04.334Z] jdk_lang_1_FAILED
FYI @babsingh
The personal branch used for dump generation - https://github.com/JasonFengJ9/openj9-openjdk-jdk19/tree/systemgc
Launched 300x grinder w/ additional test code [1]
...
WeakReference<List<Thread>> wThreads = new WeakReference<>(threads);
// release strong refs
clazz = null;
threads = null;
exceptions.clear();
ForceGC gc = new ForceGC();
...
if (!gc.await(() -> wThreads.refersTo(null))) {
throw new RuntimeException("threads hasn't been GC'ed");
}
JDK19 LoadLibraryUnloadTest
commit history: https://github.com/ibmruntimes/openj9-openjdk-jdk19/commits/openj9/test/jdk/java/lang/ClassLoader/loadLibraryUnload/LoadLibraryUnloadTest.java
JDK-next LoadLibraryUnloadTest
commit history: https://github.com/ibmruntimes/openj9-openjdk-jdk/commits/openj9/test/jdk/java/lang/ClassLoader/loadLibraryUnload/LoadLibraryUnloadTest.java
In JDK-next, there are two more commits for LoadLibraryUnloadTest
, which do not exist in JDK19:
Both commits address intermittent failures associated with the testcase. The failure seen in this Github issue is addressed by Commit 1. Since this fix is missing from JDK19, we should disable the testcase for JDK19.
https://github.com/adoptium/aqa-tests/pull/4264 has been created to disable the test for OpenJ9 JDK19.
It seems we can just backport the fix(es) and run the test rather than excluding it.
It seems we can just backport the fix(es) and run the test rather than excluding it.
Sure, will backport the fix instead of disabling the test.
Failure link
From an internal build(
rhel7s390x-4-10
)Rerun in Grinder - Change TARGET to run only the failed test targets.
Optional info
Failure output (captured from console output)
50x internal grinder
It was reported at https://github.com/eclipse-openj9/openj9/issues/14441#issuecomment-1306097963 & https://github.com/eclipse-openj9/openj9/issues/14441#issuecomment-1180878022 however https://github.com/eclipse-openj9/openj9/issues/14441 was
java/lang/ClassLoader/nativeLibrary/NativeLibraryTest
for JDK17 AIX. Will triage.