Open hangshao0 opened 5 days ago
Issue Number: 20444 Status: Open Recommended Components: comp:vm, comp:jclextensions, comp:gc Recommended Assignees: jasonfengj9, keithc-ca, hangshao0
From the tracepoint:
11:56:46.063008409 0x0000000000d23600 j9vm.315 Entry >0Xd23600 loader 0x1ea71d24b40 class java/util/concurrent/locks/ReentrantReadWriteLock$WriteLock arbitratedLoadClass enter className
11:56:46.063030006 0x0000000000d23600 j9vm.119 Event send loadClass(java/util/concurrent/locks/ReentrantReadWriteLock$WriteLock), stringObject: 0xff802740 loader: 0x816c7ec8
11:56:46.063063703 *0x0000000000c64900 j9vm.315 Entry >0Xc64900 loader 0x1ea71d24b40 class com/ibm/ws/classloading/internal/providers/Providers arbitratedLoadClass enter className
11:56:46.063073692 *0x0000000000d23600 j9vm.120 Event sent loadClass(java/util/concurrent/locks/ReentrantReadWriteLock$WriteLock) --> got 0x80054208
11:56:46.063075972 *0x0000000000c64900 j9vm.119 Event send loadClass(com/ibm/ws/classloading/internal/providers/Providers), stringObject: 0xff81aa58 loader: 0x816c7ec8
11:56:46.063091050 0x0000000000c64900 j9vm.120 Event sent loadClass(com/ibm/ws/classloading/internal/providers/Providers) --> got 0x80054208
11:56:46.063093870 *0x0000000000d23600 j9vm.319 Exit <0Xd23600 loader 0x1ea71d24b40 class java/util/concurrent/locks/ReentrantReadWriteLock$WriteLock arbitratedLoadClass exit foundClass 0x1aa200
11:56:46.063122777 *0x0000000000c64900 j9vm.319 Exit <0Xc64900 loader 0x1ea71d24b40 class com/ibm/ws/classloading/internal/providers/Providers arbitratedLoadClass exit foundClass 0x0 <----- foundClass is 0x0 which causes the NoClassDefFoundError
11:56:46.063142825 0x0000000000c64900 j9vm.294 Entry >setCurrentException index=21 constructorIndex=0 detailMessage=0xff81ab78
There are 2 threads loading classes:
Thread 0xd23600
is loading ReentrantReadWriteLock$WriteLock
Thread 0xc64900
is loading Providers
However, both threads got 0x80054208
, which is ReentrantReadWriteLock$WriteLock
. This is correct for thread 0xd23600
, but incorrect for Thread 0xc64900
.
11:56:46.063073692 *0x0000000000d23600 j9vm.120 Event sent loadClass(java/util/concurrent/locks/ReentrantReadWriteLock$WriteLock) --> got 0x80054208
11:56:46.063091050 0x0000000000c64900 j9vm.120 Event sent loadClass(com/ibm/ws/classloading/internal/providers/Providers) --> got 0x80054208
The tracepoint is: https://github.com/eclipse-openj9/openj9/blob/a08b5362efa0bbf88a0298813240ff4723f405bf/runtime/vm/classsupport.c#L778
and sendLoadClassResult
is from:
https://github.com/eclipse-openj9/openj9/blob/a08b5362efa0bbf88a0298813240ff4723f405bf/runtime/vm/classsupport.c#L742
Could you look at why thread 0xc64900
get the same returnValue
as thread 0xd23600
? @thallium
We have also observed this strange NoSuchMethodError
where it seems to say it is looking for the getPrivateLibraries
method on ReentrantReadWriteLock$WriteLock
when that method is on the Providers
class. That sounds like it could be another flavor of this issue where the thread got the wrong class?
java.lang.NoSuchMethodError: java/util/concurrent/locks/ReentrantReadWriteLock$WriteLock.getPrivateLibraries(Lcom/ibm/wsspi/classloading/ClassLoaderConfiguration;)Ljava/util/List; (loaded from jrt:/java.base by <Bootstrap Loader>) called from class com.ibm.ws.classloading.internal.AppClassLoader (loaded from file:/C:/Users/jazz_build/Build/jazz-build-engines/wasrtc/EBCPROD/build/dev/image/output/wlp/lib/com.ibm.ws.classloading_1.1.95.jar by org.eclipse.osgi.internal.loader.EquinoxClassLoader@b19957cc[com.ibm.ws.classloading:1.1.95.202410071314(id=48)]).
at com.ibm.ws.classloading.internal.AppClassLoader.<init>(AppClassLoader.java:172)
at com.ibm.ws.classloading.internal.ClassLoaderFactory.createClassLoader(ClassLoaderFactory.java:138)
at com.ibm.ws.classloading.internal.ClassLoaderFactory.create(ClassLoaderFactory.java:116)
at com.ibm.ws.classloading.internal.ClassLoadingServiceImpl.createAppClassLoader(ClassLoadingServiceImpl.java:349)
at com.ibm.ws.classloading.internal.ClassLoadingServiceImpl.createTopLevelClassLoader(ClassLoadingServiceImpl.java:321)
at com.ibm.ws.classloading.internal.ClassLoadingServiceImpl.createTopLevelClassLoader(ClassLoadingServiceImpl.java:103)
at com.ibm.ws.classloading.ClassLoaderConfigHelper.createTopLevelClassLoader(ClassLoaderConfigHelper.java:314)
at com.ibm.ws.app.manager.module.internal.DeployedAppInfoBase.createTopLevelClassLoader(DeployedAppInfoBase.java:635)
at com.ibm.ws.app.manager.war.internal.WARDeployedAppInfo.createModuleClassLoader(WARDeployedAppInfo.java:105)
at com.ibm.ws.app.manager.module.internal.ExtendedModuleInfoImpl.getClassLoader(ExtendedModuleInfoImpl.java:124)
at com.ibm.ws.container.service.annocache.internal.ModuleAnnotationsImpl.<init>(ModuleAnnotationsImpl.java:198)
at com.ibm.ws.container.service.annocache.internal.ModuleAnnotationsImpl.<init>(ModuleAnnotationsImpl.java:161)
at com.ibm.ws.container.service.annocache.internal.WebAnnotationsImpl.<init>(WebAnnotationsImpl.java:91)
at com.ibm.ws.container.service.annocache.internal.WebAnnotationsAdapterImpl.adapt(WebAnnotationsAdapterImpl.java:63)
at com.ibm.ws.container.service.annocache.internal.WebAnnotationsAdapterImpl.adapt(WebAnnotationsAdapterImpl.java:35)
at com.ibm.ws.adaptable.module.internal.AdapterFactoryServiceImpl.adapt(AdapterFactoryServiceImpl.java:202)
at com.ibm.ws.adaptable.module.internal.AdaptableContainerImpl.adapt(AdaptableContainerImpl.java:176)
at com.ibm.ws.container.service.annocache.AnnotationsBetaHelper.getWebAnnotations(AnnotationsBetaHelper.java:270)
at com.ibm.ws.webcontainer.osgi.container.config.WebAppConfigurator.setWebAnnotations(WebAppConfigurator.java:88)
at com.ibm.ws.webcontainer.osgi.container.config.WebAppConfigurator.configure(WebAppConfigurator.java:424)
at com.ibm.ws.webcontainer.osgi.container.config.WebAppConfigurationAdapter.adapt(WebAppConfigurationAdapter.java:121)
at com.ibm.ws.webcontainer.osgi.container.config.WebAppConfigurationAdapter.adapt(WebAppConfigurationAdapter.java:45)
at com.ibm.ws.adaptable.module.internal.AdapterFactoryServiceImpl.adapt(AdapterFactoryServiceImpl.java:202)
at com.ibm.ws.adaptable.module.internal.AdaptableContainerImpl.adapt(AdaptableContainerImpl.java:176)
at com.ibm.ws.webcontainer.osgi.WebContainer.createModuleMetaData(WebContainer.java:872)
at com.ibm.ws.app.manager.module.internal.ModuleHandlerBase.createModuleMetaData(ModuleHandlerBase.java:65)
at com.ibm.ws.app.manager.module.internal.SimpleDeployedAppInfoBase$ModuleContainerInfoBase.createModuleMetaData(SimpleDeployedAppInfoBase.java:222)
at com.ibm.ws.app.manager.module.internal.SimpleDeployedAppInfoBase.preDeployApp(SimpleDeployedAppInfoBase.java:608)
at com.ibm.ws.app.manager.module.internal.SimpleDeployedAppInfoBase.installApp(SimpleDeployedAppInfoBase.java:584)
at com.ibm.ws.app.manager.module.internal.DeployedAppInfoBase.deployApp(DeployedAppInfoBase.java:625)
at com.ibm.ws.app.manager.war.internal.WARApplicationHandlerImpl.install(WARApplicationHandlerImpl.java:67)
at com.ibm.ws.app.manager.internal.statemachine.StartAction.execute(StartAction.java:199)
at com.ibm.ws.app.manager.internal.statemachine.ApplicationStateMachineImpl.enterState(ApplicationStateMachineImpl.java:1369)
at com.ibm.ws.app.manager.internal.statemachine.ApplicationStateMachineImpl.run(ApplicationStateMachineImpl.java:912)
at com.ibm.ws.threading.internal.ExecutorServiceImpl$RunnableWrapper.run(ExecutorServiceImpl.java:298)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:857)
@TobiAjila @gacholio Does this ring any bell to you ?
2 threads loading 2 different classes simultaneously. Thread1 loaded the correct class, but thread2 got the class loaded by thread1.
For Liberty we have three levels of class loaders
All three levels are storing classes into the Open J9 shared class cache. So far I only recall seeing these strange errors for the classes loaded by the Equinox bundle class loaders. This makes me suspect there is something specific to how Equinox uses the SharedClassURLHelper
that is causing something to go wrong. There we store the classes with this call:
The call to com.ibm.oti.shared.SharedClassURLHelper.storeSharedClass(String, URL, Class)
is done outside of any locks that would be held by the class loader for the class name being stored. But there proper locks while defining the class such that two threads do not try to call ClassLoader.defineClass
multiple times for the same class. And only the thread that "won" in defining the class should be the thread that attempts to store the class after it has been defined.
Similarly the method com.ibm.oti.shared.SharedClassURLHelper.findSharedClass(String, URL, String)
is used to find a shared class with this call:
Again, this call has no locks held by the class loader while calling the SharedClassURLHelper.findSharedClass
. Is it possible there is a threading issue here with either looking up or storing shared classes?
re comment
There is only one copy of com/ibm/ws/classloading/internal/providers/Providers
in the shared cache. It was added a previous JVM, not by the current JVM, which means this JVM did not call storeSharedClass()
on this class. I have dumped the com/ibm/ws/classloading/internal/providers/Providers
romClass and compared it against its java code, it looks correct to me.
Also there is only one copy of java/util/concurrent/locks/ReentrantReadWriteLock$WriteLock
. It is loaded by bootstrap class loader and it looks correct to me.
I believe this JVM called findSharedClass()
on com/ibm/ws/classloading/internal/providers/Providers
, however it is a hashtable look up where the class name is the key. I don't see how findSharedClass(String className1)
could return a class whose name is not className1
.
I believe this JVM called findSharedClass() on com/ibm/ws/classloading/internal/providers/Providers
Actually findSharedClass()
is not called on com/ibm/ws/classloading/internal/providers/Providers
by this JVM. There should be JCL: SharedClassURLHelperImpl findSharedClassImpl
entry and exit trace points in https://github.com/eclipse-openj9/openj9/issues/20444#issuecomment-2448263444 if it is called (like below), but I don't see them for com/ibm/ws/classloading/internal/providers/Providers
. So neither storeSharedClass()
nor findSharedClass()
is involved here.
11:56:46.062322947 0x0000000000c64900 j9vm.119 Event send loadClass(com/ibm/ws/classloading/internal/ContainerClassLoader$SmartClassPathImpl$1), stringObject: 0xff818ec8 loader: 0x816c7ec8
11:56:46.062344575 0x0000000000c64900 j9jcl.108 Entry >JCL: SharedClassURLHelperImpl findSharedClassImpl: Entering for helperID 49
11:56:46.062358974 0x0000000000c64900 j9jcl.539 Entry >JCL: com.ibm.oti.shared getCachedURL: Entering for helperID=49
11:56:46.062368133 0x0000000000c64900 j9jcl.154 Exit <JCL: com.ibm.oti.shared getCachedURL: Exiting with found cached URL
11:56:46.062392360 0x0000000000c64900 j9shr.1293 Event RMI locateROMClass: Locate succeeded. Returning 0x1ea6ff2ca04. FoundAtIndex=0. StaleCPEI=0x0. Locate request for ROMClass com/ibm/ws/classloading/internal/ContainerClassLoader$SmartClassPathImpl$1 from helper ID 49 with cpeIndex -1.
11:56:46.062402089 0x0000000000c64900 j9shr.1297 Event CM findROMClass: class com/ibm/ws/classloading/internal/ContainerClassLoader$SmartClassPathImpl$1 found at address 0x1ea6d70b0f8 with classpath index 0. The classpath id is 49.
11:56:46.062418908 0x0000000000c64900 j9jcl.110 Exit <JCL: SharedClassURLHelperImpl findSharedClassImpl: Exiting with TRUE
11:56:46.062545145 0x0000000000c64900 j9vm.1 Entry >Create RAM class from ROM class 0x1ea6d70b0f8 in class loader 0x1ea71d24b40
11:56:46.062553204 0x0000000000c64900 j9vm.80 Event ROM class 0x1ea6d70b0f8 is named com/ibm/ws/classloading/internal/ContainerClassLoader$SmartClassPathImpl$1
11:56:46.062564743 0x0000000000c64900 j9vm.88 Event Loaded implemented interface java/lang/Runnable (RAM class=0x65100)
11:56:46.062659994 0x0000000000c64900 j9vm.2 Exit <Created RAM class 0xd11100 from ROM class 0x1ea6d70b0f8
11:56:46.062692960 0x0000000000c64900 j9vm.120 Event sent loadClass(com/ibm/ws/classloading/internal/ContainerClassLoader$SmartClassPathImpl$1) --> got 0x8149e558
The normal sequence should be the class loader calls findSharedClass()
to see if the class is in shared cache, if not, then load the class from disk. However, in this case the wrong class is returned even before findSharedClass()
is called. I don't see the JVM creating the wrong class indicated by trace j9vm.1, j9vm2
for com/ibm/ws/classloading/internal/providers/Providers
either. I guess the wrong class could be from the cached data structure mapping between the class name and the actual class/class bytes in the EquinoxClassLoader.
I guess the wrong class could be from the cached data structure mapping between the class name and the actual class/class bytes in the EquinoxClassLoader.
Because Equinox class loaders are a graph that may contain cycles it does not hold locks for an initial call to java.lang.ClassLoader.findLoadedClass(String)
. Here is the sequence the Equinox loader uses:
ClassLoader.findLoadedClass(String)
if that returns non-null then the class is returned; otherwise continue.SharedClassURLHelper.findSharedClass
if that returns non-null then continue to step 3; otherwise the class has not been stored yet, so we do normal byte code search.ClassLoader.findLoadedClass(String)
again to make sure another thread didn't define the class; if that returns null then call ClassLoader.defineClass
with the shared class byte[].Could there be some inconsistency with allowing another thread to get a non-null result from the call to ClassLoader.findLoadedClass(String)
in step 1 while we are trying to define the shared class in step 4? For performance reasons, I would like to avoid having to get the class-loader/class-name pair lock to call that first findLoadedClass
.
I guess the wrong class could be from the cached data structure mapping between the class name and the actual class/class bytes in the EquinoxClassLoader.
I realized the wrong class returned is java/util/concurrent/locks/ReentrantReadWriteLock$WriteLock
, which is a bootstrap class. It is not a class that has just been loaded by thread 0xd23600
. It is a class that has already been loaded by bootstrap class loader before. It is very likely both threads get the class from bootstraploader.findLoadedClass(String classname)
through delegation all the way upto the bootstrap class loader. It is just thread 0xd23600
is actually looking for java/util/concurrent/locks/ReentrantReadWriteLock$WriteLock
and but thread 0xc64900
is looking for a different class. The class name might becomes wrong somewhere in between.
Could there be some inconsistency with allowing another thread to get a non-null result from the call to ClassLoader.findLoadedClass(String) in step 1 while we are trying to define the shared class in step 4? For performance reasons, I would like to avoid having to get the class-loader/class-name pair lock to call that first findLoadedClass.
It is possible that a thread in step 1 can get a not null class while another thread is defining the class in step 4 (the thread should already be near the end of step 4 after the class is already added to the hashtable keyed on the class name, the class name in step 1 and 4 should be the same).
The class name might becomes wrong somewhere in between.
I'm out of ideas on the Equinox side how that could happen. I assume that must be going wrong down in the JVM?
I'm out of ideas on the Equinox side how that could happen. I assume that must be going wrong down in the JVM?
This is a really strange issue. By looking at the code, I don't see how the JVM could be wrong about the class name either. Not sure if things go wrong inside JVM or before calling into JVM. But we can add assertions in the JVM code that the class being searched for and the class returned has the same name.
@thallium is looking at adding the assertions.
Java -version output
Summary of problem
java.lang.NoClassDefFoundError thrown when loading class com.ibm.ws.classloading.internal.providers.Providers
Diagnostic files
Diagnostic files here captured by option:
-Xdump:system+java+snap:events=throw+systhrow,filter=java/lang/NoClassDefFoundError,msg_filter=com.ibm.ws.classloading.internal.providers.Providers,request=exclusive+prepwalk+serial+preempt