eclipse-ee4j / glassfish-fighterfish

FighterFish project

Deadlock in JavaEEExtender #37

Open bjetal opened 4 years ago

bjetal commented 4 years ago

When deploying and undeploying multiple OSGi bundles with Java EE functionality using FighterFish, we sometimes hit a deadlock between undeployment and deployment of the modules. The two paths take the same locks in opposite order: undeployment locks the bundle first and then JavaEEExtender, while deployment locks JavaEEExtender first and then the bundle.
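
The opposite lock ordering described above can be shown in isolation. The following is a minimal sketch, not FighterFish or Felix code: `LockOrderInversionSketch`, `acquireBoth`, `bundleLock` and `extenderLock` are hypothetical names, and plain `ReentrantLock`s with a timed `tryLock` stand in for the Felix bundle lock and the JavaEEExtender monitor so that the demo terminates instead of hanging.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration of the lock-order inversion; not the project's code. */
class LockOrderInversionSketch {

    /**
     * Takes `first`, waits until the peer thread holds its own first lock,
     * then tries to take `second` with a timeout. Returns whether `second`
     * could be acquired.
     */
    static boolean acquireBoth(ReentrantLock first, ReentrantLock second, CyclicBarrier barrier) {
        first.lock();
        try {
            barrier.await(5, TimeUnit.SECONDS); // both threads now hold their first lock
            boolean gotSecond = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (gotSecond) {
                second.unlock();
            }
            barrier.await(5, TimeUnit.SECONDS); // keep `first` held until the peer has tried too
            return gotSecond;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (BrokenBarrierException | TimeoutException e) {
            return false;
        } finally {
            first.unlock();
        }
    }

    /** Runs a "deploy" and an "undeploy" thread that take the two locks in opposite order. */
    static boolean[] demo() {
        ReentrantLock bundleLock = new ReentrantLock();   // stands in for the Felix bundle lock
        ReentrantLock extenderLock = new ReentrantLock(); // stands in for the JavaEEExtender monitor
        CyclicBarrier barrier = new CyclicBarrier(2);
        boolean[] gotSecondLock = new boolean[2];

        // Deployment order: JavaEEExtender first, then the bundle.
        Thread deploy = new Thread(() -> {
            gotSecondLock[0] = acquireBoth(extenderLock, bundleLock, barrier);
        });
        // Undeployment order: the bundle first, then JavaEEExtender.
        Thread undeploy = new Thread(() -> {
            gotSecondLock[1] = acquireBoth(bundleLock, extenderLock, barrier);
        });
        deploy.start();
        undeploy.start();
        try {
            deploy.join();
            undeploy.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return gotSecondLock;
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("deploy got bundle lock: " + result[0]);
        System.out.println("undeploy got extender lock: " + result[1]);
    }
}
```

In the sketch neither thread can obtain its second lock while the other still holds it. With untimed monitors, as in the real code paths, the same interleaving leaves both threads waiting forever.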

The following fragment of a thread dump shows the issue:

"pool-15-thread-1" #113 prio=5 os_prio=0 tid=0x00007f147c00e800 nid=0x8a4 in Object.wait() [0x00007f14563eb000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at org.apache.felix.framework.Felix.acquireBundleLock(Felix.java:5039)
    - locked <0x0000000721bfb720> (a [Ljava.lang.Object;)
    at org.apache.felix.framework.Felix.startBundle(Felix.java:1866)
    at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:955)
    at org.jvnet.hk2.osgiadapter.OSGiModuleImpl.start(OSGiModuleImpl.java:210)
    - locked <0x0000000721346de0> (a org.jvnet.hk2.osgiadapter.OSGiModuleImpl)
    at org.jvnet.hk2.osgiadapter.OsgiPopulatorPostProcessor$1.loadClass(OsgiPopulatorPostProcessor.java:77)
    at org.jvnet.hk2.internal.ServiceLocatorImpl.loadClass(ServiceLocatorImpl.java:2058)
    at org.jvnet.hk2.internal.ServiceLocatorImpl.reifyDescriptor(ServiceLocatorImpl.java:413)
    at org.jvnet.hk2.internal.ServiceLocatorImpl.narrow(ServiceLocatorImpl.java:2120)
    at org.jvnet.hk2.internal.ServiceLocatorImpl.access$900(ServiceLocatorImpl.java:119)
    at org.jvnet.hk2.internal.ServiceLocatorImpl$8.compute(ServiceLocatorImpl.java:1063)
    at org.jvnet.hk2.internal.ServiceLocatorImpl$8.compute(ServiceLocatorImpl.java:1058)
    at org.glassfish.hk2.utilities.cache.LRUHybridCache$OriginThreadAwareFuture$1.call(LRUHybridCache.java:115)
    at org.glassfish.hk2.utilities.cache.LRUHybridCache$OriginThreadAwareFuture$1.call(LRUHybridCache.java:111)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.glassfish.hk2.utilities.cache.LRUHybridCache$OriginThreadAwareFuture.run(LRUHybridCache.java:173)
    at org.glassfish.hk2.utilities.cache.LRUHybridCache.compute(LRUHybridCache.java:292)
    at org.jvnet.hk2.internal.ServiceLocatorImpl.internalGetDescriptor(ServiceLocatorImpl.java:1147)
    at org.jvnet.hk2.internal.ServiceLocatorImpl.getServiceHandle(ServiceLocatorImpl.java:1395)
    at org.jvnet.hk2.internal.ServiceLocatorImpl.getServiceHandle(ServiceLocatorImpl.java:1384)
    at com.sun.enterprise.v3.server.ContainerStarter.startContainer(ContainerStarter.java:112)
    at com.sun.enterprise.v3.server.ApplicationLifecycle.setupContainer(ApplicationLifecycle.java:997)
    at com.sun.enterprise.v3.server.ApplicationLifecycle.setupContainerInfos(ApplicationLifecycle.java:702)
    - locked <0x000000072068a430> (a org.glassfish.internal.data.ContainerRegistry)
    at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:377)
    at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:219)
    at org.glassfish.osgijavaeebase.OSGiDeploymentRequest.deploy(OSGiDeploymentRequest.java:185)
    at org.glassfish.osgijavaeebase.OSGiDeploymentRequest.execute(OSGiDeploymentRequest.java:120)
    at org.glassfish.osgijavaeebase.AbstractOSGiDeployer.deploy(AbstractOSGiDeployer.java:123)
    at org.glassfish.osgijavaeebase.OSGiContainer.deploy(OSGiContainer.java:154)
    - locked <0x000000072ade0e90> (a org.glassfish.osgijavaeebase.OSGiContainer)
    at org.glassfish.osgijavaeebase.JavaEEExtender.deploy(JavaEEExtender.java:109)
    - locked <0x000000072ade0e70> (a org.glassfish.osgijavaeebase.JavaEEExtender)
    at org.glassfish.osgijavaeebase.JavaEEExtender.access$200(JavaEEExtender.java:61)
    at org.glassfish.osgijavaeebase.JavaEEExtender$HybridBundleTrackerCustomizer$1.call(JavaEEExtender.java:153)
    at org.glassfish.osgijavaeebase.JavaEEExtender$HybridBundleTrackerCustomizer$1.call(JavaEEExtender.java:150)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

...

"FelixFrameworkWiring" #369 daemon prio=5 os_prio=0 tid=0x00007f14682cb000 nid=0x1296 waiting for monitor entry [0x00007f1474a59000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.glassfish.osgijavaeebase.JavaEEExtender.undeploy(JavaEEExtender.java:122)
    - waiting to lock <0x000000072ade0e70> (a org.glassfish.osgijavaeebase.JavaEEExtender)
    at org.glassfish.osgijavaeebase.JavaEEExtender.access$600(JavaEEExtender.java:61)
    at org.glassfish.osgijavaeebase.JavaEEExtender$HybridBundleTrackerCustomizer.removedBundle(JavaEEExtender.java:206)
    at org.osgi.util.tracker.BundleTracker$Tracked.customizerRemoved(BundleTracker.java:491)
    at org.osgi.util.tracker.BundleTracker$Tracked.customizerRemoved(BundleTracker.java:414)
    at org.osgi.util.tracker.AbstractTracked.untrack(AbstractTracked.java:341)
    at org.osgi.util.tracker.BundleTracker$Tracked.bundleChanged(BundleTracker.java:449)
    at org.apache.felix.framework.util.EventDispatcher.invokeBundleListenerCallback(EventDispatcher.java:868)
    at org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:789)
    at org.apache.felix.framework.util.EventDispatcher.fireBundleEvent(EventDispatcher.java:514)
    at org.apache.felix.framework.Felix.fireBundleEvent(Felix.java:4403)
    at org.apache.felix.framework.Felix.stopBundle(Felix.java:2520)
    at org.apache.felix.framework.Felix$RefreshHelper.stop(Felix.java:4792)
    at org.apache.felix.framework.Felix.refreshPackages(Felix.java:4104)
    at org.apache.felix.framework.FrameworkWiringImpl.run(FrameworkWiringImpl.java:178)
    at java.lang.Thread.run(Thread.java:745)

...

"fileinstall-/opt/glassfish/glassfish/domains/extension/autodeploy/bundles/" #110 daemon prio=5 os_prio=0 tid=0x00007f145c8b3000 nid=0x8a1 in Object.wait() [0x00007f14566ef000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at org.apache.felix.framework.Felix.acquireBundleLock(Felix.java:5039)
    - locked <0x0000000721bfb720> (a [Ljava.lang.Object;)
    at org.apache.felix.framework.Felix.startBundle(Felix.java:1866)
    at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:955)
    at org.apache.felix.fileinstall.internal.DirectoryWatcher.process(DirectoryWatcher.java:1175)
    at org.apache.felix.fileinstall.internal.DirectoryWatcher.process(DirectoryWatcher.java:1153)
    at org.apache.felix.fileinstall.internal.DirectoryWatcher.processAllBundles(DirectoryWatcher.java:1146)
    at org.apache.felix.fileinstall.internal.DirectoryWatcher.process(DirectoryWatcher.java:456)
    at org.apache.felix.fileinstall.internal.DirectoryWatcher.run(DirectoryWatcher.java:263)

bjetal commented 4 years ago

We have a local fix that replaces the strict monitor synchronization in JavaEEExtender with a ReadWriteLock: deploy and undeploy are treated as reads, while start and stop are treated as writes. Multiple deploys and undeploys can then proceed in parallel, which prevents the deadlock documented above.
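
A rough sketch of that locking scheme follows. It is not the actual patch: `ExtenderSketch`, `lifecycleLock`, `runShared` and the `Runnable`-based `deploy`/`undeploy` signatures are hypothetical stand-ins, and the real JavaEEExtender does considerably more work in each method.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for JavaEEExtender; illustrates the locking scheme only. */
class ExtenderSketch {
    private final ReadWriteLock lifecycleLock = new ReentrantReadWriteLock();
    private boolean started; // guarded by lifecycleLock

    // start and stop change the extender's state, so they take the write lock.
    void start() { setStarted(true); }
    void stop() { setStarted(false); }

    private void setStarted(boolean value) {
        lifecycleLock.writeLock().lock();
        try {
            started = value;
        } finally {
            lifecycleLock.writeLock().unlock();
        }
    }

    // deploy and undeploy only need the extender to stay started, so they share the read lock.
    boolean deploy(Runnable work) { return runShared(work); }
    boolean undeploy(Runnable work) { return runShared(work); }

    private boolean runShared(Runnable work) {
        lifecycleLock.readLock().lock();
        try {
            if (!started) {
                return false; // extender not running, nothing to do
            }
            work.run();
            return true;
        } finally {
            lifecycleLock.readLock().unlock();
        }
    }

    /**
     * Starts a deploy that waits (up to 5 seconds) for an undeploy to run.
     * Returns true if the undeploy ran before the deploy gave up, which would
     * not be possible if an in-progress deploy excluded undeploy.
     */
    static boolean undeployRunsDuringDeploy() {
        ExtenderSketch extender = new ExtenderSketch();
        extender.start();
        CountDownLatch undeployRan = new CountDownLatch(1);
        boolean[] sawUndeploy = new boolean[1];

        Thread deployer = new Thread(() -> extender.deploy(() -> {
            try {
                sawUndeploy[0] = undeployRan.await(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        Thread undeployer = new Thread(() -> extender.undeploy(undeployRan::countDown));
        deployer.start();
        undeployer.start();
        try {
            deployer.join();
            undeployer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return sawUndeploy[0];
    }

    public static void main(String[] args) {
        System.out.println("undeploy ran while deploy was in progress or earlier: "
                + undeployRunsDuringDeploy());
    }
}
```

Because the undeploy path no longer waits for a deploy that holds the extender lock, the thread that owns the bundle lock can finish and release it, which breaks the cycle. Note that `start` and `stop` still exclude every deploy and undeploy, so this sketch assumes those two methods never wait on a bundle lock themselves.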