WrenSecurity / wrenam

Community fork of OpenAM, an authentication and authorization system originally developed by ForgeRock.
Other
43 stars 27 forks source link

Unable to shutdown AM in a clean way #63

Closed pavelhoral closed 1 year ago

pavelhoral commented 1 year ago

When I shut down docker image (#61) it hangs indefinitely due to the following thread:

"Thread-1" #33 prio=5 os_prio=0 cpu=19.86ms elapsed=90.87s tid=0x00007f03740209a0 nid=0x265 waiting on condition  [0x00007f03685fc000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at jdk.internal.misc.Unsafe.park(java.base@17.0.5/Native Method)
        - parking to wait for  <0x00000000bba8ec00> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(java.base@17.0.5/LockSupport.java:252)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(java.base@17.0.5/AbstractQueuedSynchronizer.java:1672)
        at java.util.concurrent.ArrayBlockingQueue.poll(java.base@17.0.5/ArrayBlockingQueue.java:435)
        at org.forgerock.openam.cts.impl.queue.AsyncResultHandler.getResults(AsyncResultHandler.java:86)
        at org.forgerock.openam.cts.monitoring.impl.queue.TokenMonitoringResultHandler.getResults(TokenMonitoringResultHandler.java:50)
        at org.forgerock.openam.cts.monitoring.impl.queue.TokenMonitoringResultHandler.getResults(TokenMonitoringResultHandler.java:27)
        at org.forgerock.openam.cts.CTSPersistentStoreImpl.update(CTSPersistentStoreImpl.java:131)
        at org.forgerock.openam.cts.CTSPersistentStoreImpl.update(CTSPersistentStoreImpl.java:124)
        at org.forgerock.openam.session.service.access.persistence.SessionPersistenceStore.save(SessionPersistenceStore.java:121)
        at org.forgerock.openam.session.service.access.persistence.InternalSessionPersistenceStore.store(InternalSessionPersistenceStore.java:61)
        at org.forgerock.openam.session.service.access.persistence.InternalSessionStoreChain$ChainIterator.store(InternalSessionStoreChain.java:99)
        at org.forgerock.openam.session.service.access.persistence.SessionPersistenceManagerStep.store(SessionPersistenceManagerStep.java:60)
        at org.forgerock.openam.session.service.access.persistence.InternalSessionStoreChain$ChainIterator.store(InternalSessionStoreChain.java:97)
        at org.forgerock.openam.session.service.access.persistence.caching.InMemoryInternalSessionCacheStep.store(InMemoryInternalSessionCacheStep.java:125)
        at org.forgerock.openam.session.service.access.persistence.InternalSessionStoreChain$ChainIterator.store(InternalSessionStoreChain.java:97)
        at org.forgerock.openam.session.service.access.persistence.AbstractInternalSessionStoreStep.store(AbstractInternalSessionStoreStep.java:46)
        at org.forgerock.openam.session.service.access.persistence.InternalSessionStoreChain$ChainIterator.store(InternalSessionStoreChain.java:97)
        at org.forgerock.openam.session.service.access.persistence.InternalSessionStoreChain.store(InternalSessionStoreChain.java:55)
        at org.forgerock.openam.session.service.access.persistence.SessionPersistenceManagerStep.notifyUpdate(SessionPersistenceManagerStep.java:74)
        at com.iplanet.dpro.session.service.InternalSession.notifyPersistenceManager(InternalSession.java:1112)
        at com.iplanet.dpro.session.service.InternalSession.setLatestAccessTime(InternalSession.java:830)
        at com.iplanet.dpro.session.operations.strategies.LocalOperations.getSessionInfo(LocalOperations.java:197)
        at com.iplanet.dpro.session.operations.strategies.LocalOperations.refresh(LocalOperations.java:112)
        at com.iplanet.dpro.session.monitoring.MonitoredOperations.refresh(MonitoredOperations.java:67)
        at com.iplanet.dpro.session.Session.doRefresh(Session.java:763)
        at com.iplanet.dpro.session.Session.access$000(Session.java:84)
        at com.iplanet.dpro.session.Session$1.run(Session.java:739)
        at com.sun.identity.session.util.RestrictedTokenContext.doUsing(RestrictedTokenContext.java:81)
        at com.iplanet.dpro.session.Session.refresh(Session.java:736)
        at org.forgerock.openam.session.SessionCache.getSession(SessionCache.java:253)
        at org.forgerock.openam.session.SessionCache.getSession(SessionCache.java:203)
        at org.forgerock.openam.session.SessionCache.getSession(SessionCache.java:188)
        at com.iplanet.sso.providers.dpro.SSOProviderImpl.destroyToken(SSOProviderImpl.java:343)
        at com.iplanet.sso.SSOTokenManager.destroyToken(SSOTokenManager.java:490)
        at com.sun.identity.security.AdminTokenAction.resetInstance(AdminTokenAction.java:182)
        at com.sun.identity.security.AdminTokenAction.reset(AdminTokenAction.java:176)
        at com.sun.identity.security.AdminTokenAction$1.shutdown(AdminTokenAction.java:131)
        at com.sun.identity.common.ShutdownManager.shutdown(ShutdownManager.java:218)
        at com.sun.identity.common.ShutdownServletContextListener.contextDestroyed(ShutdownServletContextListener.java:51)
        at org.apache.catalina.core.StandardContext.listenerStop(StandardContext.java:4812)
        at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5463)
        - locked <0x00000000803f1ce8> (a org.apache.catalina.core.StandardContext)
        at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:257)
        - locked <0x00000000803f1ce8> (a org.apache.catalina.core.StandardContext)
        at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1409)
        at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1398)
        at java.util.concurrent.FutureTask.run(java.base@17.0.5/FutureTask.java:264)
        at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75)
        at java.util.concurrent.AbstractExecutorService.submit(java.base@17.0.5/AbstractExecutorService.java:145)
        at org.apache.catalina.core.ContainerBase.stopInternal(ContainerBase.java:983)
        - locked <0x00000000800d8040> (a org.apache.catalina.core.StandardHost)
        at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:257)
        - locked <0x00000000800d8040> (a org.apache.catalina.core.StandardHost)
        at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1409)
        at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1398)
        at java.util.concurrent.FutureTask.run(java.base@17.0.5/FutureTask.java:264)
        at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75)
        at java.util.concurrent.AbstractExecutorService.submit(java.base@17.0.5/AbstractExecutorService.java:145)
        at org.apache.catalina.core.ContainerBase.stopInternal(ContainerBase.java:983)
        - locked <0x00000000800d7ee8> (a org.apache.catalina.core.StandardEngine)
        at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:257)
        - locked <0x00000000800d7ee8> (a org.apache.catalina.core.StandardEngine)
        at org.apache.catalina.core.StandardService.stopInternal(StandardService.java:495)
        - locked <0x00000000800d7ee8> (a org.apache.catalina.core.StandardEngine)
        at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:257)
        - locked <0x00000000800c57f8> (a org.apache.catalina.core.StandardService)
        at org.apache.catalina.core.StandardServer.stopInternal(StandardServer.java:982)
        at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:257)
        - locked <0x00000000801b0250> (a org.apache.catalina.core.StandardServer)
        at org.apache.catalina.startup.Catalina.stop(Catalina.java:849)
        at org.apache.catalina.startup.Catalina$CatalinaShutdownHook.run(Catalina.java:996)

I have not invested any time in investigating the root cause yet.

pavelhoral commented 1 year ago

This might be issue only for embedded DS (i.e. not a production issue). Seems it is caused by asynchronous processing of the shutdown hook - admin token is being destroyed after CTS store (embedded DS) connection processing is closed.

pavelhoral commented 1 year ago

Just tested with external DS instance and the race condition with CTS is still present:

Stack trace of the blocking thread during shutdown ``` "main" #1 prio=5 os_prio=0 cpu=16139,09ms elapsed=364,27s tid=0x00007f4a44017640 nid=0xda90 waiting on condition [0x00007f4a4bdfc000] java.lang.Thread.State: TIMED_WAITING (parking) at jdk.internal.misc.Unsafe.park(java.base@17.0.6/Native Method) - parking to wait for <0x00000006f112afe0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(java.base@17.0.6/LockSupport.java:252) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(java.base@17.0.6/AbstractQueuedSynchronizer.java:1672) at java.util.concurrent.ArrayBlockingQueue.poll(java.base@17.0.6/ArrayBlockingQueue.java:435) at org.forgerock.openam.cts.impl.queue.AsyncResultHandler.getResults(AsyncResultHandler.java:85) at org.forgerock.openam.cts.monitoring.impl.queue.TokenMonitoringResultHandler.getResults(TokenMonitoringResultHandler.java:50) at org.forgerock.openam.cts.monitoring.impl.queue.TokenMonitoringResultHandler.getResults(TokenMonitoringResultHandler.java:27) at org.forgerock.openam.cts.CTSPersistentStoreImpl.update(CTSPersistentStoreImpl.java:131) at org.forgerock.openam.cts.CTSPersistentStoreImpl.update(CTSPersistentStoreImpl.java:124) at org.forgerock.openam.session.service.access.persistence.SessionPersistenceStore.save(SessionPersistenceStore.java:121) at org.forgerock.openam.session.service.access.persistence.InternalSessionPersistenceStore.store(InternalSessionPersistenceStore.java:61) at org.forgerock.openam.session.service.access.persistence.InternalSessionStoreChain$ChainIterator.store(InternalSessionStoreChain.java:99) at org.forgerock.openam.session.service.access.persistence.SessionPersistenceManagerStep.store(SessionPersistenceManagerStep.java:60) at org.forgerock.openam.session.service.access.persistence.InternalSessionStoreChain$ChainIterator.store(InternalSessionStoreChain.java:97) at org.forgerock.openam.session.service.access.persistence.caching.InMemoryInternalSessionCacheStep.store(InMemoryInternalSessionCacheStep.java:125) at org.forgerock.openam.session.service.access.persistence.InternalSessionStoreChain$ChainIterator.store(InternalSessionStoreChain.java:97) at org.forgerock.openam.session.service.access.persistence.AbstractInternalSessionStoreStep.store(AbstractInternalSessionStoreStep.java:46) at org.forgerock.openam.session.service.access.persistence.InternalSessionStoreChain$ChainIterator.store(InternalSessionStoreChain.java:97) at org.forgerock.openam.session.service.access.persistence.InternalSessionStoreChain.store(InternalSessionStoreChain.java:55) at org.forgerock.openam.session.service.access.persistence.SessionPersistenceManagerStep.notifyUpdate(SessionPersistenceManagerStep.java:74) at com.iplanet.dpro.session.service.InternalSession.notifyPersistenceManager(InternalSession.java:1130) at com.iplanet.dpro.session.service.InternalSession.setLatestAccessTime(InternalSession.java:848) at com.iplanet.dpro.session.operations.strategies.LocalOperations.getSessionInfo(LocalOperations.java:197) at com.iplanet.dpro.session.operations.strategies.LocalOperations.refresh(LocalOperations.java:112) at com.iplanet.dpro.session.monitoring.MonitoredOperations.refresh(MonitoredOperations.java:67) at com.iplanet.dpro.session.Session.doRefresh(Session.java:765) at com.iplanet.dpro.session.Session.access$000(Session.java:84) at com.iplanet.dpro.session.Session$1.run(Session.java:741) at com.sun.identity.session.util.RestrictedTokenContext.doUsing(RestrictedTokenContext.java:81) at com.iplanet.dpro.session.Session.refresh(Session.java:737) at org.forgerock.openam.session.SessionCache.getSession(SessionCache.java:253) at org.forgerock.openam.session.SessionCache.getSession(SessionCache.java:203) at org.forgerock.openam.session.SessionCache.getSession(SessionCache.java:188) at com.iplanet.sso.providers.dpro.SSOProviderImpl.destroyToken(SSOProviderImpl.java:343) at com.iplanet.sso.SSOTokenManager.destroyToken(SSOTokenManager.java:490) at com.sun.identity.security.AdminTokenAction.resetInstance(AdminTokenAction.java:182) at com.sun.identity.security.AdminTokenAction.reset(AdminTokenAction.java:176) at com.sun.identity.security.AdminTokenAction$1.shutdown(AdminTokenAction.java:131) at com.sun.identity.common.ShutdownManager.shutdown(ShutdownManager.java:218) at com.sun.identity.common.ShutdownServletContextListener.contextDestroyed(ShutdownServletContextListener.java:51) at org.apache.catalina.core.StandardContext.listenerStop(StandardContext.java:4762) at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5433) - locked <0x00000006892f4650> (a org.apache.catalina.core.StandardContext) at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:257) - locked <0x00000006892f4650> (a org.apache.catalina.core.StandardContext) at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1400) at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1389) at java.util.concurrent.FutureTask.run(java.base@17.0.6/FutureTask.java:264) at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75) at java.util.concurrent.AbstractExecutorService.submit(java.base@17.0.6/AbstractExecutorService.java:145) at org.apache.catalina.core.ContainerBase.stopInternal(ContainerBase.java:976) - locked <0x00000006892eccb0> (a org.apache.catalina.core.StandardHost) at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:257) - locked <0x00000006892eccb0> (a org.apache.catalina.core.StandardHost) at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1400) at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1389) at java.util.concurrent.FutureTask.run(java.base@17.0.6/FutureTask.java:264) at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75) at java.util.concurrent.AbstractExecutorService.submit(java.base@17.0.6/AbstractExecutorService.java:145) at org.apache.catalina.core.ContainerBase.stopInternal(ContainerBase.java:976) - locked <0x00000006892f5790> (a org.apache.catalina.core.StandardEngine) at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:257) - locked <0x00000006892f5790> (a org.apache.catalina.core.StandardEngine) at org.apache.catalina.core.StandardService.stopInternal(StandardService.java:498) - locked <0x00000006892f5790> (a org.apache.catalina.core.StandardEngine) at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:257) - locked <0x00000006890fa148> (a org.apache.catalina.core.StandardService) at org.apache.catalina.core.StandardServer.stopInternal(StandardServer.java:982) at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:257) - locked <0x00000006890cd048> (a org.apache.catalina.core.StandardServer) at org.apache.catalina.startup.Catalina.stop(Catalina.java:849) at org.apache.catalina.startup.Catalina.start(Catalina.java:811) at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@17.0.6/Native Method) at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@17.0.6/NativeMethodAccessorImpl.java:77) at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@17.0.6/DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(java.base@17.0.6/Method.java:568) at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:342) at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:473) ```

Sometimes the shutdown completes and the following exception is logged to standard output:

Exception being thrown when the server actually shutdowns without waiting for CTS ``` Apr 26, 2023 1:20:20 PM org.apache.catalina.loader.WebappClassLoaderBase checkStateForResourceLoading INFO: Illegal access: this web application instance has been stopped already. Could not load [org.wrensecurity.guava.common.io.Files]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access. java.lang.IllegalStateException: Illegal access: this web application instance has been stopped already. Could not load [org.wrensecurity.guava.common.io.Files]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access. at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForResourceLoading(WebappClassLoaderBase.java:1384) at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForClassLoading(WebappClassLoaderBase.java:1372) at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1225) at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1187) at org.forgerock.openam.core.rest.docs.api.ApiDocsService.getAsciiDoc(ApiDocsService.java:205) at org.forgerock.openam.core.rest.docs.api.ApiDocsService.access$200(ApiDocsService.java:74) at org.forgerock.openam.core.rest.docs.api.ApiDocsService$2.call(ApiDocsService.java:238) at org.forgerock.openam.core.rest.docs.api.ApiDocsService$2.call(ApiDocsService.java:233) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at java.base/java.lang.Thread.run(Thread.java:833) ```
pavelhoral commented 1 year ago

Some additional information:

There are two questions regarding what is happening:

And also the correct solution might be to actually shutdown executor services as the last step... btw. ExtendedExecutorServiceFactory is marked as temporary in its JavaDoc and that might be connected to the issue as well.

karelmaxa commented 1 year ago

The issue has been fixed through #95.