Open adamretter opened 4 years ago
The java.lang.IllegalStateException: Shutdown in progress error comes from JettyStart#shutdown() when it tries to removeShutdownHook.
The stack trace to the JettyStart#shutdown() function looks like:
java.lang.IllegalStateException: Shutdown in progress
at java.lang.ApplicationShutdownHooks.remove(ApplicationShutdownHooks.java:82) ~[?:1.8.0_252]
at java.lang.Runtime.removeShutdownHook(Runtime.java:239) ~[?:1.8.0_252]
at org.exist.jetty.JettyStart.lambda$9(JettyStart.java:537) ~[classes/:?]
at java.util.Optional.ifPresent(Optional.java:159) [?:1.8.0_252]
at org.exist.jetty.JettyStart.shutdown(JettyStart.java:535) [classes/:?]
at org.exist.test.ExistWebServer.after(ExistWebServer.java:148) [classes/:?]
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:59) [junit-4.13.jar:4.13]
at org.junit.rules.RunRules.evaluate(RunRules.java:20) [junit-4.13.jar:4.13]
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) [junit-4.13.jar:4.13]
at org.junit.runners.ParentRunner.run(ParentRunner.java:413) [junit-4.13.jar:4.13]
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:364) [surefire-junit4-3.0.0-M5.jar:3.0.0-M5]
at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:272) [surefire-junit4-3.0.0-M5.jar:3.0.0-M5]
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:237) [surefire-junit4-3.0.0-M5.jar:3.0.0-M5]
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:158) [surefire-junit4-3.0.0-M5.jar:3.0.0-M5]
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:428) [surefire-booter-3.0.0-M5.jar:3.0.0-M5]
at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162) [surefire-booter-3.0.0-M5.jar:3.0.0-M5]
at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:562) [surefire-booter-3.0.0-M5.jar:3.0.0-M5]
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:548) [surefire-booter-3.0.0-M5.jar:3.0.0-M5]
It looks like the code in JettyStart#shutdown() should be modified from:
shutdownHookThread.ifPresent(Runtime.getRuntime()::removeShutdownHook);
to:
shutdownHookThread.ifPresent(thread -> {
    try {
        Runtime.getRuntime().removeShutdownHook(thread);
        logger.debug("BrokerPoolsAndJetty.ShutdownHook hook unregistered");
    } catch (final IllegalStateException e) {
        // Shutdown in progress
        logger.warn("Unable to remove BrokerPoolsAndJetty.ShutdownHook hook: " + e.getMessage());
    }
});
With the above fix in-place, the next step is that we likely need to talk to the Surefire Plugin developers to get some assistance on how to pinpoint where the issue is in eXist-db.
When looking into the actual JDK code, it seems that java.lang.IllegalStateException: Shutdown in progress is thrown by the JDK if a shutdown hook is added or removed while ApplicationShutdownHooks.runHooks() is being executed. So you need to handle a possible java.lang.IllegalStateException both when adding and when removing a hook.
I could imagine that the Surefire Plugin forks a new JVM to run the tests in and stops it externally after the timeout.
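A minimal sketch of what guarding both operations could look like. The class name ShutdownHookSupport and its two methods are hypothetical and not part of eXist-db, and java.util.logging is used here only to keep the example self-contained (the eXist-db code uses its own logger):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not eXist-db code: wraps the two Runtime calls that
// throw IllegalStateException once the JVM has started shutting down.
final class ShutdownHookSupport {
    private static final Logger LOG = Logger.getLogger(ShutdownHookSupport.class.getName());

    private ShutdownHookSupport() {
    }

    /**
     * Registers the hook. Returns false if the JVM is already shutting down.
     * Note: Runtime#addShutdownHook still throws IllegalArgumentException if
     * the same hook is registered twice; that is deliberately not caught here.
     */
    static boolean tryAdd(final Thread hook) {
        try {
            Runtime.getRuntime().addShutdownHook(hook);
            return true;
        } catch (final IllegalStateException e) {
            // Shutdown in progress
            LOG.log(Level.WARNING, "Unable to add shutdown hook: {0}", e.getMessage());
            return false;
        }
    }

    /**
     * De-registers the hook. Returns false if the hook was not registered,
     * or if the JVM is already shutting down.
     */
    static boolean tryRemove(final Thread hook) {
        try {
            return Runtime.getRuntime().removeShutdownHook(hook);
        } catch (final IllegalStateException e) {
            // Shutdown in progress
            LOG.log(Level.WARNING, "Unable to remove shutdown hook: {0}", e.getMessage());
            return false;
        }
    }
}
```

With something like this, a call such as shutdownHookThread.ifPresent(ShutdownHookSupport::tryRemove) would no longer propagate the exception when the JVM is already on its way down. It would only hide the symptom, though, and would not explain why shutdown() is being called that late.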
Thanks @reinhapa I have fixed that part now, however I still have problems with JmxRemoteTest and Surefire together. I will try and update the description by running all the tests again soon...
@adamretter Reviewing the list of outstanding items during today's Community Call, we wondered: Do you have any updates on this one?
@joewiz No updates, sorry.
Initially discovered in https://github.com/eXist-db/exist/pull/3341
Problem 1 (running single test):
Shows that JmxRemoteTest executes and passes 2 tests quickly, but then the command appears to do nothing for a long period of time. This is due to the timeout above being extended. Whilst waiting for the timeout, the jstack for the process reports the following interesting threads:
After the command completes, the following log files are available:
exist-core/target/surefire-reports/2020-08-18T16-22-21_644-jvmRun1.dumpstream:
exist-core/target/surefire-reports/org.exist.management.JmxRemoteTest.txt:
Even though all the tests passed, both the large delay until the timeout is met and the 2020-08-18T16-22-21_644-jvmRun1.dumpstream file indicate that the test did not behave correctly. The test should shut down nicely without needing to have Surefire wait for a long timeout and then kill the process.
Problem 2 (running all tests):
Shows that many tests (including JmxRemoteTest) executed ok, but then the command appears to do nothing for a long period of time. This is due to the timeout above being extended. Whilst waiting for the timeout, the jstack for the process reports the following interesting threads:
After the command completes, the following log files are available:
exist-core/target/surefire-reports/2020-08-18T16-39-44_995-jvmRun6.dumpstream:
exist-core/target/surefire-reports/org.exist.management.JmxRemoteTest.txt:
Even though all the tests passed, both the large delay until the timeout is met and the 2020-08-18T16-39-44_995-jvmRun6.dumpstream file indicate that a test did not behave correctly. Each test should shut down nicely without needing to have Surefire wait for a long timeout and then kill the process.
We can offer further evidence that the underlying issue is likely exhibited via the JmxRemoteTest by adding the line:
to the bottom of maven-surefire-plugin's excludes configuration in exist-core/pom.xml, and then re-running all the tests using the commands given above. When this is done, running the tests exhibits no large delay (because there is no wait for the timeout), and there are no *.dumpstream files generated.
Problem 3 (running single test / Updated surefire: 3.0.0-M5):
This repeats the process of Problem 1, however the maven-surefire-plugin version in exist-parent/pom.xml is updated from 3.0.0-M4 to 3.0.0-M5. This results in a different set of errors:
After the command completes, the following log files are available:
exist-core/target/surefire-reports/2020-08-18T17-10-04_899-jvmRun1.dump:
exist-core/target/surefire-reports/2020-08-18T17-10-04_899-jvmRun1.dumpstream:
exist-core/target/surefire-reports/org.exist.management.JmxRemoteTest.txt:
The java.lang.IllegalStateException: Shutdown in progress messages seem to be originating from something calling either java.lang.Runtime#addShutdownHook(Thread) or java.lang.Runtime#removeShutdownHook(Thread).
exist-core/target/test-logs-20200818171030221/exist.log:
exist-core/target/test-logs-20200818171030221/urlrewrite.log:
Problem 4 (running all tests / Updated surefire: 3.0.0-M5):
This repeats the process of Problem 2, however the maven-surefire-plugin version in exist-parent/pom.xml is updated from 3.0.0-M4 to 3.0.0-M5. This results in a very similar set of errors to Problem 3.
Working Theories
Perhaps there is some concurrency problem with the eXist-db JMX code, where some shared state may not be correctly synchronized. The JMX code is quite messy, especially the setup of the MBeans and Remote Server (Servlet).
We add/remove JVM Shutdown Hooks in eXist-db from JettyStart and BrokerPools. I think that Surefire also makes use of Shutdown hooks, so perhaps we are interfering with those somehow. That might explain the java.lang.IllegalStateException: Shutdown in progress messages.