Unable to start and stop multiple times using Bootstrapper

windmueller commented 8 years ago

In my unit tests I would like to start and stop several RESTHeart instances. The Bootstrapper class seems to be the perfect choice for this task, but it does not work as expected. Here is the corresponding log output:

INFO  org.restheart.Bootstrapper - RESTHeart started
INFO  org.restheart.Bootstrapper - Stopping RESTHeart...
INFO  org.restheart.Bootstrapper - Waiting for pending request to complete (up to 1 minute)...
INFO  org.restheart.Bootstrapper - Closing MongoDB client connections...
INFO  org.restheart.Bootstrapper - Removing the pid file /tmp/restheart-0.pid
INFO  org.restheart.Bootstrapper - Cleaning up temporary directories...
INFO  org.restheart.Bootstrapper - RESTHeart stopped

Process finished with exit code 254

Please notice the exit code at the end, it prevents any other test cases from working properly.

mkjsix commented 8 years ago

This seems to be the problem:

Surefire fails with the message "The forked VM terminated without properly saying goodbye". Surefire does not support tests or any referenced libraries calling System.exit() at any time. If they do so, they are incompatible with Surefire and you should probably file an issue with the library/vendor. Alternatively the forked VM could also have crashed for a number of reasons. Look for the classical "hs_err*" files indicating VM crashes or examine the Maven log output when the tests execute. Some "extraordinary" output from crashing processes may be dumped to the console/log. If this happens on a CI environment and only after it runs for some time, there is a fair chance your test suite is leaking some kind of OS-level resource that makes things worse at every run. Regular OS-level monitoring tools may give you some indication.

Rif: http://maven.apache.org/surefire/maven-surefire-plugin/faq.html#vm-termination

We probably have to get rid of any unnecessary System.exit call

From Oracle's Portability Cookbook: 7. Pitfall: Misuse of System.exit Explanation: The System.exit method forces termination of all threads in the Java virtual machine. This is drastic. It might, for example, destroy all windows created by the interpreter without giving the user a chance to record or even read their contents. Solution: Programs should usually terminate by stopping all non-daemon threads; in the simplest case of a command-line program, this is as easy as returning from the main method. System.exit should be reserved for a catastrophic error exit, or for cases when a program is intended for use as a utility in a command script that may depend on the program’s exit code.

Rif: http://www.oracle.com/technetwork/java/100percentpurejavacookbook-4-1-1-150165.pdf

ujibang commented 8 years ago

does this apply also using Shutdowner.shutdown() method?

mkjsix commented 8 years ago

Yes. I replaced Bootstrapper.shutdown(); with Shutdowner.shutdown(null);

Caused by: java.lang.RuntimeException: The forked VM terminated without saying properly goodbye. VM crash or System.exit called ?
    at org.apache.maven.plugin.surefire.booterclient.output.ForkClient.close(ForkClient.java:257)

But if I remove the System.exit() calls I can verify the real source of the problem, which is a java.net.BindException: Address already in use

Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 3.426 sec <<< FAILURE! - in org.restheart.BootstrapperTest
testMultipleStartStop(org.restheart.BootstrapperTest)  Time elapsed: 3.041 sec  <<< ERROR!
java.lang.RuntimeException: java.net.BindException: Address already in use
    at org.restheart.BootstrapperTest.testMultipleStartStop(BootstrapperTest.java:17)
Caused by: java.net.BindException: Address already in use
    at org.restheart.BootstrapperTest.testMultipleStartStop(BootstrapperTest.java:17)

So while shutting-down the bound address must be released.

mkjsix commented 8 years ago

Found the solution by calling io.undertow.Undertow stop() in the Bootstrapper.stopServer() method. It's not well documented, but the stop() method should really exit the HTTP server and release the bound address, so that a successive start() call can succeed.

mkjsix commented 8 years ago

Hi @stovocor, please tell us if the fix solves your issue. I have just pushed it on master. Thank you.

windmueller commented 8 years ago

The original issue is gone, but I am still unable to use the different instances in my test case. From what it looks like, a connection to RESTHeart is not possible after the second start. I tried to write a test case with

Unirest.get("http://localhost:8080").asString();

but then I receive a totally different exception:

[XNIO-1 I/O-1] ERROR org.xnio.listener - XNIO001007: A channel event listener threw an exception
java.lang.NoSuchMethodError: io.undertow.server.HttpServerExchange.<init>(Lio/undertow/server/ServerConnection;J)V
    at io.undertow.server.protocol.http.HttpReadListener.handleEventWithNoRunningRequest(HttpReadListener.java:180)
    at io.undertow.server.protocol.http.HttpReadListener.handleEvent(HttpReadListener.java:131)
    at io.undertow.server.protocol.http.HttpOpenListener.handleEvent(HttpOpenListener.java:145)
    at io.undertow.server.protocol.http.HttpOpenListener.handleEvent(HttpOpenListener.java:92)
    at io.undertow.server.protocol.http.HttpOpenListener.handleEvent(HttpOpenListener.java:51)
    at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:92)
    at org.xnio.ChannelListeners$10.handleEvent(ChannelListeners.java:291)
    at org.xnio.ChannelListeners$10.handleEvent(ChannelListeners.java:286)
    at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:92)
    at org.xnio.nio.QueuedNioTcpServer$1.run(QueuedNioTcpServer.java:128)
    at org.xnio.nio.WorkerThread.safeRun(WorkerThread.java:580)
    at org.xnio.nio.WorkerThread.run(WorkerThread.java:464)

Additionally it seems that my test case requires a running MongoDB instance. Should this be started automatically in the test?

mkjsix commented 8 years ago

If I look at the test case you committed, then you are not using different instances, but starting and stopping within the same JVM. I made a quick search with that error message but I've found nothing relevant so far, we need to search more.

I agree it would be best to automatically start MongoDB automatically for integration tests, this is what our travis-ci configuration actually does. For local tests I'm keen on using docker for this purpose. As we're very busy now, we'll have a look at that in the following days, but feel free to submit a new PR for this meanwhile.

windmueller commented 8 years ago

With instances I meant RESTHeart processes started by the Bootstrapper in the same JVM. Sorry for the confusion.

mkjsix commented 8 years ago

Ok, I assume you are running some integration tests. But in that case, if you start and stop the Restheart instance in the same JVM between two groups of tests, your tests are not isolated by definition. For example, in our production environment we usually run Restheart within docker containers, and when we shut down the Restheart instance we not only shutdown it and the JVM but we even destroy the whole docker container. Any successive start is perfectly clean, as we create a brand new docker container for it. That's why we have never encountered the same issues, even the one we fixed yesterday: our test and production runs are always totally stateless.

windmueller commented 8 years ago

In my understanding, two RESTHeart instances with the same configuration should not differ at all in our environment, since we are only using stateless calls. That is why we can circumvent the current issue by starting RESTHeart once per test class.

However, in my opinion it should be possible to use the start and stop methods of the Bootstrapper class multiple times without any unexpected side effects.

ujibang commented 8 years ago

Hi @stovocor,

if you run the integration tests (mvn verify -DskipITs=false) you'll notice that RESTHeart is started (only once) automatically before tests execution (of course mongodb must be running as well).

This is basically achieved by passing the --fork option to restheart (note this option is not supported on windows) which runs restheart as a daemon

You might want to give our pom.xml a look to check how we have automated this.

mkjsix commented 8 years ago

I'm closing this for lack of activity. Feel free to re-open in case.

SoftInstigate / restheart

Unable to start and stop multiple times using Bootstrapper #143