NS_ERROR_ILLEGAL_DURING_SHUTDOWN on parentWindow.openDialog in slLauncher.openBrowser

mykmelez commented 9 years ago

I'm seeing a lot of intermittent test failures, like https://travis-ci.org/andreasgal/j2me.js/builds/44157722. It looks like the problem is in TestAudioPlayer.read:

Update event: closed null java.io.IOException:

java/lang/Throwable.(), bci=8

java/lang/Exception.(), bci=5

java/io/IOException.(), bci=5

java/io/IOException.RaiseExceptionSynthetic(), bci=9

javax/microedition/media/TestAudioPlayer.read(), bci=4

javax/microedition/media/TestAudioPlayer.test(), bci=104

RunTests.runTest(), bci=115

RunTests.startApp(), bci=166

javax/microedition/midlet/MIDletTunnelImpl.callStartApp(), bci=4

com/sun/midp/midlet/MIDletPeer.startApp(), bci=12

com/sun/midp/midlet/MIDletStateHandler.startSuite(), bci=252

com/sun/midp/main/AbstractMIDletSuiteLoader.startSuite(), bci=54

com/sun/midp/main/CldcMIDletSuiteLoader.startSuite(), bci=11

com/sun/midp/main/AbstractMIDletSuiteLoader.runMIDletSuite(), bci=146

com/sun/midp/main/MIDletSuiteLoader.main(), bci=15

There are also some socket-related exceptions in the logs, like this one:

java.io.IOException: error opening socket

java/lang/Throwable.(), bci=8

java/lang/Exception.(), bci=5

java/io/IOException.(), bci=5

java/io/IOException.RaiseExceptionSynthetic(), bci=9

com/sun/midp/io/j2me/socket/Protocol.connect(), bci=123

com/sun/midp/io/j2me/socket/Protocol.open(), bci=125

com/sun/midp/io/j2me/socket/Protocol.openPrim(), bci=7

javax/microedition/io/Connector.openPrim(), bci=325

javax/microedition/io/Connector.open(), bci=6

javax/microedition/io/Connector.open(), bci=6

javax/microedition/io/Connector.open(), bci=5

com/sun/midp/ssl/TestSSLStreamConnection.testBasicSSLStreamConnection(), bci=5

com/sun/midp/ssl/TestSSLStreamConnection.test(), bci=9

RunTests.runTest(), bci=115

RunTests.startApp(), bci=166

javax/microedition/midlet/MIDletTunnelImpl.callStartApp(), bci=4

com/sun/midp/midlet/MIDletPeer.startApp(), bci=12

com/sun/midp/midlet/MIDletStateHandler.startSuite(), bci=252

com/sun/midp/main/AbstractMIDletSuiteLoader.startSuite(), bci=54

com/sun/midp/main/CldcMIDletSuiteLoader.startSuite(), bci=11

com/sun/midp/main/AbstractMIDletSuiteLoader.runMIDletSuite(), bci=146

com/sun/midp/main/MIDletSuiteLoader.main(), bci=15

But those are expected, and they shouldn't be causing the failure, since they get caught by their tests (we haven't enabled mozTCPSocket in automation yet, so the tests are all marked todo, and these exceptions are all caught and printed, after which the test harness continues running tests.)

The only other exception in the logs is:

java/lang/RuntimeException: runtime exception

java/lang/TestThread$RuntimeExceptionThread.run(), bci=9

java/lang/TestThread$RuntimeExceptionThread.ThreadStart0Synthetic(), bci=5

Which is also an expected exception (that doesn't cause the test harness to report failure).

So it seems like TestAudioPlayer.read is causing the problems.

I made the most recent change to TestAudioPlayer, @marco-c modified it before me, and @yxl landed the original version.

@marco-c, @yxl: any ideas why the test might be failing? It's been happening for at least a few days, for example it occurred on the merge commit when I merged #680 five days ago. Unsure if it happened before then.

mykmelez commented 9 years ago

Here's a build from 11 days ago that has the same failure:

https://travis-ci.org/andreasgal/j2me.js/builds/43013004

yxl commented 9 years ago

@mykmelez There is a bug in TestAudioPlayer.read to read an InputStream twice. (The InputStream seems to be closed before the second read) That's why TestAudioPlayer.read throws IOException. But the exceptioin is caught in test and marked as todo and it shouldn't cause failure...

mykmelez commented 9 years ago

@yxl Ah, ok, thanks for the clarification! In that case, this intermittent failure is more mysterious, since it isn't associated with any of those reported exceptions.

mykmelez commented 9 years ago

The other interesting part of the log is at the end:

In /home/travis/build/andreasgal/j2me.js/tests/automation.js unit tests fail: Timeout FAIL [Exception... "" nsresult: "0x8046001e ()" location: "JS frame :: resource://slimerjs/slLauncher.jsm :: > slLauncher.openBrowser :: line 186" data: no]> type: uncaughtError> file: /home/travis/build/andreasgal/j2me.js/tests/automation.js> error: {} stack: not provided

That line is https://github.com/laurentj/slimerjs/blob/e6ff89466fe4b28ef40a3f84688ed007e48d7e8c/src/modules/slLauncher.jsm#L186, which looks like:

    return parentWindow.openDialog("chrome://slimerjs/content/webpage.xul", "_blank", features, { callback:callback});

And according to https://developer.mozilla.org/en-US/docs/Mozilla/Errors, the error is:

NS_ERROR_ILLEGAL_DURING_SHUTDOWN (0x8046001E) Many operations cannot be performed once the application is being shutdown. This error will occur in this situation.

So perhaps something is calling slLauncher.openBrowser while slimerjs is shutting down the xulrunner process.

@laurentj Would you happen to know why that could be happening?

laurentj commented 9 years ago

Difficult to say. I should investigate. Perhaps CasperJS (or your test) does something that SlimerJS doesn't like. Probably it expects SlimerJS does a "thing" synchronously (because PhantomJS does it synchronously) whereas it is asynchronous (there are already many such differences between SlimerJS and PhantomJS, that have been "fixed" in SlimerJS). Most of time, "thing" is fast enough to not generate this error. And sometime, it is too long...

The question is what is this "thing" :-) If we find it, we could patch CasperJS, your test or SlimerJS. If we could identify first the faulty test...

(PS: I just discover this project, it's awesome :-))

mykmelez commented 9 years ago

The question is what is this "thing" :-) If we find it, we could patch CasperJS, your test or SlimerJS. If we could identify first the faulty test...

This morning I discovered I could reproduce the failure pretty consistently on one of my Linux VMs, so I bisected it and found that @marco-c's commit a7143dd2e2e1c6420f3cc95417954d8c93b77d5a seems to be the culprit. That commit just adds seven (existing, but previously unused) tests to the list of gfx tests, and the failure seems to happen either during or after the last test in the list (RectAfterText).

Perhaps this is a race condition that gets exposed once we run a certain number of gfx tests (or tests in general), and adding those seven tests was enough to push us over the edge intermittently?

(PS: I just discover this project, it's awesome :-))

Thanks, @laurentj! We owe you a big debt of gratitude, as we wouldn't be nearly as far along without SlimerJS to help us test it!

mozilla / pluotsorbet

NS_ERROR_ILLEGAL_DURING_SHUTDOWN on parentWindow.openDialog in slLauncher.openBrowser #739