plasma-umass / doppio

Breaks the browser language barrier (includes a plugin-free JVM).
http://plasma-umass.github.io/doppio-demo
MIT License

"A non running thread has an expired quantum" #458

Open hrj opened 8 years ago

hrj commented 8 years ago

When running doppio in the browser (fast-dev version), we often get this message on the console:

Error: Uncaught Error: Assertion failed: A non-running thread has an expired quantum?

This assertion is in threadpool.ts

I am not sure what the assertion means, but surprisingly, the Java code seems to execute unaffected (except for the error in the console).

What is the assertion checking for, and could it mean that we are doing something wrong in our Java or native code?

jvilk commented 8 years ago

When you launch a Java thread (whether from Java or from native code), you provide it with a method to run. When that method completes, the thread automatically shuts down and exits. We emulate that process in DoppioJVM.

You are attempting to run another method on a thread that has shut down, violating a core thread invariant. This can cause bugs in programs that react to thread shutdown.

You have a few options to fix this:

jvilk commented 8 years ago

Actually, I was wrong; I'm referring to a different error.

Your code is running a task on a thread that is not in the RUNNING state, so you are accidentally running multiple JVM threads concurrently. This can cause a whole host of problems, such as deadlocks in native code that expects only one thread to be running at once. Internally, Doppio threads are cooperative: the scheduler relies on each thread telling the thread pool when it is done running, and on each thread obeying the thread pool when it says the thread can or cannot run.
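
To make that invariant concrete, here is a minimal sketch of a cooperative pool. It is a toy model, not DoppioJVM's actual threadpool.ts: the names `ToyState`, `ToyThread`, `ToyPool`, `scheduleNext`, and `quantumOver` are all hypothetical. It assumes the pool marks at most one thread RUNNING at a time and that a thread reports back when its time slice (quantum) is used up.

```typescript
// Toy cooperative scheduler. Every name here is hypothetical and only
// illustrates the invariant; it is not DoppioJVM's real API.
enum ToyState { RUNNABLE, RUNNING, TERMINATED }

class ToyThread {
  state: ToyState = ToyState.RUNNABLE;
  constructor(public name: string) {}
}

class ToyPool {
  private running: ToyThread | null = null;
  private threads: ToyThread[] = [];

  add(t: ToyThread): void {
    this.threads.push(t);
  }

  // The pool, not the thread, decides who runs next.
  // At most one thread is RUNNING at any time.
  scheduleNext(): ToyThread | null {
    if (this.running !== null) {
      return this.running;
    }
    for (const t of this.threads) {
      if (t.state === ToyState.RUNNABLE) {
        t.state = ToyState.RUNNING;
        this.running = t;
        return t;
      }
    }
    return null;
  }

  // A thread calls this when its quantum has expired.
  // Only the thread the pool marked RUNNING may legitimately do so.
  quantumOver(t: ToyThread): void {
    if (t !== this.running || t.state !== ToyState.RUNNING) {
      throw new Error(
        "Assertion failed: A non-running thread has an expired quantum?"
      );
    }
    t.state = ToyState.RUNNABLE;
    this.running = null;
    // Move the thread to the back of the queue so others get a turn.
    this.threads = this.threads.filter(x => x !== t).concat([t]);
  }
}
```

In this model the assertion fires whenever code keeps executing on a thread that the pool no longer considers RUNNING, for example a thread that already gave up its quantum and then reports an expired quantum a second time.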

jvilk commented 8 years ago

Are you calling thread.run directly? If so, you should follow the JSDoc for that method!

jimfb commented 8 years ago

I don't think we're calling thread.run directly or anything like that. I think this error started appearing when we configured doppio to be much more responsive (we are using https://github.com/plasma-umass/doppio/pull/407)

hrj commented 8 years ago

Yeah, we aren't calling thread.run directly. Maybe we are triggering it indirectly, say by calling thread.asyncReturn() more than once from a native method? Though, wouldn't such a mistake get caught by a more direct assertion, at some point where the thread's state changes?
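
One way to test the double-return theory is to wrap the completion callback so that a second call fails at the call site rather than later in the scheduler. This is only a debugging sketch: `guardOnce` is a hypothetical helper, not part of Doppio, and it assumes the native method finishes by invoking a single-argument callback.

```typescript
// Hypothetical debugging helper, not a Doppio API: wraps a completion
// callback so that calling it a second time throws immediately.
function guardOnce<T>(
  label: string,
  fn: (value: T) => void
): (value: T) => void {
  let called = false;
  return (value: T): void => {
    if (called) {
      throw new Error(label + " was called more than once");
    }
    called = true;
    fn(value);
  };
}

// Usage: wrap whatever callback the native method uses to return.
const returned: number[] = [];
const finish = guardOnce<number>("asyncReturn", (v: number): void => {
  returned.push(v);
});
finish(1); // first call goes through
```

A second `finish(...)` call would then throw with a stack trace pointing at the offending native method.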

About responsiveness, we have set it to a value of 10ms from the default of 1000ms. So the scheduling happens about 100x more often. Could it be that we are exposing a latent bug in the scheduler, simply by running it more often?
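
The 100x figure can be checked with a small simulation. This is a simplified model, not Doppio's scheduler: `countYields` is a hypothetical function that assumes a thread does work in fixed 1 ms steps and yields whenever at least one quantum has elapsed since its slice began.

```typescript
// Simplified model (hypothetical, not Doppio code): count how often a
// thread would yield over totalMs if it checks the clock every stepMs
// and gives up control once quantumMs has elapsed in the current slice.
function countYields(
  totalMs: number,
  stepMs: number,
  quantumMs: number
): number {
  let yields = 0;
  let sliceStart = 0;
  for (let now = stepMs; now <= totalMs; now += stepMs) {
    if (now - sliceStart >= quantumMs) {
      yields++;
      sliceStart = now; // a new slice begins after yielding
    }
  }
  return yields;
}

const defaultYields = countYields(10000, 1, 1000); // quantum of 1000 ms
const tunedYields = countYields(10000, 1, 10);     // quantum of 10 ms
console.log(tunedYields / defaultYields);          // ratio of yield counts
```

Under these assumptions, every hand-off between threads is exercised far more frequently with the smaller quantum, so a rare ordering problem in the hand-off path would surface correspondingly more often.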

Out of curiosity, I changed client/test_runner.ts to set responsiveness: 10, and the unit tests sometimes failed with:

Failed classes/test/Inheritance: Uncaught error. Aborting further tests.
        TypeError: Cannot read property 'run' of undefined

TypeError: Cannot read property 'run' of undefined
    at Immediate._onImmediate (/home/ubuntu/dopp/doppio/src/threadpool.ts:83:9)
    at processImmediate [as _immediateCallback] (timers.js:383:17)