lwhay / hyracks

Automatically exported from code.google.com/p/hyracks
Apache License 2.0

BufferCache$CleanerThread still running after NodeController.shutdown() #128

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Start an in-memory CC and NC that use a BTree (I'm running pregelix jobs)
2. Shut down the CC and NC using ClusterControllerService.stop() and 
NodeControllerService.stop()

What is the expected output? What do you see instead?

Once I've `.stop()`ed the CC and NC and my main thread is dead, I expect the 
program to exit.  Instead I have to shut it down manually.

What version of the product are you using? On what operating system?

We're developing genomix against a master branch last merged at commit 
31a29b74b2afd45d6e0f0910d55818252260bed0

Please provide any additional information below.

After all the `stop()` calls return and the main thread exits, I see many 
daemon threads that are still active and **one thread** that is not a daemon.  
If this thread were daemonized, then the whole process would have stopped and 
exited successfully.
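
For anyone trying to reproduce this, the check described above can also be done programmatically. The following is a minimal sketch using only standard JDK calls; the class name `NonDaemonThreadReport` is made up for illustration and is not part of Hyracks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hyracks: lists threads that keep the JVM alive.
final class NonDaemonThreadReport {

    /** Returns the live non-daemon threads other than the calling thread. */
    static List<Thread> liveNonDaemonThreads() {
        List<Thread> result = new ArrayList<>();
        // getAllStackTraces() returns a snapshot keyed by every live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon() && t != Thread.currentThread()) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Call this at the end of the real main(), after all stop() calls.
        for (Thread t : liveNonDaemonThreads()) {
            System.err.println("non-daemon thread still alive: " + t.getName());
            for (StackTraceElement frame : t.getStackTrace()) {
                System.err.println("    at " + frame);
            }
        }
    }
}
```

Any thread this reports after shutdown is one that will prevent the process from exiting on its own.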

Here is that non-daemon thread's stack trace:

Thread [Thread-120] (Suspended)
waiting for: BufferCache$CleanerThread (id=25433)
Object.wait(long) line: not available [native method] [local variables unavailable]
PreDelayPageCleanerPolicy.notifyCleanCycleStart(Object) line: 31
BufferCache$CleanerThread.run() line: 570
ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) line: 1145
ThreadPoolExecutor$Worker.run() line: 615 [local variables unavailable]
Thread.run() line: 722 [local variables unavailable]

As you can see, the non-daemon thread is not a dedicated 
BufferCache$CleanerThread thread but a ThreadPoolExecutor worker thread that 
is running CleanerThread.run().
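
One possible way to daemonize executor workers is to hand the executor a custom `ThreadFactory`. This is only a sketch under assumptions: `DaemonThreadFactory` and the `"cleaner"` prefix are hypothetical names, and nothing here reflects how Hyracks actually builds its executor.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory, not Hyracks code: every worker it creates is a daemon.
final class DaemonThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger();

    DaemonThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-" + counter.incrementAndGet());
        t.setDaemon(true); // must be set before the thread is started
        return t;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor =
                Executors.newCachedThreadPool(new DaemonThreadFactory("cleaner"));
        // Run one task on a worker and wait for it to finish.
        executor.submit(() ->
                System.out.println("daemon? " + Thread.currentThread().isDaemon())).get();
        // No shutdown() call here: the JVM can still exit because the worker is a daemon.
    }
}
```

Daemon threads are simply abandoned when the JVM exits, so this hides the symptom rather than stopping the cleaner; a task that has work to finish on shutdown still needs an explicit stop.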

The object that the PreDelayPageCleanerPolicy is waiting on is the one passed 
in from CleanerThread.run() (the trace above reports it as the 
BufferCache$CleanerThread), so perhaps that object needs to be notified on 
shutdown?  AFAICT, the cleaner is stuck on this line:

pageCleanerPolicy.notifyCleanCycleStart(this);

where that function waits on the object it is passed.  Seems like a thread 
shouldn't wait on an object that belongs to itself; the notifyCleanCycleStart 
should be in a different thread than the BufferCache, right?
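
To illustrate the kind of wait described above, and one way such a wait could be released at shutdown, here is a minimal sketch. It is not the Hyracks BufferCache or PreDelayPageCleanerPolicy code: `StoppableCleaner`, its fields, and the 1000 ms delay are all assumptions made for the example.

```java
// Hypothetical stand-in for a cleaner loop, not Hyracks code.
final class StoppableCleaner implements Runnable {
    private final Object monitor = new Object();
    private boolean shutdown = false; // guarded by monitor
    private int cycles = 0;           // guarded by monitor

    @Override
    public void run() {
        synchronized (monitor) {
            while (!shutdown) {
                try {
                    // Pre-clean delay; returns early if close() calls notifyAll().
                    monitor.wait(1000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (!shutdown) {
                    cycles++; // stand-in for one clean cycle
                }
            }
        }
    }

    /** Sets the shutdown flag and wakes the waiting cleaner so run() can return. */
    void close() {
        synchronized (monitor) {
            shutdown = true;
            monitor.notifyAll();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        StoppableCleaner cleaner = new StoppableCleaner();
        Thread worker = new Thread(cleaner, "cleaner"); // non-daemon on purpose
        worker.start();
        cleaner.close();
        worker.join(5000);
        System.out.println("cleaner alive after close: " + worker.isAlive());
    }
}
```

The point of the flag plus `notifyAll()` is that the loop re-checks its exit condition every time it wakes, so the thread ends on its own and does not need to be a daemon.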

Anyway, just my musings.

Original issue reported on code.google.com by jake.biesinger@gmail.com on 6 Nov 2013 at 11:32

GoogleCodeExporter commented 9 years ago

Original comment by jake.biesinger@gmail.com on 6 Nov 2013 at 11:33

GoogleCodeExporter commented 9 years ago
I should mention that calling `System.exit()` shuts everything down fine.

Original comment by jake.biesinger@gmail.com on 6 Nov 2013 at 11:46

GoogleCodeExporter commented 9 years ago
I found that NC.ncAppEntryPoint wasn't being stopped when the NC is stopped.  
I have pushed a fix in 5204349182e8, but there's a new problem: buffer cache 
pages are now being closed while still pinned...

Original comment by jake.biesinger@gmail.com on 7 Nov 2013 at 2:15