br1ghtyang / asterixdb

Automatically exported from code.google.com/p/asterixdb

Managix stop fails to terminate #590

Closed by GoogleCodeExporter 8 years ago

GoogleCodeExporter commented 8 years ago
Here is the offending stack trace that is blocking the vm from exiting.

java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:503)
    at edu.uci.ics.hyracks.storage.am.lsm.common.impls.BlockingIOOperationCallbackWrapper.waitForIO(BlockingIOOperationCallbackWrapper.java:35)
    - locked <0x0000000772ffe7e0> (a edu.uci.ics.hyracks.storage.am.lsm.common.impls.BlockingIOOperationCallbackWrapper)
    at edu.uci.ics.hyracks.storage.am.lsm.btree.impls.LSMBTree.deactivate(LSMBTree.java:165)
    - locked <0x000000077ad05f40> (a edu.uci.ics.hyracks.storage.am.lsm.btree.impls.LSMBTree)
    at edu.uci.ics.hyracks.storage.am.lsm.btree.impls.LSMBTree.deactivate(LSMBTree.java:187)
    - locked <0x000000077ad05f40> (a edu.uci.ics.hyracks.storage.am.lsm.btree.impls.LSMBTree)
    at edu.uci.ics.asterix.common.context.DatasetLifecycleManager.stop(DatasetLifecycleManager.java:352)
    at edu.uci.ics.hyracks.api.lifecycle.LifeCycleComponentManager.stopAll(LifeCycleComponentManager.java:115)
    - locked <0x0000000600088088> (a edu.uci.ics.hyracks.api.lifecycle.LifeCycleComponentManager)
    at edu.uci.ics.asterix.hyracks.bootstrap.NCApplicationEntryPoint.stop(NCApplicationEntryPoint.java:102)
    at edu.uci.ics.asterix.hyracks.bootstrap.NCApplicationEntryPoint$JVMShutdownHook.run(NCApplicationEntryPoint.java:196)

Original issue reported on code.google.com by cmcintyr...@gmail.com on 29 Jul 2013 at 9:07

GoogleCodeExporter commented 8 years ago
This thread is waiting for the in-memory component of one of your indexes to be 
flushed to disk.
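
The wait described above can be pictured with a small sketch. This is a hypothetical stand-in, not the Hyracks BlockingIOOperationCallbackWrapper: the class and method names are invented for illustration. It shows a caller blocking until an I/O thread signals completion, with a bounded wait so the caller can tell "still flushing" apart from "never signalled".

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not the Hyracks implementation.
public class FlushWaiter {
    private final CountDownLatch flushDone = new CountDownLatch(1);

    // Called by the I/O thread once the flush has completed.
    public void flushCompleted() {
        flushDone.countDown();
    }

    // Waits up to timeoutMillis; returns true if the flush completed in time.
    public boolean waitForFlush(long timeoutMillis) throws InterruptedException {
        return flushDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        FlushWaiter waiter = new FlushWaiter();
        // Nothing ever signals completion here, so the bounded wait gives up.
        boolean finished = waiter.waitForFlush(200);
        System.out.println("flush finished in time: " + finished);
    }
}
```

An unbounded Object.wait(), as in the stack trace, never returns if the completion signal is lost, which is why the thread can block indefinitely.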

How many indexes do you have, and how big are they? Did you perform inserts 
while the asterix instance was active? How long did you wait before killing 
the instance?

Original comment by salsuba...@gmail.com on 29 Jul 2013 at 10:11

GoogleCodeExporter commented 8 years ago

Original comment by vinay...@gmail.com on 2 Aug 2013 at 5:57

GoogleCodeExporter commented 8 years ago
The Managix stop command translates to a kill -15 (SIGTERM) on all NC JVMs. 
Each NC then runs its shutdown hook, which closes all lifecycle components 
gracefully. Once the hook finishes, the VM exits and the Managix stop command 
returns. 
The Managix stop command hangs because a blocked thread is keeping the VM from 
exiting. This is outside the scope of Managix and needs to be investigated 
from the perspective of the index management that happens as part of the NC 
shutdown hook. 
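
To illustrate why a blocked shutdown hook keeps the VM alive, and one possible way to bound it, here is a minimal sketch. It assumes a hypothetical stopWithTimeout helper and is not how NCApplicationEntryPoint is written. The JVM waits for every registered shutdown hook to return, so the sketch runs the stop action on a separate thread and gives up after a timeout.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; names are not from the AsterixDB code base.
public class BoundedShutdown {
    // Runs stopAction on a helper thread and waits at most timeoutMillis.
    // Returns true if it finished normally, false if it timed out or failed.
    public static boolean stopWithTimeout(Runnable stopAction, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-stop");
            t.setDaemon(true); // daemon, so a stuck stop cannot block normal JVM exit
            return t;
        });
        Future<?> result = executor.submit(stopAction);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            result.cancel(true); // interrupt the stuck stop action
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            // Placeholder for stopping lifecycle components.
            boolean clean = stopWithTimeout(() -> { }, 5000);
            System.err.println("clean shutdown: " + clean);
        }));
    }
}
```

Giving up on a stop action trades a guaranteed exit for the risk of leaving in-memory data unflushed, so whether a timeout is acceptable depends on the recovery guarantees of the storage layer.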

I request that this issue be reassigned.

Original comment by ram...@uci.edu on 2 Aug 2013 at 6:17

GoogleCodeExporter commented 8 years ago

Original comment by westm...@gmail.com on 2 Aug 2013 at 8:55

GoogleCodeExporter commented 8 years ago
This is very likely caused by the problem reported in issue 600. 

Original comment by salsuba...@gmail.com on 4 Aug 2013 at 4:53

GoogleCodeExporter commented 8 years ago
Is IO still happening? AKA is it truly waiting for the flush to complete or is 
it deadlocked?
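
One way to test the deadlock half of that question from inside a JVM is the standard ThreadMXBean API, sketched below. The DeadlockCheck class is hypothetical. The check only finds lock cycles; a thread parked in Object.wait() because a notification never arrived, as in the stack trace above, would not be reported. For a separate NC process, the JDK's jstack tool gives a similar report from outside.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for illustration; it inspects the JVM it runs in.
public class DeadlockCheck {
    // Returns the number of threads the JVM reports as deadlocked on locks.
    public static int countDeadlockedThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when none are found
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads();
        if (ids == null) {
            System.out.println("no lock-cycle deadlock detected");
            return;
        }
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info != null) { // a thread may have exited in the meantime
                System.out.println(info.getThreadName() + " blocked on " + info.getLockName());
            }
        }
    }
}
```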

How big is the size of the memory component? You can find that in your 
asterix-configuration.xml file:
storage.memcomponent.pagesize * storage.memcomponent.numpages

(This is actually the size of the memory allocation given to each dataset -- 
these property names need to be renamed!)

Original comment by zheilb...@gmail.com on 4 Aug 2013 at 5:33

GoogleCodeExporter commented 8 years ago
On Sattam's questions:

> How many indexes do you have?

Only primary indexes - one per table.

> and how big are they? 

How do we find out?

> did you perform inserts when the asterix instance was active? 

no

> how long have you waited before killing the instance?

tens of seconds - usually we don't have to wait for shutdown after running 
queries, so that seemed reasonable

Original comment by westm...@gmail.com on 5 Aug 2013 at 5:26

GoogleCodeExporter commented 8 years ago
On Zach's questions:

> Is IO still happening?
> AKA is it truly waiting for the flush to complete or is it deadlocked? 

Not sure, we didn't check. How would you check? iostat?

> How big is the size of the memory component? You can find that in your 
> asterix-configuration.xml file: storage.memcomponent.pagesize * 
> storage.memcomponent.numpages

32768 * 1024
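
Worked out, and assuming the page size is given in bytes (the thread does not state the unit), that product is 32 MiB per dataset. A minimal sketch of the arithmetic, with a hypothetical class name:

```java
// Hypothetical helper; assumes storage.memcomponent.pagesize is in bytes.
public class MemComponentSize {
    // Multiplies page size by page count, failing loudly on overflow.
    public static long sizeInBytes(long pageSizeBytes, long numPages) {
        return Math.multiplyExact(pageSizeBytes, numPages);
    }

    public static void main(String[] args) {
        long bytes = sizeInBytes(32768, 1024);
        long mib = bytes / (1024 * 1024);
        System.out.println(bytes + " bytes = " + mib + " MiB");
        // prints "33554432 bytes = 32 MiB"
    }
}
```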

Original comment by westm...@gmail.com on 5 Aug 2013 at 5:29

GoogleCodeExporter commented 8 years ago
Can't reproduce. This seems to only be a problem with an older version of the 
code.

Original comment by zheilb...@gmail.com on 15 Nov 2013 at 7:59