br1ghtyang / asterixdb

Automatically exported from code.google.com/p/asterixdb

Current log manager is deadlocking and not thread-safe #584

Closed: GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
This issue is related to issue 583, but I am documenting it anyway to keep
track of the current issues:

The current log manager causes a sporadic deadlock under concurrent access.
The deadlock can be observed in the following two thread traces:

"edu.uci.ics.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:1:0:14:0:0" daemon prio=10 tid=0x0000000012370000 nid=0x4a7a waiting on condition [0x00002ad8d393e000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000064b36ef38> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
    at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
    at edu.uci.ics.asterix.common.transactions.FileBasedBuffer.acquireReadLatch(FileBasedBuffer.java:223)
    at edu.uci.ics.asterix.transaction.management.service.logging.LogManager.getLsn(LogManager.java:270)
    at edu.uci.ics.asterix.transaction.management.service.logging.LogManager.log(LogManager.java:337)
    - locked <0x000000064b493690> (a edu.uci.ics.asterix.transaction.management.service.transaction.TransactionContext)
    at edu.uci.ics.asterix.transaction.management.service.logging.IndexLogger.generateLogRecord(IndexLogger.java:122)
    at edu.uci.ics.asterix.transaction.management.opcallbacks.PrimaryIndexModificationOperationCallback.found(PrimaryIndexModificationOperationCallback.java:77)
    at edu.uci.ics.hyracks.storage.am.btree.impls.BTree.insertLeaf(BTree.java:362)
    at edu.uci.ics.hyracks.storage.am.btree.impls.BTree.upsertLeaf(BTree.java:497)
    at edu.uci.ics.hyracks.storage.am.btree.impls.BTree.performOp(BTree.java:727)
    at edu.uci.ics.hyracks.storage.am.btree.impls.BTree.performOp(BTree.java:630)
    at edu.uci.ics.hyracks.storage.am.btree.impls.BTree.performOp(BTree.java:630)
    at edu.uci.ics.hyracks.storage.am.btree.impls.BTree.insertUpdateOrDelete(BTree.java:286)
    at edu.uci.ics.hyracks.storage.am.btree.impls.BTree.upsert(BTree.java:331)
    at edu.uci.ics.hyracks.storage.am.btree.impls.BTree.access$500(BTree.java:68)
    at edu.uci.ics.hyracks.storage.am.btree.impls.BTree$BTreeAccessor.upsertIfConditionElseInsert(BTree.java:895)
    at edu.uci.ics.hyracks.storage.am.lsm.btree.impls.LSMBTree.insert(LSMBTree.java:312)
    at edu.uci.ics.hyracks.storage.am.lsm.btree.impls.LSMBTree.modify(LSMBTree.java:264)
    at edu.uci.ics.hyracks.storage.am.lsm.common.impls.LSMHarness.modify(LSMHarness.java:131)
    at edu.uci.ics.hyracks.storage.am.lsm.common.impls.LSMHarness.modify(LSMHarness.java:122)
    at edu.uci.ics.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.insert(LSMTreeIndexAccessor.java:42)
    at edu.uci.ics.asterix.common.dataflow.AsterixLSMInsertDeleteOperatorNodePushable.nextFrame(AsterixLSMInsertDeleteOperatorNodePushable.java:58)
    at edu.uci.ics.hyracks.control.nc.Task.pushFrames(Task.java:304)
    at edu.uci.ics.hyracks.control.nc.Task.run(Task.java:261)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

"Thread-9" prio=10 tid=0x00002ad8d414c000 nid=0x4a1e in Object.wait() [0x00002ad8c76ee000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:503)
    at edu.uci.ics.hyracks.storage.am.lsm.common.impls.AbstractMutableLSMComponent.threadEnter(AbstractMutableLSMComponent.java:72)
    - locked <0x000000064b36bc28> (a edu.uci.ics.hyracks.storage.am.lsm.btree.impls.LSMBTreeMutableComponent)
    at edu.uci.ics.hyracks.storage.am.lsm.common.impls.LSMHarness.getAndEnterComponents(LSMHarness.java:70)
    at edu.uci.ics.hyracks.storage.am.lsm.common.impls.LSMHarness.scheduleFlush(LSMHarness.java:172)
    at edu.uci.ics.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.scheduleFlush(LSMTreeIndexAccessor.java:115)
    at edu.uci.ics.asterix.common.context.PrimaryIndexOperationTracker.flushIfFull(PrimaryIndexOperationTracker.java:92)
    at edu.uci.ics.asterix.common.context.PrimaryIndexOperationTracker.completeOperation(PrimaryIndexOperationTracker.java:81)
    at edu.uci.ics.asterix.transaction.management.service.transaction.TransactionContext.decreaseActiveTransactionCountOnIndexes(TransactionContext.java:110)
    - locked <0x000000064b4938a0> (a java.util.HashSet)
    at edu.uci.ics.asterix.transaction.management.service.logging.LogManager.decrementActiveTxnCountOnIndexes(LogManager.java:790)
    at edu.uci.ics.asterix.transaction.management.service.logging.LogPageFlushThread.run(LogManager.java:1014)
    - locked <0x000000064b36eda0> (a edu.uci.ics.asterix.common.transactions.FileBasedBuffer)
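
One plausible reading of the two traces, which the dumps alone do not confirm, is that the log page flush thread calls into the index layer (decrementActiveTxnCountOnIndexes, then scheduleFlush) while it still holds the FileBasedBuffer monitor, and blocks there, while the writer it is waiting on needs that same buffer. A common way to break this kind of cycle is an "open call": do the bookkeeping under the lock, release it, and only then run code that may block. The Java sketch below illustrates that pattern only. LogPage, onFlushed, and flushAndNotify are hypothetical names, not the AsterixDB classes, and this is not necessarily what the fixing revision does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a log page; NOT the real FileBasedBuffer.
class LogPage {
    private final List<Runnable> pendingCallbacks = new ArrayList<>();
    private int flushCount = 0;

    // Writers register work to run once the page has been flushed.
    synchronized void onFlushed(Runnable callback) {
        pendingCallbacks.add(callback);
    }

    // Flush under the page monitor, but run the callbacks only after the
    // monitor is released, so a callback that blocks cannot stall threads
    // that need this page.
    void flushAndNotify() {
        List<Runnable> toRun;
        synchronized (this) {
            flushCount++; // placeholder for the real page I/O
            toRun = new ArrayList<>(pendingCallbacks);
            pendingCallbacks.clear();
        }
        for (Runnable callback : toRun) {
            callback.run(); // open call: the page monitor is not held here
        }
    }

    synchronized int flushCount() {
        return flushCount;
    }

    public static void main(String[] args) {
        LogPage page = new LogPage();
        page.onFlushed(() -> System.out.println(
                "monitor held in callback: " + Thread.holdsLock(page)));
        page.flushAndNotify(); // prints "monitor held in callback: false"
    }
}
```

The trade-off is that callbacks may run slightly after other threads have already seen the flushed page, so this only fits if the callbacks do not rely on the page staying locked.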

Moreover, the List<HashMap<ITransactionContext, Integer>> activeTxnCountMaps in 
the log manager is not protected while being modified by concurrent threads.
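
As a rough illustration of how such a structure could be made safe for concurrent updates, the sketch below keeps one ConcurrentHashMap per log page and uses atomic merge/computeIfPresent calls for the counters. The class ActiveTxnCounts, its methods, and the generic key type T (standing in for ITransactionContext) are assumptions made for this example; the real LogManager may need different semantics.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical replacement for an unsynchronized List<HashMap<T, Integer>>.
class ActiveTxnCounts<T> {
    // One map per log page; the list itself never changes after construction.
    private final List<ConcurrentHashMap<T, Integer>> maps;

    ActiveTxnCounts(int numPages) {
        List<ConcurrentHashMap<T, Integer>> tmp = new ArrayList<>();
        for (int i = 0; i < numPages; i++) {
            tmp.add(new ConcurrentHashMap<>());
        }
        maps = Collections.unmodifiableList(tmp);
    }

    // Atomically add one to the count for this transaction on this page.
    void increment(int page, T txn) {
        maps.get(page).merge(txn, 1, Integer::sum);
    }

    // Atomically subtract one; the entry is removed when it reaches zero.
    void decrement(int page, T txn) {
        maps.get(page).computeIfPresent(txn, (k, v) -> v > 1 ? v - 1 : null);
    }

    int get(int page, T txn) {
        return maps.get(page).getOrDefault(txn, 0);
    }

    public static void main(String[] args) throws InterruptedException {
        ActiveTxnCounts<String> counts = new ActiveTxnCounts<>(2);
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    counts.increment(0, "txn-1");
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        System.out.println(counts.get(0, "txn-1")); // prints 4000
    }
}
```

Per-key atomic updates are enough when each counter is independent. If several counters must change together, a single lock around the whole update would be needed instead.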

Original issue reported on code.google.com by salsuba...@gmail.com on 27 Jul 2013 at 8:08

GoogleCodeExporter commented 8 years ago
Issue 585 has been merged into this issue.

Original comment by salsuba...@gmail.com on 29 Jul 2013 at 11:21

GoogleCodeExporter commented 8 years ago
Issue 585 has been merged into this issue.

Original comment by RamanGro...@gmail.com on 29 Jul 2013 at 11:58

GoogleCodeExporter commented 8 years ago
Issue 585 has been merged into this issue.

Original comment by salsuba...@gmail.com on 30 Jul 2013 at 3:16

GoogleCodeExporter commented 8 years ago
Issue 585 has been merged into this issue.

Original comment by RamanGro...@gmail.com on 30 Jul 2013 at 3:36

GoogleCodeExporter commented 8 years ago
Issue 585 has been merged into this issue.

Original comment by salsuba...@gmail.com on 30 Jul 2013 at 4:02

GoogleCodeExporter commented 8 years ago
Issue 585 has been merged into this issue.

Original comment by RamanGro...@gmail.com on 30 Jul 2013 at 4:12

GoogleCodeExporter commented 8 years ago
Issue 585 has been merged into this issue.

Original comment by salsuba...@gmail.com on 30 Jul 2013 at 4:30

GoogleCodeExporter commented 8 years ago
Fixed in the following revision:
https://code.google.com/p/asterixdb/source/detail?r=46f0815add2bca05a72579271b07757616d94071

Original comment by kiss...@gmail.com on 23 Aug 2013 at 10:18