br1ghtyang / asterixdb

Automatically exported from code.google.com/p/asterixdb

Insertion into a dataset hangs #585

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
Reported by Raman:

Using a single feed ingestion node with one NC and one IO device, the system 
hangs after a few minutes. 

Original issue reported on code.google.com by salsuba...@gmail.com on 27 Jul 2013 at 8:09

GoogleCodeExporter commented 8 years ago
Raman,

Can you provide a way to reproduce this issue?
I cannot reproduce it in my local branch. I have also tried your branch and 
cannot reproduce it there either.

Original comment by salsuba...@gmail.com on 28 Jul 2013 at 4:26

GoogleCodeExporter commented 8 years ago
The following are some ways to reproduce the issue:

1) On the I-Sensorium 10-node cluster
  Start the feed ingestion pipeline with ingestion-cardinality set to 10, which places one ingestion node per machine. You will soon observe starvation: a single ingestion node is able to insert while the others make no progress. After waiting a while (~2 mins), the lone active ingestion node also stalls, dropping the overall ingestion throughput to zero, where it stays for a prolonged, indefinite duration. 

2) On a single-node pseudo-distributed cluster
You may repeat the experiment on a single machine with two NCs and 
ingestion-cardinality set to 2. You will observe starvation similar to (1) 
and an eventual drop of the ingestion throughput to zero. 

I have tried both eager mode (push as much as you can) and controlled mode 
(push at a specified rate, in TPS).
In controlled mode, the behavior is not observed at low TPS values (~150) but 
shows up at reasonable values of ~5k. Maybe in the former case I simply did 
not wait long enough for the total throughput to drop to zero. 
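
For concreteness, here is a minimal sketch of what "controlled mode" means here: push records at a fixed target rate (TPS) instead of as fast as possible. The pushRecord() sink and the record payload are hypothetical placeholders for illustration, not the actual feed adapter API.

import java.util.concurrent.TimeUnit;

// Minimal sketch of "controlled mode": push at a fixed records-per-second rate.
// pushRecord() is a hypothetical stand-in for the ingestion pipeline's sink.
public class ControlledModePusher {
    public static void main(String[] args) throws InterruptedException {
        final int targetTps = 5_000;                    // ~5k TPS is where the hang shows up
        final long nanosPerRecord = TimeUnit.SECONDS.toNanos(1) / targetTps;
        long nextDeadline = System.nanoTime();

        while (true) {
            pushRecord("{\"id\": " + System.nanoTime() + "}");
            nextDeadline += nanosPerRecord;
            long sleepNanos = nextDeadline - System.nanoTime();
            if (sleepNanos > 0) {
                TimeUnit.NANOSECONDS.sleep(sleepNanos); // throttle to the target TPS
            }
        }
    }

    private static void pushRecord(String record) {
        // Placeholder: hand the record to the feed adapter / socket here.
    }
}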

Original comment by ram...@uci.edu on 29 Jul 2013 at 10:49

GoogleCodeExporter commented 8 years ago
I see. I thought it was a single-ingestion-node experiment.
I think this is the same deadlock issue that I observed and reported in issue 584, 
so I will mark this issue as a duplicate.

Original comment by salsuba...@gmail.com on 29 Jul 2013 at 11:21

GoogleCodeExporter commented 8 years ago
The number of ingestion nodes should not matter here, as the insert is 
sequential, right?  

Original comment by RamanGro...@gmail.com on 29 Jul 2013 at 11:58

GoogleCodeExporter commented 8 years ago
How would it be sequential? There will be multiple insert operator 
instances since there are multiple partitions, right?

Original comment by salsuba...@gmail.com on 30 Jul 2013 at 3:16

GoogleCodeExporter commented 8 years ago
We are not considering a distributed deadlock.
The deadlock exists within a single VM, where there would be a single insert 
operator instance receiving frames and interacting with a single log manager. I 
have *not* set up multiple IO devices on a single machine. 
There is one partition per IO device. Since the number of IO devices per 
machine is 1, there is a single insert operator on each machine/JVM, so 
processing would be sequential. 

I am not sure how there would be a deadlock in the above case. 

Original comment by RamanGro...@gmail.com on 30 Jul 2013 at 3:36

GoogleCodeExporter commented 8 years ago
Doesn't the fire hose adapter work like any other insert pipeline? i.e., each 
record is hash-partitioned and routed to the right partition? 

It is enough to have one deadlock in one of the NCs, between the log 
manager in that NC and one of the inserted records. Once that deadlock occurs, 
all other ingestion nodes will quickly be trapped in that partition too, as 
soon as their corresponding records are routed there, since they cannot use 
the log manager either. This stalls the throughput and leaves the system in a 
hung state.

Original comment by salsuba...@gmail.com on 30 Jul 2013 at 4:02

GoogleCodeExporter commented 8 years ago
I am not following the reason for a deadlock at a single NC (I agree that if a 
deadlock does happen, it will stall the whole ingestion activity). 
Should I wait for YS to add more description to the original issue that 
explains the reason for the deadlock? 
Also, issue 584 describes the deadlock as "sporadic", but this issue is 
deterministic. Are we jumping the gun in concluding that a sporadic deadlock is 
causing a deterministically reproducible issue?
Original comment by RamanGro...@gmail.com on 30 Jul 2013 at 4:12

GoogleCodeExporter commented 8 years ago
I know how and why the deadlock is happening, and I can explain it to you.

One inserter is talking to the log manager to append a log record before it can 
insert into the B-tree. In order to get an LSN, it is trying to acquire a read 
latch, competing with the flusher thread.

On the other hand, the flusher thread of the log manager has already acquired the 
write latch to flush the log records to disk. At the end of this write-latched 
block (after the log pages have made it to disk), we decrement the active 
transaction counter for each commit record that made it to disk. It turns out 
that the memory component is already full, so this tries to schedule a 
flush operation. However, the flush cannot proceed because the write-reference 
count on the mutable component is not zero, causing the flusher thread to halt 
while still holding the write latch.
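
For illustration, here is a minimal, self-contained sketch of that latch interaction; the class and member names are made up for the example and are not AsterixDB's actual log manager code. Running it hangs the same way the cluster does.

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch only: names are hypothetical, not AsterixDB's actual classes.
public class LogLatchDeadlockSketch {
    static final ReentrantReadWriteLock logLatch = new ReentrantReadWriteLock();
    // Nonzero while an inserter still holds a write reference on the mutable component.
    static final AtomicInteger mutableComponentWriteRefs = new AtomicInteger(1);

    public static void main(String[] args) throws InterruptedException {
        Thread flusher = new Thread(() -> {
            logLatch.writeLock().lock();           // flush log pages under the write latch
            try {
                // ... log pages written to disk, commit records made it to disk ...
                // The active-transaction count is decremented, the full memory
                // component needs a flush -- but the flush must wait for the
                // write-reference count to reach zero.
                while (mutableComponentWriteRefs.get() != 0) {
                    Thread.onSpinWait();           // waits forever while holding the write latch
                }
            } finally {
                logLatch.writeLock().unlock();
            }
        }, "log-flusher");

        Thread inserter = new Thread(() -> {
            logLatch.readLock().lock();            // blocks: the flusher holds the write latch
            try {
                // append log record, obtain LSN, insert into the B-tree ...
            } finally {
                logLatch.readLock().unlock();
            }
            mutableComponentWriteRefs.decrementAndGet();  // never reached -> deadlock
        }, "inserter");

        flusher.start();
        TimeUnit.MILLISECONDS.sleep(100);          // let the flusher grab the write latch first
        inserter.start();
        flusher.join();                            // hangs, mirroring the zero-throughput stall
    }
}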

Note that there are two bugs in the above scenario contributing to the 
deadlock. One of them has been fixed in my local branch (issue 587). The other 
is that the decrement of the active transaction count should happen outside the 
latched block in the flusher thread; this should be fixed soon, when YS finishes 
the re-implementation of the log manager.
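
A rough sketch of that second fix, using the same made-up names as the sketch above (not the actual log manager re-implementation): keep only the log-page I/O inside the write-latched block, and do the per-commit bookkeeping, which may schedule a component flush, only after the latch is released.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hedged sketch of the fix direction only; names are hypothetical.
public class FlusherFixSketch {
    private final ReentrantReadWriteLock logLatch = new ReentrantReadWriteLock();
    private final AtomicInteger activeTxnCount = new AtomicInteger();

    void flushLogPages(List<String> pendingCommitRecords) {
        List<String> flushedCommits;
        logLatch.writeLock().lock();
        try {
            // ... write the filled log pages to disk ...
            flushedCommits = new ArrayList<>(pendingCommitRecords);
        } finally {
            logLatch.writeLock().unlock();         // release before any blocking bookkeeping
        }
        // Outside the latch: decrementing the active transaction count may schedule
        // a flush of the full memory component without stalling log appends.
        for (String ignoredCommit : flushedCommits) {
            activeTxnCount.decrementAndGet();
        }
    }
}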

Now, to be 100% sure that what you are seeing is indeed this issue, can you 
provide the stack traces of the NC processes? 

Original comment by salsuba...@gmail.com on 30 Jul 2013 at 4:30

GoogleCodeExporter commented 8 years ago
Reopening this issue until it is confirmed to be a duplicate of issue 584, 
since Raman observes it happening deterministically, while issue 584 seems 
to be sporadic on my end. The verification can be done by looking at Raman's 
stack trace.

Original comment by salsuba...@gmail.com on 31 Jul 2013 at 1:19

GoogleCodeExporter commented 8 years ago
Please find attached the NC1/2 dumps, produced at a point where the system 
appeared to be in a hung state with zero observed throughput.

Original comment by RamanGro...@gmail.com on 1 Aug 2013 at 5:06

Attachments:

GoogleCodeExporter commented 8 years ago

Original comment by salsuba...@gmail.com on 4 Oct 2013 at 10:23