simlaudato / asterixdb

Automatically exported from code.google.com/p/asterixdb

Storage: Exception in inserting a tuple from a frame causes the insert of all tuples from the frame to be rolled back. #743

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Description:
Consider a frame with N tuples that need to be inserted into an index. An 
exception in processing the k'th tuple (one possible cause is a duplicate key) 
escalates upwards and triggers a rollback. The first k-1 tuples from the frame 
then fail to make it into the index. So either the whole frame goes in or none 
of it does. Atomicity is thus at the frame level and not at the tuple level.

Confirming that this is not by design.

Original issue reported on code.google.com by RamanGro...@gmail.com on 21 Mar 2014 at 3:24

GoogleCodeExporter commented 8 years ago
Since our transaction model provides record-level atomicity, the scenario that 
you described is *not* incorrect (i.e., it is correct). Ensuring atomicity at a 
coarser granularity than record level will always be "correct" (in this case 
that granularity is frame-level).

Original comment by zheilb...@gmail.com on 21 Mar 2014 at 5:32

GoogleCodeExporter commented 8 years ago
You could have the atomicity at the dataset level as well and that would also 
be "correct" as it still provides you protection from inconsistent record-level 
data. 

However, the behavior I have described is not "ideal" and hence needs to be 
fixed in my view. I see no reason for rejecting (reverting) the other tuples 
from the frame when a problematic tuple exists in that frame. 

re-opening this for re-consideration :)

Original comment by RamanGro...@gmail.com on 21 Mar 2014 at 5:44

GoogleCodeExporter commented 8 years ago
I think there is an easy way to do this. The idea is that when the operator 
performing the insert finds the troublesome tuple, it should record the 
exception but not throw it right away (unless it is the first tuple). It should 
then propagate only the tuples before this one. When the next frame arrives or 
Hyracks tries to close the node, it should then throw the exception.
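
A minimal sketch of that idea, in plain Java rather than against the real 
Hyracks/Asterix operator classes. Every name below (`DeferredInsertSketch`, 
`SetIndex`, `nextFrame`, `close`) is hypothetical, tuples are modeled as 
strings, and a `HashSet` stands in for an index that rejects duplicate keys:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch of deferred failure handling; all names here are hypothetical. */
class DeferredInsertSketch {

    /** Stand-in for an index that rejects duplicate keys. */
    static class SetIndex {
        private final Set<String> keys = new HashSet<>();

        void insert(String key) {
            if (!keys.add(key)) {
                throw new IllegalStateException("duplicate key: " + key);
            }
        }

        boolean contains(String key) {
            return keys.contains(key);
        }
    }

    private final SetIndex index;
    private RuntimeException pending; // failure recorded but not yet thrown

    DeferredInsertSketch(SetIndex index) {
        this.index = index;
    }

    /** Inserts a frame's tuples and returns the ones to propagate downstream. */
    List<String> nextFrame(List<String> frame) {
        throwPending(); // a failure from the previous frame surfaces here
        List<String> inserted = new ArrayList<>();
        for (String tuple : frame) {
            try {
                index.insert(tuple);
                inserted.add(tuple);
            } catch (RuntimeException e) {
                if (inserted.isEmpty()) {
                    throw e; // first tuple failed: nothing to keep, fail now
                }
                pending = e; // keep the first k-1 tuples, defer the failure
                break;
            }
        }
        return inserted;
    }

    /** Called when the operator is closed; surfaces any deferred failure. */
    void close() {
        throwPending();
    }

    private void throwPending() {
        if (pending != null) {
            RuntimeException e = pending;
            pending = null;
            throw e;
        }
    }
}
```

With a frame `["a", "b", "a", "c"]`, `nextFrame` returns `["a", "b"]`, and the 
duplicate-key exception is thrown by the following `nextFrame` or `close` call. 
The sketch leaves out how the propagated tuples get committed downstream.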

Original comment by bamou...@gmail.com on 21 Mar 2014 at 6:56

GoogleCodeExporter commented 8 years ago
Reassigning to Raman since he will implement the design we just discussed.

@Raman: Maybe you can detail here what the decision was so that we can refer 
back to it as needed?

Original comment by zheilb...@gmail.com on 21 Mar 2014 at 7:20

GoogleCodeExporter commented 8 years ago
We agreed that the right way going forward would be to successfully process the 
first k-1 tuples from a frame when the k'th tuple has caused an exception.

To do the above, the operator (AsterixLSMTreeInsert...) has to handle a 
generated exception so that the tuples preceding the exception-causing tuple 
are packed into a frame and flushed downstream. I have discussed the logic to 
do so and shared a small code snippet with you all. I will make this change on 
the Asterix front.

However, doing the above does not solve the issue completely! We have an 
existing issue in index management wherein incorrect reference counting leaves 
the system unusable. We have three other bugs logged that track this incorrect 
behavior. We decided to fix this reference counting issue because, without that 
fix, the other changes on the Asterix front would not work. Fixing it would 
also help close the other three high-priority bugs.
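
The thread does not say where the reference counting goes wrong. As a generic 
illustration only, the sketch below shows one common way to keep a count 
balanced when an operation fails: release the reference in a `finally` block. 
`RefCountSketch` and its methods are hypothetical and are not AsterixDB's 
index management code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical reference-counted registry; not AsterixDB's index manager. */
class RefCountSketch {
    private final Map<String, Integer> refCounts = new HashMap<>();

    /** Takes one reference on the named index. */
    void open(String index) {
        refCounts.merge(index, 1, Integer::sum);
    }

    /** Releases one reference on the named index. */
    void close(String index) {
        Integer count = refCounts.get(index);
        if (count == null) {
            throw new IllegalStateException("not open: " + index);
        }
        if (count == 1) {
            refCounts.remove(index);
        } else {
            refCounts.put(index, count - 1);
        }
    }

    /** Returns the current number of references, 0 if the index is not open. */
    int count(String index) {
        return refCounts.getOrDefault(index, 0);
    }

    /** Runs an operation against an index, releasing the reference even on failure. */
    void withIndex(String index, Runnable operation) {
        open(index);
        try {
            operation.run();
        } finally {
            close(index); // runs on both the normal and the exception path
        }
    }
}
```

If an insert run through `withIndex` throws (for example on a duplicate key), 
the count still returns to its previous value, so a later open or close of the 
same index starts from a consistent state.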

Re-assigning for the fix for the index management bug (incorrect reference 
counting)

Original comment by RamanGro...@gmail.com on 2 Apr 2014 at 10:01