br1ghtyang / asterixdb

Automatically exported from code.google.com/p/asterixdb

Failure on loading data into asterix (sporadic) #549

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. deploy asterix using managix on a cluster
2. load data through the web UI

What is the expected output? What do you see instead?
The data should be loaded successfully; instead, I got the following error:

pin called on a fileId -1 that has not been created. [HyracksDataException]

Please use labels and text to provide additional information.
This error does not always occur; I have successfully loaded the dataset with the same procedure before. The log files are attached.

Since the exception seems related to the index bulk load, I am assigning this to Sattam. Sattam, please feel free to reassign it to the right person.

Original issue reported on code.google.com by jarod...@gmail.com on 3 Jul 2013 at 2:57

Attachments:

GoogleCodeExporter commented 8 years ago
Clearly we have an issue lurking somewhere in loading-land...  I wonder if 
looking for a pin in a stacktrace will be as hard as finding a needle in a 
haystack?  :-)

Original comment by dtab...@gmail.com on 3 Jul 2013 at 10:52

GoogleCodeExporter commented 8 years ago
Now I can see this problem consistently during loading. I am not sure whether it is a bug or not, since it shows up on a version of hyracks I hand-crafted: I deliberately introduced a bug into the group-by operator (call it the "group-by-bug") in order to reproduce an asterix failure. I assumed this "group-by-bug" would affect only group-by processing and not loading, but in my experiments, once the "group-by-bug" is in place, loading fails with the posted exception, and the failure goes away when I remove the "group-by-bug".

Specifically, to reproduce this issue, I used the jarodwen/features/hyracks_agg_bench branch and introduced the "group-by-bug" by changing line 230 of HybridHashGroupHashTable.java from:

        return (int) Math.ceil((double)tableSize / headerEntryPerFrame);

into

        return (int) Math.ceil(tableSize / headerEntryPerFrame);

Basically this is an integer-division bug: truncating the division yields an incorrect hash table header page count, which later causes an ArrayIndexOutOfBoundsException when the hybrid-hash group-by algorithm runs.

Then I deployed asterix (by replacing the asterix-server under the managix home) onto a 5-node cluster (1 CC, 4 NCs) and loaded an 11GB x 4 dataset (one 11GB file per NC). The dataset contains just two fields:

create type UserVisitType as closed {
    ip: string,
    revenue: double
}

During the loading, the posted error shows up.

So, given that this error happens only on this hand-crafted buggy hyracks version, I would like to archive it, as it is not blocking anything. @sattam I will downgrade this to low priority so that it stays here for our reference, and later we can probably mark it as invalid if no further information or observations are posted.

Original comment by jarod...@gmail.com on 4 Jul 2013 at 10:00

GoogleCodeExporter commented 8 years ago
Here is some sample data. The first field is an IPv6 string, and the second field is a random double in [1, 1000):

0000:0000::2001|921.48
0001:0000::2001|675.32
0002:0000::2001|865.06
0003:0000::2001|780.79
0004:0000::2001|975.74
0005:0000::2001|12.11
0006:0000::2001|327.11
0007:0000::2001|389.67
0008:0000::2001|408.49
0009:0000::2001|513.29
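
For illustration, a generator for this format might look roughly like the sketch below (hypothetical code, not the actual generator used here); note that replaying the same key sequence on every node puts identical IPs into all four files:

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Random;

// Hypothetical sketch of a generator for the pipe-delimited format above;
// not the actual benchmark tool.
public class UserVisitGen {
    public static void main(String[] args) throws IOException {
        long numRecords = Long.parseLong(args[0]); // e.g. 500000000
        Random rnd = new Random();
        try (BufferedWriter out = new BufferedWriter(new FileWriter(args[1]))) {
            for (long i = 0; i < numRecords; i++) {
                // Low 16 bits of the counter go into the first group, the next 16
                // into the second, matching "0000:0000::2001", "0001:0000::2001", ...
                String ip = String.format("%04x:%04x::2001", i & 0xffff, (i >>> 16) & 0xffff);
                double revenue = 1.0 + rnd.nextDouble() * 999.0; // random double in [1, 1000)
                out.write(String.format("%s|%.2f%n", ip, revenue));
            }
        }
    }
}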

Each file contains 500000000 unique records (all IP addresses within a file are distinct). The load statement I am using looks like this:

drop dataverse AggBench if exists;
create dataverse AggBench;
use dataverse AggBench;

create type UserVisitType as closed {
    ip: string,
    revenue: double
}

create dataset UserVisit(UserVisitType)
primary key ip;

load dataset UserVisit using localfs
(("path"="dblab-rack11:///home/jwen003/castleKeep/ukoback/data/us_500000000.dat,dblab-rack12:///home/jwen003/castleKeep/ukoback/data/us_500000000.dat,dblab-rack14:///home/jwen003/castleKeep/ukoback/data/us_500000000.dat,dblab-rack15:///home/jwen003/castleKeep/ukoback/data/us_500000000.dat"),
("format"="delimited-text"), ("delimiter"="|"));

Original comment by jarod...@gmail.com on 5 Jul 2013 at 1:46

GoogleCodeExporter commented 8 years ago
It seems you are indeed loading incorrect data that contains duplicates: every IP address appears 4 times, once in each of the 4 files you generated.

A duplicate key exception is thrown by the bulkloader, but it is not handled properly, which is probably related to issue 106 in Hyracks:
https://code.google.com/p/hyracks/issues/detail?id=106

Anyway, I have identified the issue and fixed it in my local branch; the fix should be merged to master once it has been reviewed.
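
For context, a primary-index bulk load receives records in key order, so duplicates arrive as consecutive equal keys; conceptually the check looks like the sketch below (illustration only, not the actual Hyracks bulkloader code):

import java.util.Iterator;

// Illustration only: conceptual duplicate-key check during a sorted bulk load,
// not the actual Hyracks/AsterixDB bulkloader.
public class SortedBulkLoadSketch {
    static <K extends Comparable<K>> void bulkLoad(Iterator<K> sortedKeys) {
        K prev = null;
        while (sortedKeys.hasNext()) {
            K key = sortedKeys.next();
            if (prev != null && prev.compareTo(key) == 0) {
                // This is the error that must be surfaced cleanly; if it is
                // swallowed, the load can later fail with unrelated symptoms such
                // as "pin called on a fileId -1 that has not been created".
                throw new IllegalStateException("Duplicate primary key: " + key);
            }
            prev = key;
            // ... append the record to the index being built ...
        }
    }
}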

Original comment by salsuba...@gmail.com on 5 Jul 2013 at 6:25

GoogleCodeExporter commented 8 years ago

Original comment by salsuba...@gmail.com on 8 Jul 2013 at 9:26