namhnguyen / asterixdb

Automatically exported from code.google.com/p/asterixdb

Incorrect number reading in LinearizeComparators such as HilbertDouble/ZCurveDouble/ZCurveIntComparator #802

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
All tuples in a frame in AsterixDB conform to the following tuple format.

     | start offsets of each field | field 1 | field 2 | .... | field n |

     Each field consists of 1) type tag and 2) actual value.

The type tag is one byte long, and the size of its value varies according to 
the type tag.  
(Please let me know if this information is wrong.)

Currently, the compare functions in 
HilbertDouble/ZCurveDouble/ZCurveIntComparator read number values from an 
incorrect offset. More precisely, they start reading at the type tag byte 
instead of at the value that follows it, so they read wrong values.
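To make the offset problem concrete, here is a minimal Java sketch. It is not the actual comparator code: the class name, the tag value 12, and the big-endian layout are assumptions chosen only for illustration. It serializes a field as a one-byte tag followed by an 8-byte double, then reads it once from the field start (the behavior described above) and once from one byte later.

```java
import java.nio.ByteBuffer;

// Illustration only: not the Hyracks/Asterix implementation.
class TypeTagOffsetDemo {
    // Hypothetical tag value; the real tag values are not taken from the source.
    static final byte HYPOTHETICAL_DOUBLE_TAG = 12;

    // Serialize a field as [1-byte type tag][8-byte big-endian double].
    static byte[] serializeTaggedDouble(double value) {
        ByteBuffer buf = ByteBuffer.allocate(1 + Double.BYTES);
        buf.put(HYPOTHETICAL_DOUBLE_TAG);
        buf.putDouble(value);
        return buf.array();
    }

    // Reads 8 bytes starting at the tag byte, so the tag becomes part of the number.
    static double readAtFieldStart(byte[] field, int fieldStart) {
        return ByteBuffer.wrap(field).getDouble(fieldStart);
    }

    // Skips the one-byte tag and reads the actual value.
    static double readAfterTag(byte[] field, int fieldStart) {
        return ByteBuffer.wrap(field).getDouble(fieldStart + 1);
    }
}
```

With a field produced by `serializeTaggedDouble(3.5)`, `readAfterTag` returns the stored value, while `readAtFieldStart` returns an unrelated number because the tag byte is interpreted as the most significant byte of the double.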

The reason why this happens is attributed to the fact that those comparators 
are implemented in hyracks level. Hyracks doesn't know that the type tag 
exists. In contrast, Asterix knows about the type tag and accounts for it 
when the comparators implemented at the Asterix level are created, so that 
the tag is skipped when the actual value is read. So Hyracks doesn't need to 
deal with the type tag on its own at runtime when comparators implemented in 
Asterix read values. 

Two possible options to fix this issue:
option 1) Move the implementation of the linearize comparators to the Asterix 
level. 
option 2) Skip the type tag when the comparators are created, in the same way 
this is already handled for all existing binary comparators at the Asterix 
level. 
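As a rough sketch of what option 2 could look like, the wrapper below shifts the start offset by one byte before delegating to a tag-unaware comparator. The `RawComparator` interface is a hypothetical stand-in defined here for illustration, not the real Hyracks interface, and a single double stands in for the multi-dimensional values the linearize comparators actually handle.

```java
import java.nio.ByteBuffer;

// Illustration only: all names below are hypothetical.
class TagSkippingComparatorSketch {
    // Stand-in for a byte-level comparator interface (start offset and length per side).
    interface RawComparator {
        int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
    }

    // Tag-unaware comparator: expects a raw 8-byte double at the start offset.
    static final RawComparator RAW_DOUBLE = (b1, s1, l1, b2, s2, l2) ->
            Double.compare(ByteBuffer.wrap(b1).getDouble(s1),
                           ByteBuffer.wrap(b2).getDouble(s2));

    // Applied by the caller at creation time: skips the one-byte type tag on both sides.
    static RawComparator skippingTag(RawComparator inner) {
        return (b1, s1, l1, b2, s2, l2) ->
                inner.compare(b1, s1 + 1, l1 - 1, b2, s2 + 1, l2 - 1);
    }

    // Builds [1-byte tag][8-byte double]; the tag value 12 is an arbitrary placeholder.
    static byte[] tagged(double value) {
        return ByteBuffer.allocate(1 + Double.BYTES).put((byte) 12).putDouble(value).array();
    }
}
```

For `a = tagged(-1.0)` and `b = tagged(-2.0)`, `skippingTag(RAW_DOUBLE).compare(a, 0, 9, b, 0, 9)` is positive, as expected, whereas calling `RAW_DOUBLE` directly on the same bytes orders them the wrong way round. The wrapped comparator itself stays unaware of the tag.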

Original issue reported on code.google.com by kiss...@gmail.com on 24 Sep 2014 at 6:02

GoogleCodeExporter commented 9 years ago
[tuple format correction] 
In the tuple format, the end offsets of each field are stored, not the start 
offsets.
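A small Java sketch of how a field's start can be derived when only end offsets are stored. The 4-byte slot size and the convention that offsets are relative to the start of the field data area are assumptions made for this example, not something stated in this thread.

```java
import java.nio.ByteBuffer;

// Illustration only: the slot size and offset convention are assumptions.
class EndOffsetTupleSketch {
    static final int SLOT_SIZE = Integer.BYTES; // assumed: one 4-byte int per field

    // Lays out a tuple as [end offset of each field][field 1]...[field n].
    static byte[] buildTuple(byte[]... fields) {
        int dataLength = 0;
        for (byte[] f : fields) dataLength += f.length;
        ByteBuffer buf = ByteBuffer.allocate(fields.length * SLOT_SIZE + dataLength);
        int end = 0;
        for (byte[] f : fields) {
            end += f.length;
            buf.putInt(end); // end offset, relative to the start of the field data
        }
        for (byte[] f : fields) buf.put(f);
        return buf.array();
    }

    // A field starts where the previous field ends; field 0 starts at relative offset 0.
    static int fieldStart(byte[] tuple, int fieldCount, int i) {
        int relativeStart = (i == 0) ? 0 : ByteBuffer.wrap(tuple).getInt((i - 1) * SLOT_SIZE);
        return fieldCount * SLOT_SIZE + relativeStart;
    }

    // Length is the difference between this field's end and the previous field's end.
    static int fieldLength(byte[] tuple, int i) {
        ByteBuffer buf = ByteBuffer.wrap(tuple);
        int end = buf.getInt(i * SLOT_SIZE);
        int start = (i == 0) ? 0 : buf.getInt((i - 1) * SLOT_SIZE);
        return end - start;
    }
}
```

Under these assumptions the byte at `fieldStart(...)` is the field's type tag, and the actual value begins one byte after it.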

Original comment by kiss...@gmail.com on 24 Sep 2014 at 6:04

GoogleCodeExporter commented 9 years ago
Actually, the statement ("The reason why this happens is attributed to the fact 
that those comparators are implemented in hyracks level.") is not correct. 
Regardless of where the comparators are implemented, the caller 
that creates the comparators should take care of the type tag. So "option 
1" is not an option to fix this issue.

Original comment by kiss...@gmail.com on 24 Sep 2014 at 6:15

GoogleCodeExporter commented 9 years ago
Isn't the field type tag only written for variable length entries? Since 
neither int nor double is a variable length type, their 
ITypeTraits.isFixedLength() should be true, causing the 
RTreeTypeAwareTupleWriter to not write the field length.

Thus, based on the source code, I believe this is not an issue. If this is 
correct, I still agree that we should write down that implicit assumption 
somewhere, probably as an assertion in the ILinearizeComparators.

Original comment by mdrese...@googlemail.com on 24 Sep 2014 at 9:18

GoogleCodeExporter commented 9 years ago
Currently, the type tag is written regardless of the field type. Also, whether 
a field is of variable length has no bearing on whether the type tag is kept.  

Original comment by kiss...@gmail.com on 24 Sep 2014 at 9:29

GoogleCodeExporter commented 9 years ago
Could you point me to where it's written?

Original comment by mdrese...@googlemail.com on 24 Sep 2014 at 9:32

GoogleCodeExporter commented 9 years ago
You can take a look at AqlSerializerDeserializerProvider.java.
In particular, see the addTag() function in it. 

Original comment by kiss...@gmail.com on 24 Sep 2014 at 10:48

GoogleCodeExporter commented 9 years ago
Ah, ok. I think I got it. So this is a problem that appears only when passing 
Asterix frames (i.e., IAObjects) down to the comparators. If we use 
RTreeNSMFrames, this is different and we must not expect the tag.

My gut feeling is that the comparator shouldn't know about the existence of 
different types of frames. Instead, the frames should have a method that 
returns the value at position x.
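A minimal sketch of that idea, with hypothetical names throughout (none of these are existing Hyracks or Asterix types): the comparison logic asks a reader for the value at a position, and only the reader knows whether a type tag precedes the value.

```java
import java.nio.ByteBuffer;

// Illustration only: a hypothetical abstraction, not an existing API.
class FrameValueAccessSketch {
    // Returns the double stored in the field that starts at fieldStart.
    interface DoubleFieldReader {
        double readDouble(byte[] bytes, int fieldStart);
    }

    // Layout without a type tag: the value begins at the field start.
    static final DoubleFieldReader UNTAGGED =
            (bytes, start) -> ByteBuffer.wrap(bytes).getDouble(start);

    // Layout with a one-byte type tag in front of the value.
    static final DoubleFieldReader TAGGED =
            (bytes, start) -> ByteBuffer.wrap(bytes).getDouble(start + 1);

    // The comparison never looks at the layout; it only uses the reader.
    static int compare(DoubleFieldReader reader, byte[] b1, int s1, byte[] b2, int s2) {
        return Double.compare(reader.readDouble(b1, s1), reader.readDouble(b2, s2));
    }

    // Helpers that build the two layouts; the tag value 12 is an arbitrary placeholder.
    static byte[] untagged(double v) {
        return ByteBuffer.allocate(Double.BYTES).putDouble(v).array();
    }

    static byte[] tagged(double v) {
        return ByteBuffer.allocate(1 + Double.BYTES).put((byte) 12).putDouble(v).array();
    }
}
```

The same `compare` call then gives consistent results for both layouts, as long as the reader matches the layout of the bytes it is handed.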

Original comment by mdrese...@googlemail.com on 24 Sep 2014 at 11:07

GoogleCodeExporter commented 9 years ago

Original comment by kiss...@gmail.com on 29 Sep 2014 at 5:02