antonmks / Alenka

GPU database engine

compression problem #14

Closed songjinrong closed 11 years ago

songjinrong commented 11 years ago

As I understand it, the PFOR compression method stores each array element as (original - minimum). Why is a shift like this still needed? unsigned int shifted = vals[2] - vals[0] - (i%vals[1])*vals[0]; dest[i] = val << shifted;

antonmks commented 11 years ago

Well, as far as I remember, this is how we encode the (original - minimum) values and put the results into a vector of 64-bit elements. We find the index of the element, and then we find where inside that element the value needs to be stored. The values must be aligned.
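The packing described above can be sketched as follows. This is only an illustration, not Alenka's actual code: it assumes that vals[0] is the number of bits per stored value, vals[1] is the number of values that fit in one 64-bit element, and vals[2] is the element width (64). The names PackParams, slot_shift, pack and unpack_at are hypothetical. Unlike the quoted snippet, the sketch ORs each shifted value directly into its destination element.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical parameters, named after the roles vals[0..2] are assumed to play.
struct PackParams {
    unsigned bits;       // bits per (value - minimum); assumed role of vals[0]
    unsigned per_word;   // values per 64-bit element;  assumed role of vals[1]
    unsigned word_bits;  // element width, 64 here;     assumed role of vals[2]
};

// Same arithmetic as the shift in the question: slot (i % per_word),
// counted from the most significant end of the element.
unsigned slot_shift(const PackParams& p, std::size_t i) {
    return p.word_bits - p.bits - static_cast<unsigned>(i % p.per_word) * p.bits;
}

// Store (value - minimum) for every input value, several per 64-bit element.
std::vector<std::uint64_t> pack(const std::vector<std::uint64_t>& src,
                                std::uint64_t minimum, const PackParams& p) {
    std::vector<std::uint64_t> words((src.size() + p.per_word - 1) / p.per_word, 0);
    for (std::size_t i = 0; i < src.size(); ++i) {
        const std::uint64_t delta = src[i] - minimum;
        words[i / p.per_word] |= delta << slot_shift(p, i);  // aligned slot
    }
    return words;
}

// Recover the original value at position i.
std::uint64_t unpack_at(const std::vector<std::uint64_t>& words,
                        std::uint64_t minimum, const PackParams& p, std::size_t i) {
    const std::uint64_t mask = (p.bits >= 64) ? ~0ULL : ((1ULL << p.bits) - 1);
    return ((words[i / p.per_word] >> slot_shift(p, i)) & mask) + minimum;
}
```

With 8 bits per value and 8 values per element, value 0 gets shift 56, value 7 gets shift 0, and value 8 starts the next element at shift 56 again. Without the shift, every value would land in the low bits of its element and overwrite its neighbours.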


songjinrong commented 11 years ago

It seems your compression method does not consider the case where a column contains a few very large numbers. For example, a column's values range from 1 to 100000, but most of them fall between 1 and 100 and only several are very large. In fact, PFOR-DELTA is meant to handle exactly this problem, but I looked into your pfordelta method and it seems to deal with sorted columns, not with this situation.

antonmks commented 11 years ago

Correct. I do not handle exception values separately. If someone needs it in the future, I can implement it.
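For reference, the exception handling discussed above could look roughly like the sketch below. It is a generic patched frame-of-reference illustration, not code from Alenka, and all names (bits_needed, Patched, encode_with_exceptions, decode) are hypothetical. Values whose (original - minimum) does not fit in the chosen bit width are stored separately as (index, original value) pairs. The deltas are kept in a plain vector here for brevity; a real implementation would bit-pack them.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Number of bits needed to represent v (at least 1).
unsigned bits_needed(std::uint64_t v) {
    unsigned b = 1;
    while (v >>= 1) ++b;
    return b;
}

struct Patched {
    std::uint64_t minimum;
    unsigned bits;                      // chosen width for the common case
    std::vector<std::uint64_t> deltas;  // would be bit-packed in practice
    std::vector<std::pair<std::size_t, std::uint64_t>> exceptions;  // (index, original)
};

// Encode with a fixed bit width; anything that does not fit becomes an exception.
Patched encode_with_exceptions(const std::vector<std::uint64_t>& src, unsigned bits) {
    Patched out{src.empty() ? 0 : src[0], bits, {}, {}};
    for (std::uint64_t v : src) {
        if (v < out.minimum) out.minimum = v;
    }
    const std::uint64_t limit = (bits >= 64) ? ~0ULL : ((1ULL << bits) - 1);
    for (std::size_t i = 0; i < src.size(); ++i) {
        const std::uint64_t delta = src[i] - out.minimum;
        if (delta > limit) {
            out.exceptions.emplace_back(i, src[i]);  // store the outlier verbatim
            out.deltas.push_back(0);                 // placeholder slot
        } else {
            out.deltas.push_back(delta);
        }
    }
    return out;
}

// Rebuild the original values, then patch the exceptions back in.
std::vector<std::uint64_t> decode(const Patched& p) {
    std::vector<std::uint64_t> out;
    out.reserve(p.deltas.size());
    for (std::uint64_t d : p.deltas) out.push_back(d + p.minimum);
    for (const auto& e : p.exceptions) out[e.first] = e.second;
    return out;
}
```

For the column in the example, plain (original - minimum) encoding over the range 1 to 100000 needs bits_needed(99999) = 17 bits for every value, while the common range 1 to 100 needs only bits_needed(99) = 7 bits, with the few large values kept as exceptions.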