tspurway / hustle

A column oriented, embarrassingly distributed relational event database.

Duplicate Adjacent Column Optimization #32

Closed: tspurway closed this issue 10 years ago

tspurway commented 10 years ago

In the marble's value db (the rowid -> value db for a column), we can compress the data by storing only the first row of each run of consecutive rows that share the same value. For example, we currently store value dbs like this:

RID VID
1 12
2 12
3 12
4 15
5 12
6 12
7 18
8 18
9 18

With Duplicate Adjacent Column Optimization, this db becomes:

RID VID
1 12
4 15
5 12
7 18
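
The transformation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the hustle repository: the names `compress` and `lookup` are hypothetical, the value db is modeled as a sorted list of `(rid, vid)` tuples, and a skipped RID is assumed to resolve to the nearest stored RID at or below it.

```python
from bisect import bisect_right

def compress(rows):
    """Keep only the first row of each run of equal adjacent VIDs.

    `rows` is a list of (rid, vid) tuples sorted by rid (hypothetical model
    of the value db, not hustle's actual storage format).
    """
    out = []
    prev = object()  # sentinel that compares unequal to any real VID
    for rid, vid in rows:
        if vid != prev:
            out.append((rid, vid))
            prev = vid
    return out

def lookup(compressed, rid):
    """Return the VID for `rid`: the value stored at the largest RID <= rid."""
    rids = [r for r, _ in compressed]
    i = bisect_right(rids, rid) - 1
    if i < 0:
        raise KeyError(rid)  # rid precedes the first stored row
    return compressed[i][1]

rows = [(1, 12), (2, 12), (3, 12), (4, 15), (5, 12),
        (6, 12), (7, 18), (8, 18), (9, 18)]
compressed = compress(rows)
print(compressed)             # [(1, 12), (4, 15), (5, 12), (7, 18)]
print(lookup(compressed, 6))  # 12
```

Note that this sketch assumes RIDs are dense within the column and that the caller knows the largest valid RID: `lookup(compressed, 100)` would return 18 even though row 100 does not exist, so a real implementation would need to track the row count separately.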

ncloudioj commented 10 years ago

Added this feature in commit a60c741c2880f97a26d970c49f4379de3912e9a6