logv / sybil

columnar storage + NoSQL OLAP engine | https://logv.org
Other
305 stars 26 forks

investigate nested blocks (or megablocks) #31

Open okayzed opened 6 years ago

okayzed commented 6 years ago

currently blocks are capped at 65K records. i think it might make sense to create larger blocks (perhaps up to half a million or a million records) to save on repeated string hashing when running GROUP BY queries.

one way of doing this is to create megablocks: a megablock would be a block made up of multiple fully defined sub-blocks. further thinking is required on how a megablock would actually reduce the number of string hash operations.
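
a rough sketch of where the savings could come from, in Go. this is not sybil's code: the `block` and `megablock` types, both function names, and the assumption that each block stores string values as integer IDs into a dictionary are all hypothetical and only here for illustration. the second return value counts string-keyed map updates, used as a stand-in for string hash operations.

```go
package main

// block is a hypothetical column chunk: each row stores an ID into dict.
type block struct {
	dict []string
	ids  []int
}

// megablock is a hypothetical container whose sub-blocks share one dictionary.
type megablock struct {
	dict []string
	subs [][]int // one ID slice per sub-block
}

// groupByPerBlock counts rows per string value. Each block aggregates by
// local ID first, then merges into a string-keyed map, so every distinct
// value is hashed once per block it appears in.
func groupByPerBlock(blocks []block) (map[string]int, int) {
	result := make(map[string]int)
	hashes := 0
	for _, b := range blocks {
		local := make([]int, len(b.dict))
		for _, id := range b.ids {
			local[id]++
		}
		for id, n := range local {
			if n == 0 {
				continue
			}
			result[b.dict[id]] += n // string-keyed update: one string hash
			hashes++
		}
	}
	return result, hashes
}

// groupByMegablock aggregates by shared ID across all sub-blocks and only
// touches the string-keyed map once per distinct value in the megablock.
func groupByMegablock(m megablock) (map[string]int, int) {
	counts := make([]int, len(m.dict))
	for _, ids := range m.subs {
		for _, id := range ids {
			counts[id]++
		}
	}
	result := make(map[string]int)
	hashes := 0
	for id, n := range counts {
		if n == 0 {
			continue
		}
		result[m.dict[id]] += n // string-keyed update: one string hash
		hashes++
	}
	return result, hashes
}
```

under these assumptions, a value that shows up in k separate blocks costs k string hashes, but only one if those k blocks are sub-blocks of a single megablock. the catch is that the sub-blocks need a shared dictionary (or a cheap ID remapping), which is part of the "further thinking" above.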