logv / sybil

columnar storage + NoSQL OLAP engine | https://logv.org

Add block skipping for string filters #114

Open okayzed opened 4 years ago

okayzed commented 4 years ago

If a query filters on a particular string value and a block doesn't contain that string, we can skip aggregating that block entirely. This might help certain use cases for redbull.

Basically, we would prioritize unpacking that string column first and then check the filter against the block's string table.

This may or may not work well.
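A minimal sketch of the idea, not sybil's actual code: `Block`, `containsString` and `sumWhereEquals` are hypothetical names, and the block layout (one string table plus per-row IDs into it) is an assumption made for illustration.

```go
package main

import "fmt"

// Block is a hypothetical stand-in for one block of records.
// It is NOT sybil's real type; the layout is assumed for illustration.
type Block struct {
	StringTable []string // distinct values of the filtered string column
	RowIDs      []int    // RowIDs[i] indexes into StringTable for row i
	Values      []int    // numeric column to aggregate
}

// containsString scans the block's string table for the filter value.
func containsString(b *Block, target string) bool {
	for _, s := range b.StringTable {
		if s == target {
			return true
		}
	}
	return false
}

// sumWhereEquals sums Values over rows whose string column equals target.
// Blocks whose string table lacks target are skipped before any row is read.
// It returns the sum and the number of blocks skipped.
func sumWhereEquals(blocks []Block, target string) (sum int, skipped int) {
	for i := range blocks {
		b := &blocks[i]
		if !containsString(b, target) {
			skipped++
			continue
		}
		for row, id := range b.RowIDs {
			if b.StringTable[id] == target {
				sum += b.Values[row]
			}
		}
	}
	return sum, skipped
}

func main() {
	blocks := []Block{
		{StringTable: []string{"a", "b"}, RowIDs: []int{0, 1, 1}, Values: []int{1, 2, 3}},
		{StringTable: []string{"c"}, RowIDs: []int{0, 0}, Values: []int{10, 20}},
	}
	sum, skipped := sumWhereEquals(blocks, "b")
	fmt.Println(sum, skipped) // prints "5 1"
}
```

The win depends on how selective the filter is: if most blocks contain the value, the string table check is pure overhead on top of the normal scan.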

okayzed commented 4 years ago

evan suggests we can do this using bloom filters and hierarchical bloom filters. This would work for equality filters but not for regex filters, as far as I can tell.
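A rough sketch of what that could look like, under stated assumptions: the `Bloom` type, its sizing, and the FNV-based double hashing are illustrative choices, not anything sybil ships. `Union` is one possible reading of "hierarchical": a parent filter (say, per directory) built by OR-ing the per-block filters, so a miss at the parent rules out every child.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a minimal fixed-size Bloom filter (hypothetical, for illustration).
type Bloom struct {
	bits []uint64
	m    uint32 // number of bits
	k    uint32 // number of probes per key
}

func NewBloom(m, k uint32) *Bloom {
	return &Bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes splits one 64-bit FNV-1a hash into two values for double hashing.
func hashes(s string) (uint32, uint32) {
	h := fnv.New64a()
	h.Write([]byte(s))
	v := h.Sum64()
	return uint32(v), uint32(v>>32) | 1 // force the step to be odd
}

// Add sets the k probe bits for s.
func (b *Bloom) Add(s string) {
	h1, h2 := hashes(s)
	for i := uint32(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= uint64(1) << (pos % 64)
	}
}

// MayContain returns false only if s was definitely never added.
// A true result can be a false positive.
func (b *Bloom) MayContain(s string) bool {
	h1, h2 := hashes(s)
	for i := uint32(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(uint64(1)<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

// Union ORs another filter of the same size into b, giving a parent filter
// that covers every key of its children.
func (b *Bloom) Union(other *Bloom) {
	for i := range b.bits {
		b.bits[i] |= other.bits[i]
	}
}

func main() {
	block1, block2 := NewBloom(1024, 3), NewBloom(1024, 3)
	block1.Add("GET")
	block2.Add("POST")

	parent := NewBloom(1024, 3)
	parent.Union(block1)
	parent.Union(block2)

	fmt.Println(parent.MayContain("GET"), block1.MayContain("GET"))
}
```

A filter like this only answers "was this exact key added?", which is why it fits equality filters and not regex: a pattern has no single key to probe.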

okayzed commented 4 years ago

An initial implementation now exists for simple equality filters on strings on a per-block basis, but it does not use bloom filters.