danthegoodman1 / icedb

An in-process Parquet merge engine for better data warehousing in S3
https://blog.danthegoodman.com/icedb-v3--third-times-the-charm

Merge and tombstone "bounds" #130

Open danthegoodman1 opened 2 weeks ago

danthegoodman1 commented 2 weeks ago

We could support concurrent merges and tombstone cleaning if we know the partitioning ahead of time (e.g. by month) and can put "bounds" on the logic, so that a given run only touches parts in the half-open range [lower bound, upper bound). The upper bound is exclusive so that adjacent ranges never overlap, which prevents conflicts.
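A minimal sketch of what such a bounded selection might look like, assuming parts carry a partition key that sorts lexicographically (e.g. `"2024-01"` for monthly partitioning). `Part` and `parts_in_bounds` are hypothetical names for illustration, not existing icedb APIs:

```python
from typing import List, NamedTuple, Optional


class Part(NamedTuple):
    partition: str  # hypothetical partition key, e.g. "2024-01"
    path: str       # hypothetical object path of the Parquet file


def parts_in_bounds(
    parts: List[Part],
    lower: Optional[str] = None,
    upper: Optional[str] = None,
) -> List[Part]:
    """Return the parts whose partition key falls in [lower, upper).

    A bound of None means "unbounded on that side", so passing no bounds
    selects every part.
    """
    return [
        p
        for p in parts
        if (lower is None or p.partition >= lower)
        and (upper is None or p.partition < upper)
    ]


parts = [
    Part("2024-01", "a.parquet"),
    Part("2024-02", "b.parquet"),
    Part("2024-03", "c.parquet"),
]
january_and_february = parts_in_bounds(parts, "2024-01", "2024-03")
```

Because the upper bound is exclusive, a second worker given `["2024-03", "2024-04")` would never see a part selected by the first.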

The bounds must be optional.

Each lock could then cover only its own part range, allowing n concurrent merges, since disjoint ranges cannot conflict.
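A rough sketch of range-scoped locking, assuming ranges are half-open pairs of comparable partition keys. `RangeLocks` is a hypothetical in-process helper for illustration only; coordinating merges across multiple hosts would need a shared lock store instead:

```python
import threading
from typing import Set, Tuple


class RangeLocks:
    """Tracks which half-open ranges [lower, upper) are currently held."""

    def __init__(self) -> None:
        self._mutex = threading.Lock()
        self._held: Set[Tuple[str, str]] = set()

    def try_acquire(self, lower: str, upper: str) -> bool:
        """Claim [lower, upper) if it overlaps no held range."""
        if not lower < upper:
            raise ValueError("lower must be strictly less than upper")
        with self._mutex:
            # Two half-open ranges overlap iff each starts before the other ends.
            for held_lower, held_upper in self._held:
                if lower < held_upper and held_lower < upper:
                    return False
            self._held.add((lower, upper))
            return True

    def release(self, lower: str, upper: str) -> None:
        """Give up a previously acquired range (no-op if not held)."""
        with self._mutex:
            self._held.discard((lower, upper))


locks = RangeLocks()
first = locks.try_acquire("2024-01", "2024-02")    # granted
second = locks.try_acquire("2024-02", "2024-03")   # adjacent, also granted
third = locks.try_acquire("2024-01", "2024-03")    # overlaps both, refused
```

Adjacent ranges can be held at the same time because the shared boundary belongs only to the range that starts there.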

This could provide a HUGE performance boost for merging, and thus make queries on recent data much faster, especially at higher insert rates.