danthegoodman1 / icedb

An in-process Parquet merge engine for better data warehousing in S3 with MVCC
https://blog.danthegoodman.com/icedb-v3--third-times-the-charm

Option to disable tombstones? (never delete data, infinite point-in-time history) #99

Closed danthegoodman1 closed 2 months ago

danthegoodman1 commented 1 year ago

If someone wants to never delete data (have full point-in-time access), then we can disable the creation of tombstones and never tombstone clean.

This means that merges create new files that omit what would normally be given a tombstone (log tombstones and file markers with tombstones), and log files are never cleaned up.

This might need to be paired with something like persistent reader instances that keep up with the log state via S3 changefeeds (or similar), rather than reading the whole log in every time. Alternatively, readers can list the log starting from the last file they know about (the StartAfter key in the list call) and materialize the log state locally. Materializations of the log could also be stored back in S3 as checkpoints to start from, but that's outside the scope of what icedb would need to implement.
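A rough sketch of the "list from the last known file" idea, to make it concrete. Everything here is an assumption for illustration: `list_keys_after` is an in-memory stand-in for an S3 list call with a StartAfter key (it is not icedb or boto3 code), `LogFollower` is a hypothetical reader, and the key names are made up so that they sort in write order.

```python
from dataclasses import dataclass, field


def list_keys_after(all_keys, start_after=""):
    """In-memory stand-in (assumption) for an S3 list call with StartAfter:
    returns keys strictly greater than start_after, in sorted order."""
    return sorted(k for k in all_keys if k > start_after)


@dataclass
class LogFollower:
    """Hypothetical persistent reader that only lists log files it has not seen."""
    last_key: str = ""                          # last log file key already applied
    seen: list = field(default_factory=list)    # locally materialized log (keys only here)

    def poll(self, bucket_keys):
        # Ask only for keys after the last one we know about.
        new_keys = list_keys_after(bucket_keys, self.last_key)
        for key in new_keys:
            # A real reader would fetch and apply the log file contents here.
            self.seen.append(key)
        if new_keys:
            self.last_key = new_keys[-1]
        return new_keys


# Hypothetical key names, zero-padded so lexicographic order matches write order.
bucket = {"_log/0001_a.jsonl", "_log/0002_a.jsonl"}
follower = LogFollower()
first = follower.poll(bucket)     # picks up both existing files
bucket.add("_log/0003_b.jsonl")
second = follower.poll(bucket)    # picks up only the new file
```

This only works if new log keys always sort after the ones already seen. With multiple writers or clock skew, a late file could land before `last_key` and be missed, so a real reader would probably need some overlap window or an occasional full listing.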

Or is this just entirely outside the scope of what icedb offers?

danthegoodman1 commented 2 months ago

You don't really need to disable tombstones to get this.

You can just never tombstone clean, and set the log timestamp to some point in time in the past.
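To illustrate why that works, here is a simplified model, not icedb's actual log format or API. It assumes each log entry is a dict with a hypothetical `ts_ms` timestamp, an `added` list of data files, and a `tombstoned` list of files replaced by a merge. As long as nothing is ever cleaned, replaying only the entries up to a chosen timestamp gives the set of files that were alive at that moment.

```python
def alive_files_as_of(log_entries, as_of_ms):
    """Replay log entries up to as_of_ms (inclusive) and return the alive files.

    The entry shape (ts_ms / added / tombstoned) is an assumption for this sketch.
    """
    alive = set()
    for entry in sorted(log_entries, key=lambda e: e["ts_ms"]):
        if entry["ts_ms"] > as_of_ms:
            break  # everything after the chosen point in time is ignored
        alive |= set(entry.get("added", []))
        alive -= set(entry.get("tombstoned", []))
    return alive


# Hypothetical history: two inserts, then a merge that replaces both files.
log = [
    {"ts_ms": 100, "added": ["a.parquet"]},
    {"ts_ms": 150, "added": ["b.parquet"]},
    {"ts_ms": 200, "added": ["ab_merged.parquet"],
     "tombstoned": ["a.parquet", "b.parquet"]},
]

before_merge = alive_files_as_of(log, 160)
after_merge = alive_files_as_of(log, 250)
```

The read at 160 can only be served if `a.parquet` and `b.parquet` still exist in S3, which is why tombstone cleaning has to stay off for this to work.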