martinsumner / leveled

A pure Erlang Key/Value store - based on a LSM-tree, optimised for HEAD requests
Apache License 2.0

Drop Bucket or Drop Key Range (perhaps Drop Modified Range) #235

Open martinsumner opened 5 years ago

martinsumner commented 5 years ago

A facility to drop a bucket, or drop a key range in a bucket, or drop a modified date range in a bucket.

The outline idea here is that a drop request would be a special type of key change. The change would be written to the journal, then appended as metadata to the Bookie's ledger cache, and then forwarded to the penciller. The penciller would append the drop request to its own cache, and then request an update from the clerk.

The clerk would notify every SST file whose key range overlaps the drop, and then update the manifest so that the entry for each of those SST files records the drop.

Requests to the ledger cache would be filtered based on the drop (for the lifetime of that cache). SST files would also filter all data they return based on the drop.
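The filtering described above could look something like the sketch below. It is in Python rather than Erlang purely for brevity, and every name in it (`Drop`, `covers`, `filter_dropped`, the entry layout) is hypothetical; nothing here reflects leveled's actual data structures. It assumes a drop names a bucket plus optional inclusive key and last-modified bounds.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple


@dataclass(frozen=True)
class Drop:
    """Hypothetical drop request; field names are illustrative only."""
    bucket: str
    start_key: Optional[str] = None  # None means unbounded below
    end_key: Optional[str] = None    # None means unbounded above
    lmd_from: Optional[int] = None   # inclusive last-modified bounds
    lmd_to: Optional[int] = None

    def covers(self, bucket: str, key: str, lmd: int) -> bool:
        """Return True if an entry falls inside this drop."""
        if bucket != self.bucket:
            return False
        if self.start_key is not None and key < self.start_key:
            return False
        if self.end_key is not None and key > self.end_key:
            return False
        if self.lmd_from is not None and lmd < self.lmd_from:
            return False
        if self.lmd_to is not None and lmd > self.lmd_to:
            return False
        return True


# Assumed entry layout: (bucket, key, last_modified, value)
Entry = Tuple[str, str, int, bytes]


def filter_dropped(entries: Iterable[Entry],
                   drops: Iterable[Drop]) -> Iterator[Entry]:
    """Yield only the entries not covered by any remembered drop."""
    remembered = list(drops)
    for bucket, key, lmd, value in entries:
        if not any(d.covers(bucket, key, lmd) for d in remembered):
            yield (bucket, key, lmd, value)
```

A cache or file that remembers a drop would pass every result through something like `filter_dropped` before returning it; `Drop("b1")` drops the whole bucket, while `Drop("b1", start_key="k1", end_key="k3")` drops only a key range.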

As caches are merged into the LSM tree, new caches need not be aware of the drop, and nor do new SST files as they are created. Eventually the drop is forgotten, but any file or cache that has been informed of the drop remembers it, and filters any result it outputs, until it is removed by a merge event.
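A minimal sketch of why the drop can be forgotten at merge time, again in Python with hypothetical names (`SstFile`, `visible_entries`, `merge`) and a simplified key-range-only drop. The assumption is that each input file filters its own output during the merge, so the merged file contains no dropped data and needs no record of the drop.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

LedgerKey = Tuple[str, str]           # assumed layout: (bucket, key)
KeyRangeDrop = Tuple[str, str, str]   # (bucket, start_key, end_key), inclusive


@dataclass
class SstFile:
    """Hypothetical stand-in for an SST file plus its manifest entry."""
    entries: Dict[LedgerKey, bytes]
    drops: List[KeyRangeDrop] = field(default_factory=list)

    def visible_entries(self) -> Dict[LedgerKey, bytes]:
        """Entries this file may output, after applying the drops it remembers."""
        def dropped(ledger_key: LedgerKey) -> bool:
            bucket, key = ledger_key
            return any(bucket == b and start <= key <= end
                       for b, start, end in self.drops)
        return {k: v for k, v in self.entries.items() if not dropped(k)}


def merge(older: SstFile, newer: SstFile) -> SstFile:
    """Merge two files; newer values win, and no drops are carried forward."""
    merged = dict(older.visible_entries())
    merged.update(newer.visible_entries())
    # The output holds only surviving data, so it starts with no drops.
    return SstFile(entries=merged)
```

Because only the file that existed when the drop arrived remembers it, a key written again after the drop (held in `newer` here) is not filtered, and it survives the merge.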

On restart, the manifest remembers the drops, and informs the leveled_sst actor as the file is started. Where new caches are built from the Journal, the persistence of the drop in the Journal will re-apply the drop at the same point.

There are issues that need to be resolved, most notably what happens if the pclerk does not get the drop message before a shutdown, and general race conditions between the pclerk and the penciller.

There is going to be a significant amount of change. However, having a drop may make life easier - and may stop people from using features that have their own overheads (TTL objects, multi-backend). For the Riak implementation, the drop process will need a broader safe process e.g.:

martinsumner commented 5 years ago

One thing that needs to be considered is the need to drop index entries. If a key_range is dropped, it is not enough to inform the sst files in the penciller that cover that key range, as other sst files may have relevant index entries too.
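To illustrate the point, here is a small Python sketch. The tuple layouts are simplified assumptions made for this example and are not leveled's actual ledger key format: object entries are taken to be `("o", bucket, key)` and index entries `("i", bucket, field, term, key)`. Index entries then sort by field and term, so they can land in files far from the object key range, and a drop has to be matched against the object key embedded in the index entry.

```python
from typing import Tuple

# Assumed, simplified ledger key layouts (not leveled's real format):
#   object entry: ("o", bucket, key)
#   index entry:  ("i", bucket, field, term, key)


def object_key_of(ledger_key: Tuple[str, ...]) -> Tuple[str, str]:
    """Extract (bucket, object_key) from either kind of ledger key."""
    tag = ledger_key[0]
    if tag == "o":
        return (ledger_key[1], ledger_key[2])
    if tag == "i":
        return (ledger_key[1], ledger_key[4])
    raise ValueError(f"unknown tag: {tag!r}")


def covered_by_key_range_drop(ledger_key: Tuple[str, ...],
                              bucket: str,
                              start_key: str,
                              end_key: str) -> bool:
    """True if the entry belongs to an object inside the dropped key range."""
    entry_bucket, object_key = object_key_of(ledger_key)
    return entry_bucket == bucket and start_key <= object_key <= end_key
```

Under these assumptions, `("i", "b", "city", "zurich", "k1")` is covered by a drop of `k1` to `k3` in bucket `b` even though it sorts nowhere near `("o", "b", "k1")`, which is why informing only the files that cover the object key range would not be enough.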

This makes dropping by LMD (last modified date) a bit harder, as index entries don't currently include LMDs.