rosedblabs / rosedb

Lightweight, fast and reliable key/value storage engine based on Bitcask.
https://rosedblabs.github.io
Apache License 2.0
4.58k stars 632 forks source link

NEW: Periodically remove expired keys from memory #272

Closed Jeremy-Run closed 1 year ago

Jeremy-Run commented 1 year ago

There may be expired keys in the index, after adding PutWithTTL(). In addition to deleting these keys when Get() / Exist() / Ascend() ..., a goroutine can also be set up to delete them periodically.

Jeremy-Run commented 1 year ago

@roseduan Do you think this feature is necessary?

roseduan commented 1 year ago

Yes it is necessary, but we need to rethink the specific strategy, may be we can see how Redis does.

Jeremy-Run commented 1 year ago

@roseduan Is this plan ok?

Points to consider:

Implementation plan:

roseduan commented 1 year ago

Why do we need an ordered map to store the key and expiration?

I think the easiest and most straightforward way to do this is to iterate all keys in BTree and delete all expired keys.

We can add a public function to callers(like DeleteExpiredKeys), they can call it as their needs.

And we should clarify that this method may take a long time to finish, we can do some other optimizations when someone acquires for it.

Jeremy-Run commented 1 year ago

My plan is to delete expired keys without stopping DB service as much as possible (because this is an optimization function, not a core function); If it is allowed to stop the service for a long time, it is a good idea to iterate all the keys of the BTree to delete the expired keys. Of course, executing Merge(true) or reopening the db can also filter the expired keys.

The reason for using an orderedMap for storage is mainly due to two considerations: ①The key in the map is unique, which is unified with our db. ②It is in order, so it's possible to know the next execution time of the background goroutine and avoid the time of invalid write lock.

Of course it takes more memory, and written data that have expiration time will be slower. As you said, we can make a simple mechanism (scan BTree) first, and then optimize it according to the needs if there are user needs later.

roseduan commented 1 year ago

OK, we can just make a simple mechanism, to scan the btree to delete expired keys. And we can add a parameter to limit the execution time of the function, to avoid locking too long time.

Thanks for your recent work.