speedb-io / speedb

A RocksDB compliant high performance scalable embedded key-value store
https://www.speedb.io/
Apache License 2.0

Add a backdoor for running compaction method on demand #728

Open bosmatt opened 10 months ago

bosmatt commented 10 months ago

Provide a backdoor for running the compaction method. An external application can use it to schedule compaction periodically.

mjsax commented 10 months ago

Can we add kafka-streams label?

Guyme commented 10 months ago

Added. However, it has other uses outside of Kafka Streams as well.

udi-speedb commented 10 months ago

@bosmatt - Could you elaborate on the motivation for this feature? There is manual compaction (now both blocking and non-blocking). Is it not sufficient? Why?

cadonna commented 10 months ago

One of our Kafka Streams users would like to schedule periodic compaction based on calendar dates and times. For example, they want to schedule compactions on weekends and/or during night hours, when their system is under less load. With manual compaction, we could theoretically also implement these calendar-based compaction triggers in Kafka Streams, I guess. However, as @Guyme pointed out, it might also be interesting for other users outside of Kafka Streams.
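As a rough illustration of such a calendar-based trigger, the sketch below checks whether a given time falls inside a low-load window (weekends or night hours) and only then invokes a compaction callback. The window boundaries, the function names, and the `compact` callback are all assumptions made for illustration; in an application like Kafka Streams the callback would wrap whatever manual compaction call is available.

```python
from datetime import datetime, time
from typing import Callable

# Hypothetical low-load window: nights from 22:00 to 06:00, plus weekends.
NIGHT_START = time(22, 0)
NIGHT_END = time(6, 0)


def in_low_load_window(now: datetime) -> bool:
    """Return True on Saturday/Sunday or during the night hours."""
    is_weekend = now.weekday() >= 5  # Monday is 0, so 5 and 6 are Sat/Sun
    # The night window wraps around midnight, hence "or" rather than "and".
    is_night = now.time() >= NIGHT_START or now.time() < NIGHT_END
    return is_weekend or is_night


def maybe_compact(now: datetime, compact: Callable[[], None]) -> bool:
    """Run the (hypothetical) compaction callback only inside the window."""
    if in_low_load_window(now):
        compact()
        return True
    return False
```

A periodic task in the application could call `maybe_compact(datetime.now(), trigger)` every few minutes, where `trigger` is a stand-in for the real compaction call.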

cadonna commented 10 months ago

Ah, wait. Reading the title of the issue again, I am not sure if we are on the same page. I assumed this feature is about RocksDB/Speedb triggering the compaction based on a calendar. However, the title suggests adding a method for running compactions. If it is about the latter, then -- like @udi-speedb -- I am also wondering whether manual compaction might be sufficient.

bosmatt commented 10 months ago

Once there is a backdoor for running compaction, you can run it periodically using external tools/scripts/code. @cadonna This should indeed address your request, and help others as well. Please share your thoughts on this solution.

cadonna commented 10 months ago

@bosmatt Do I understand you correctly that this backdoor would allow -- for example -- an operating system cron job to trigger the compaction? That would be different from running manual compaction by calling compactRange() from within Kafka Streams.

Originally, I envisioned this feature as a config in Speedb/RocksDB that allows passing in a schedule (for example, in cron format) that is used to trigger compactions at the specified times. WDYT?