rosedblabs / rosedb

Lightweight, fast and reliable key/value storage engine based on Bitcask.
https://rosedblabs.github.io
Apache License 2.0
4.48k stars 620 forks

A question: if a key expires, are the files never deleted unless Merge is executed? #297

Closed EternalHunters closed 6 months ago

EternalHunters commented 7 months ago

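I've noticed that the log files grow every day, by about 1 GB per day, even though the actual data volume shouldn't be anywhere near that large. At this rate the disk will fill up before long.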

roseduan commented 7 months ago

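Right, currently files are only cleaned up when Merge is executed. You can also implement your own automatic cleanup logic. What is your use case?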

EternalHunters commented 7 months ago

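The use case is a local cache: for example, a user's data is cached for a few minutes, and once it expires it is fetched from the origin again and written back. So I set a TTL when using rosedb, but I didn't expect that it would not clean up the files automatically. Now the files keep getting bigger.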

roseduan commented 7 months ago

You can write a scheduled task that runs the Merge operation when the system is idle.

EternalHunters commented 7 months ago

Judging from the rosedb documentation, that seems to be the only option for now.

LindaSummer commented 7 months ago

Hi @roseduan, I'd like to contribute an enhancement: a background goroutine that runs merge jobs at a configured interval. It should start after all initialization is complete. Is there a metric we could use to pick a more suitable time window for merging, or is a fixed interval or a cron expression configured by the user enough?

roseduan commented 7 months ago

Thanks.

I think we can start with a simple solution. A cron expression seems like a good choice, since it also covers fixed time intervals.

We can add more sophisticated routines (such as metrics) if users ask for them.

LindaSummer commented 7 months ago

Thanks very much for your suggestion. Got it, I will implement a cron-expression-based version.

Before I start coding, I'd like to confirm the project's rules on third-party libraries.

Can I import an existing cron library, or should I implement it from scratch?

roseduan commented 7 months ago

An existing cron library is fine.

LindaSummer commented 7 months ago

Got it! Thanks very much for your patience and helpful suggestions 😁