kroese closed this issue 11 months ago
Neat idea on the max DB size. The tough thing is figuring out a policy for which events to purge.
If you're OK with a time-based policy, you could put `strfry delete --age $MAX_AGE_SECONDS` into a cron job.
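For illustration, a minimal sketch of what such a cron wrapper could look like. The 30-day default, the `DRY_RUN` switch, and the function names are assumptions made up for this example, not part of strfry; only `strfry delete --age` comes from the suggestion above.

```shell
#!/bin/sh
# Hypothetical wrapper around `strfry delete --age` for use from cron.

# Convert a retention window in days to the seconds that --age expects.
days_to_seconds() {
    echo $(($1 * 86400))
}

# Delete events older than the given number of days.
# With DRY_RUN=1 (the default in this sketch) the command is only printed.
purge_older_than_days() {
    max_age_seconds=$(days_to_seconds "$1")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "strfry delete --age $max_age_seconds"
    else
        strfry delete --age "$max_age_seconds"
    fi
}

# MAX_AGE_DAYS is an assumed environment variable; 30 is an arbitrary default.
purge_older_than_days "${MAX_AGE_DAYS:-30}"
```

Once the printed command looks right, a crontab entry could set `DRY_RUN=0` and call the script once a day; the script path and schedule are up to you.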
Great, I did not know that command existed! That will do for now.
Actually, it didn't go as I expected.
When I ran the `delete` command, which removed 5 million events, the database size had INCREASED by 5 GB afterwards. I am not sure how this works; maybe it causes some internal shuffling and LMDB needs to allocate more space to rearrange everything during a delete operation?
So next I ran a `compact`, which luckily did finally decrease the size. But this cannot be run from the daily cron because A) it takes hours, and B) it requires taking strfry offline, because the database files need to be swapped.
I will just keep the daily `delete` cron and hopefully it will keep the database the same size from now on, as LMDB is supposed to reuse the space from the deleted records for new data.
I think this behaviour makes sense: when you delete values from LMDB, it doesn't actually shrink the DB size; it just leaves the space free so it can be used for subsequent writes.
So I think doing the compact one time is probably sufficient, and the DB size will grow up to the "high water mark" (the largest amount stored at any one time), and stay there.
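To make the "high water mark" idea concrete, here is a toy model in shell. It is only a sketch: the sizes are invented, and it ignores the extra space a large delete may temporarily need, which is what was observed earlier in this thread.

```shell
#!/bin/sh
# Toy model: the file never shrinks; freed space is reused by later writes.
# Arguments are the amounts of live data (in GB) over time, made up for illustration.
high_water_mark() {
    file_gb=0
    for live_gb in "$@"; do
        # The file only grows when live data exceeds what it already holds.
        if [ "$live_gb" -gt "$file_gb" ]; then
            file_gb=$live_gb
        fi
        echo "live=${live_gb}GB file=${file_gb}GB"
    done
}

high_water_mark 10 30 45 20 40 45 25
```

In this model the file reaches 45 GB and stays there, even when the live data later drops to 25 GB.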
Yes. I didn't expect the deletes to shrink it; I was just surprised that it grew so much from just deleting stuff.
Because of disk space constraints I cannot store all incoming messages indefinitely, so it would be very handy if there were a setting that lets you specify an age in days (or even better, a max DB size) and automatically purges any messages older than that.
My current workaround is to automatically delete the database as soon as it reaches 50 GB, but it would be nicer if it were "rolling", so that the amount of history available always stays the same.