PowerDNS / weakforced

Anti-Abuse for servers at authentication time

[FEATURE] expiring old SSDB keys #360

Closed mgehlmann closed 10 months ago

mgehlmann commented 2 years ago

Is your feature request related to a problem? Please describe. I noticed that weakforced's memory usage seems to creep up continuously over days and even weeks, even with short-lived StringStatsDBs. The reason seems to be that keys never completely expire from the SSDB unless MaxSize is reached.

I've set up a "OneHourDB" SSDB with a large MaxSize:

local field_map = {}
field_map["diffFailedPasswords"] = "hll"  -- HyperLogLog: approximate count of distinct failed passwords
field_map["totalFailed"] = "int"          -- plain integer counter
-- 6 sliding windows of 600 seconds each = 1 hour, spread over 16 shards
newShardedStringStatsDB("OneHourDB", 600, 6, field_map, 16)
local sdb = getStringStatsDB("OneHourDB")
sdb:twEnableReplication()
sdb:twSetMaxSize(50000000)                -- 50 million entries
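
For reference, keys end up in this DB via a report function roughly like the sketch below (simplified; it relies on the standard wforce setReport/twAdd hooks, and the field handling is illustrative rather than our exact config):

-- lt is the LoginTuple that wforce passes to the report function
function report(lt)
  if not lt.success then
    -- per client IP: count distinct failed password hashes and total failures
    sdb:twAdd(lt.remote:tostring(), "diffFailedPasswords", lt.pwhash)
    sdb:twAdd(lt.remote:tostring(), "totalFailed", 1)
  end
end
setReport(report)

With a steady stream of new client IPs, each IP becomes a key that stays in the DB indefinitely once all its windows have aged out.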

The memory usage of the SSDB keeps growing continuously for weeks even with a roughly constant usage profile, until eventually MaxSize is reached. I did not find any details in the docs on how SSDB expiry works, but after playing around a bit it seems to me that keys stay present in the SSDB long after the last window has seen an insertion, even though all of the windows are empty.

Wouldn't it make sense to have keys expire after the maximum TTL of the SSDB? It is not a big issue, but there already seems to be an expiry mechanism in place for MaxSize based on last usage, so this could reduce the memory footprint by quite a bit.

Describe the solution you'd like Unused keys expire after the maximum TTL of the SSDB instead of persisting with empty data.

Describe alternatives you've considered The obvious workaround is to set a MaxSize that stays within the bounds of your acceptable memory usage, accepting that memory usage will always reach that maximum eventually.
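
As a rough illustration of that workaround (all per-entry figures below are assumptions made for the sake of the arithmetic, not measured values):

-- Hypothetical sizing: budget ~2 GB of RAM for this DB and derive MaxSize from
-- an assumed per-entry cost (key string + 6 windows of one HLL and one int each).
local assumed_bytes_per_entry = 64 + 6 * (256 + 8)
local ram_budget_bytes = 2 * 1024 * 1024 * 1024
sdb:twSetMaxSize(math.floor(ram_budget_bytes / assumed_bytes_per_entry))

The drawback is the one stated above: the DB will eventually sit at that maximum permanently rather than shrinking when traffic drops.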

Additional context

neilcook commented 2 years ago

This behaviour is by design. Keys don't get removed so that they can efficiently be reused if needed. I don't think it's a good idea to remove keys unless there's a good reason to, because dealloc/alloc takes up a lot of CPU.

As you state, the maximum size in terms of number of entries can be set - this is the way to control the ultimate size of the DB. The only thing I can think of that might be interesting to implement is something to control the maximum size of the DB in terms of RAM. That would give you what you want, I think: a way to cap the total RAM usage of the DB.
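
Something along these lines is what I have in mind (purely hypothetical - twSetMaxRAM does not exist today, the name is only for illustration):

-- Hypothetical, not implemented: cap the DB by resident memory instead of entry
-- count, evicting least-recently-used keys once the limit is exceeded.
local sdb = getStringStatsDB("OneHourDB")
sdb:twSetMaxRAM(2 * 1024 * 1024 * 1024)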