Snapchat / KeyDB

A Multithreaded Fork of Redis
https://keydb.dev
BSD 3-Clause "New" or "Revised" License
11.51k stars 578 forks source link

[BUG] Flash Expiry Incorrect Logic #749

Open keithchew opened 11 months ago

keithchew commented 11 months ago

Describe the bug

With a new DB:

set test 222 EX 5

Logging the keys from expire.cpp, ie:

keys = db->getStorageCache()->getExpirationCandidates(ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP);
  for (std::string key : keys) {
      serverLog(LL_WARNING, "getExpirationCandidates [%s]", key.c_str()); <---- ADD THIS

The output will be:

43280:43377:M 27 Nov 2023 09:47:02.551 # getExpirationCandidates [test]
43280:43377:M 27 Nov 2023 09:47:02.651 # getExpirationCandidates [test]
43280:43377:M 27 Nov 2023 09:47:02.752 # getExpirationCandidates [test]

After 5s, the logs will stop, as the key has been removed from the expiry storage. All good.

Now, run this:

set test 222 EX 5
set test 222

After 5s, the logs keep outputting, ie the expiry entry is stuck.

...
43383:43420:M 27 Nov 2023 09:56:04.344 # getExpirationCandidates [test]
43383:43420:M 27 Nov 2023 09:56:04.445 # getExpirationCandidates [test]
43383:43420:M 27 Nov 2023 09:56:04.445 # getExpirationCandidates [test]
43383:43420:M 27 Nov 2023 09:56:04.545 # getExpirationCandidates [test]
43383:43420:M 27 Nov 2023 09:56:04.545 # getExpirationCandidates [test]
43383:43420:M 27 Nov 2023 09:56:04.646 # getExpirationCandidates [test]
43383:43420:M 27 Nov 2023 09:56:04.646 # getExpirationCandidates [test]
...

No matter what we do after this, eg

set test 222 EX 5
del test

The expiry entry is still stuck. It gets worse, looking at the logic in expire.cpp:

   keys = db->getStorageCache()->getExpirationCandidates(ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP);

In each loop, it will looking 64 keys to expire. If we repeat the process above for more than 64 keys, we will prevent the loop from removing new expiry entries!

keithchew commented 11 months ago

I can confirm that there are (at least) 2 scenarios not handled by the current expiration logic:

In both cases, the logic needs to remove the key from the expiry storage.