caching-tools / next-shared-cache

Next.js self-hosting simplified
https://caching-tools.github.io/next-shared-cache/
MIT License
335 stars 24 forks source link

Some keys in sharedTags are never get cleaned up if redis is evicting keys (occured in redis-strings implementation) #852

Open tilman opened 2 weeks ago

tilman commented 2 weeks ago

Brief Description of the Bug Seems like sharedTag map contains several tags which are non existent anymore. In my case this resultet in a large sharedTag map of 150mb which got scanned on every revalidateTag slowing down redis a lot.

Here some numbers: Keys in db: 15031 Keys in sharedTags file: 184875 Keys in db but not in sharedTags: 270 Keys in sharedTags but not in db: 170114 --> only 10% in shared tags file is still relevant

Severity [Critical/Major/Minor/Cosmetic] Major

Frequency of Occurrence [Always/Sometimes/Rarely] Always

Steps to Reproduce Provide detailed steps to reproduce the behavior, including any specific conditions or configurations where the bug occurs:

1.) Start a redis with very low memory and lfu-volatile key eviction. 2.) produce a lot of cache entries 3.) check sharedTags

Expected vs. Actual Behavior sharedTags got so huge that it blocks all resources of redis during revalidateTag

Environment:

Dependencies and Versions List any relevant dependencies and their versions.

Attempted Solutions or Workarounds Describe any attempted solutions or workarounds and their outcomes.

Impact of the Bug Describe how this bug is affecting your work.

Additional context Provide any other information or context about the problem here.

tilman commented 2 weeks ago

I can create a PR for fixing it. What solution would you prefer?

tilman commented 2 weeks ago

I have made another test with a large DB so keys do not get evicted and the problem exists there as well. Not in such a huge dimension but still some keys in shared tags file are not valid anymore, which will accumulate over time. I think as soon as a key reaches it's TTL value it will get deleted from redis but not cleaned up from shared Tags.

tilman commented 2 weeks ago

I have now analyzed both solutions. Each will come with it's own drawbacks:

  • fetching keys in revalidateTag function and extending tagsToDelete with the outdated keys

Obviously this will make the already slow revalidate tag even slower by adding a keys scan operation. Even if this scan is way faster as the sharedTags scan itself it will still additionally slow it down

  • or starting a interval to do a cleanup routine every few hours

This will be hard to implement in serverless environments. I'm not sure what the focus of this package is. I'm using it with a Node-Server. Is there any amount of users using this with serverless environments as well? For server environments we can easily use setInterval for starting a cleanup routine. For serverless environments we would need to use something like unstable_after combined with a propability function to start a cleanup for example on every 500th request.

georgesouzafarias commented 1 week ago

Hey @tilman

We’re experiencing the exact same issue, significantly affecting our application’s performance by increasing latency and impacting the user experience. As a temporary solution, we’ve implemented a cronjob to clear the Redis database every hour. Initially, we assumed the issue was due to the Redis database size, so we set a default TTL to expire objects quickly, but this had no effect. After extensive troubleshooting, we found that uncleaned sharedtags were the root cause.

tilman commented 1 week ago

Jep, the missing cleanup of expired/evicted keys will grow sharedTags drastically. In our case it was 150Mb after 1 day. After we applied a fix to the redis-strings handler it is only 15Mb after 7 days. Here is the patch we are using right now: https://gist.github.com/tilman/d13271d0e0b8772dcd7d467846a17044

But in general we are working on a better implementation of redis-strings handler right now. Even with the patch it still has no optimal performance because for every revalidateTag call the full scan for sharedTags (+ newly for the fix also the keys) is performed.

The new handler will have the following features:

Let me know if you are interested in it, then I can try to bring this open source :)