DiceDB / dice

DiceDB is hyper-optimized for building and scaling truly real-time applications on modern hardware. It is a drop-in replacement for Redis with support for SQL-based reactivity.
https://dicedb.io/
Other
3.33k stars 420 forks source link

Add support for LFU eviction using Approximate Counting #10

Open arpitbbhayani opened 1 year ago

souravendra commented 1 year ago

@arpitbbhayani can you please explain this a bit more? I'd like to pick this up as it sounds very interesting and I could possibly learn a lot from it.

Rits1272 commented 4 weeks ago

@arpitbbhayani would like to take this up.

Rits1272 commented 4 weeks ago

@JyotinderSingh can I pick this up?

JyotinderSingh commented 4 weeks ago

@JyotinderSingh can I pick this up?

This is a relatively advanced feature, please go through the discussion which happened on the associated PR about the memory requirements.

Once done, let me know on this issue if you would like to try this.

Rits1272 commented 4 weeks ago

@JyotinderSingh this seems interesting. I have gone through the discussion that happened in the previous PR. The gist is to implement LFU with highly optimal space complexity.

Also, since Dice is multi-threaded, we would want concurrency support

I am going through how other popular in-memory data stores like Redis and Caffeine (tinyLFU) have optimized their implementation on this and will come up with a draft PR for Dice after some research

JyotinderSingh commented 4 weeks ago

@JyotinderSingh this seems interesting. Have gone through the discussion that happened in the previous PR.

The gist is to implement LFU with highly optimal space complexity.

I am going through how other popular in-memory data stores like Redis and Caffeine (tinyLFU) have optimized their implementation on this and will come up with a draft PR for Dice after some research

Awesome, assigning to you.

Rits1272 commented 3 weeks ago

@JyotinderSingh, here's approach I am considering:

LFU Cache Mechanism: We'll implement an LFU cache similar to Redis, leveraging the existing LastAccessedAt field in our codebase here. This field will be augmented with a counter that tracks the frequency of key accesses.

Logarithmic Counter: The counter would be based on log counter for approximate counting measure, inspired by Redis's implementation. This will help manage the counter size efficiently.

Counter Allocation: The counter will occupy the first 8 bits of the lastAccessedAt field, while the remaining 24 bits will continue to store the UNIX timestamp, as currently done here. This repurposes the first 8 bits, which are currently unused.

Decay Mechanism: We can also introduce decay mechanism to naturally age keys over time to improve efficiency

This approach does not increase space complexity and avoids the need for additional data structures. Let me know your thoughts on this

If everything good, will start working on its implementation

Rits1272 commented 3 weeks ago

Gentle nudge ^ @JyotinderSingh

JyotinderSingh commented 3 weeks ago

@JyotinderSingh, here's approach I am considering:

LFU Cache Mechanism: We'll implement an LFU cache similar to Redis, leveraging the existing LastAccessedAt field in our codebase here. This field will be augmented with a counter that tracks the frequency of key accesses.

Logarithmic Counter: The counter would be based on log counter for approximate counting measure, inspired by Redis's implementation. This will help manage the counter size efficiently.

Counter Allocation: The counter will occupy the first 8 bits of the lastAccessedAt field, while the remaining 24 bits will continue to store the UNIX timestamp, as currently done here. This repurposes the first 8 bits, which are currently unused.

Decay Mechanism: We can also introduce decay mechanism to naturally age keys over time to improve efficiency

This approach does not increase space complexity and avoids the need for additional data structures. Let me know your thoughts on this

If everything good, will start working on its implementation

Sounds great! Work on decay mechanism can be done in a separate PR. Let's first focus on getting a stable and solid LFU implementation going. Your plan sounds good, please proceed with it. Feel free to raise a draft PR while you work on it.