hapijs / catbox

Multi-strategy object caching service
Other
494 stars 72 forks source link

added dropSegment to client and policy #187

Closed jgallen23 closed 5 years ago

jgallen23 commented 7 years ago

Added support for dropSegment(). If this looks good, I'll add to docs too

jgallen23 commented 7 years ago

I updated the readme and also added dropSegment support to the policy as well

jgallen23 commented 7 years ago

Anything else need to be changed here? I'd also like to add a PR for client/policy.keys() to return all the keys in the cache, but waiting until this is merged.

kanongil commented 7 years ago

Thanks for taking the time to do this PR.

Unfortunately, I'm not inclined to merge this, as it is, since it doesn't actually solve the price example in #142. It is all to easy to have another instance cache a page with the old value, immediately after dumping everything.

jgallen23 commented 7 years ago

Therefore, I need to delete all search pages from cache, but there are other classes of pages I can leave alone.

Wouldn't dropSegment('search') achieve this? (or if it's on a method, server.methods.search.cache.dropSegment())

It is all to easy to have another instance cache a page with the old value, immediately after dumping everything.

I'm not sure I follow this. If the price was updated and cache is cleared, it should call generate again and the new price should be there. Is that right?

To be honest, I did the PR for my own use case and happened to see #142 when I was submitting, but happy to adjust to make all use cases work.

kanongil commented 7 years ago

The flow that fails is something like this:

  1. Client requests an uncached page with price.
  2. Server fetches the price and starts generating the response
  3. Price is changed and dropSegment is called to invalidate cache.
  4. Segment is dropped.
  5. Server finishes generating the response to the client, and saves it to the cache.
  6. The cache now contains an element with the old value.
benedikt-roth commented 7 years ago

@kanongil Isn't this the integration part already? The case of #142 might simply require the implementation of a more complex caching strategy to ensure cache consistency.

To be honest, I am still not convinced by dropping keys by name segmentation. Especially if people want to solve problems like in #142. Since this implies the usage of 1st searching for keys and 2nd iterating over them to drop them. The first part which is explicitly warned of in e.g. Redis documentation because it is prone to causing performance issues.

Warning: consider KEYS as a command that should only be used in production environments with extreme care. It may ruin performance when it is executed against large databases.

The abstraction of catbox will invite inefficient implementation that will bring performance issues that might be hard to identify.

Disclaimer: Redis is just an example. I do not have enogh knowledge about all the other storage technologies.

jgallen23 commented 7 years ago

So another approach would be having some sort of manifest that keeps track of the current version (or maybe a date) of each segment and then you increment that when you want to flush the segment. On the get call, we would need to check the manifest and if the versions don't match (if date, you just check if stored before segment stale date), you grab new data and if it does, you return from cache?

My concern with this approach is that it doubles the database calls on every get request

jgallen23 commented 7 years ago

Another approach would be to add some sort of hooking mechanism to the policy that could allow the app to decide if it should read from cache or not. That way catbox doesn't make any assumptions and if the app wants to implement some sort of versioning or just skip cache for anything stored after a timestamp. Could look something like:

shouldReadFromCache(key, cacheValue, storedDate, callback) {
  callback(null, true);
}

It wouldn't need any changes to clients and I think it would give us a few more interesting ways to invalidate cache.

kanongil commented 7 years ago

@jgallen23 Interesting approach. This effectively enables apps to implement whatever invalidation strategy best fits their requirements.

I would still like to explore more complete solutions as well, in case there is one that covers most / all use-cases.

In general, this issue can be split into two problems that needs to be solved: Invalidation, and purging. This PR handles the purging, but needs an invalidation mechanism as well to form a complete solution.

kanongil commented 7 years ago

Obviously, for catbox to handle invalidation, it will need to store a token in the key segment, key id, or raw payload (envelope). This will then need to be validated upon fetching.

For the token, there have been put forth two solutions:

  1. An integer that represents a "generation", which can be incremented by the app, causing entries with old values to be considered invalidated.
  2. A timestamp, where some sort of invalidate(since) call causes old entries to be invalidated. Unfortunately, we won't be able to use the existing stored property for this, since it signifies when the value is stored, and not when we started to generate it.

As to where to best store the value, the envelope seems to be the obvious choice, but it doesn't seem to be ideal for the purging step. Engines will either have to index this value, or iterate all elements to purge old values.

jgallen23 commented 7 years ago

@kanongil I can put together a PR for the invalidation hook if you'd like

kanongil commented 7 years ago

@jgallen23 I'm looking into it. I have an idea for a complete solution, with appropriate guarantees, and without extra lookup latency, which will work with all existing engines.

jgallen23 commented 7 years ago

@kanongil any updates on this?

andredlam commented 6 years ago

any updates? Thanks.

Y-LyN-10 commented 6 years ago

Looking forward for this functionality.

marcuspoehls commented 6 years ago

@kanongil Hey Gil, can you please share an update on this? I have a related open pull request over at catbox-redis: https://github.com/hapijs/catbox-redis/pull/69

If this PR here won't get merged, I would close the one on catbox-redis as well.

hueniverse commented 5 years ago

@kanongil it's been over a year - please close/merge/etc.

SimonSchick commented 5 years ago

@hueniverse is there any interest in the feature? I'm currently in a position where I'd like to be able to do this.

hueniverse commented 5 years ago

At this point I'm going to close this and restart the conversation.

lock[bot] commented 4 years ago

This thread has been automatically locked due to inactivity. Please open a new issue for related bugs or questions following the new issue template instructions.