Question: Are Memory Cache Updates Shared Between Multiple Nodes?

bbrandt commented 3 years ago

Thanks for sharing your amazing library!

Pretend Scenario:

I have FusionCache configured with a secondary cache. I am running multiple instances/nodes of my ASP.NET Core web application. I have a value I am caching for 10 minutes.

In Node A this value is updated from X to Z and I call FusionCache's SetAsync() to update the value in the primary and secondary cache. In Node B the old value, X, was read and cached 1 minute before it was updated in Node A.

Question:

If GetOrSetAsync() is called 2 minutes later in Node B, will it still return the old value, X, from its primary cache? Or is there some magical mechanism to push primary cache invalidations across all nodes?

"Don't do that" is a perfectly valid response. Open to any insights on how to handle this corner case or how to avoid getting into this case. Thanks!

jodydonetti commented 3 years ago

> Thanks for sharing your amazing library!

Thank you for taking the time to try it!

To answer your question: yes, right now Node B would still return the old value.
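To make the staleness concrete, here is a minimal toy model of the scenario. It does not use FusionCache at all: the `Node` class and its two dictionaries are hypothetical stand-ins for a per-node memory cache in front of a shared distributed cache, and expiration is left out because both entries are still within their 10-minute duration.

```csharp
using System;
using System.Collections.Generic;

// Toy model only (not FusionCache's implementation): each node has a private
// memory cache in front of a store shared by all nodes.
class Node
{
    private readonly Dictionary<string, string> _memory = new();
    private readonly Dictionary<string, string> _shared;

    public Node(Dictionary<string, string> shared) => _shared = shared;

    // A write updates the shared store, but only THIS node's memory cache.
    public void Set(string key, string value)
    {
        _memory[key] = value;
        _shared[key] = value;
    }

    // A read is served from local memory first and only falls back to the
    // shared store when the key is not cached locally.
    public string Get(string key)
    {
        if (!_memory.TryGetValue(key, out var value))
        {
            value = _shared[key];
            _memory[key] = value;
        }
        return value;
    }
}

static class Demo
{
    static void Main()
    {
        var shared = new Dictionary<string, string> { ["key"] = "X" };
        var nodeA = new Node(shared);
        var nodeB = new Node(shared);

        nodeB.Get("key");      // Node B reads X and caches it locally
        nodeA.Set("key", "Z"); // Node A updates its memory and the shared store

        Console.WriteLine(nodeA.Get("key")); // prints "Z"
        Console.WriteLine(nodeB.Get("key")); // prints "X": stale, served from Node B's memory
    }
}
```

Node B keeps answering with X until its local entry expires, because nothing tells it that the shared value changed.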

In fact, right now I'm working on exactly this 😄: a way to notify other nodes of changes, evictions, etc. that happened elsewhere.

There are different ways of doing it: send the new value to every node, send only a change notification, have the other nodes react with an immediate refresh or wait for the next usage request, and so on. All of them have pros and cons. I'm evaluating exactly what to do and trying to carefully design the related API surface so that it's a pleasure to use with minimal effort.

There's one thing, though, that would probably make this requirement (and also the currently missing sliding expiration support) less necessary a good chunk of the time, and I'd like to share it with you.

You see, it's pretty common to give a cache entry a longer duration when the factory (the loading logic) is potentially heavier or slower. That way the factory runs less often, which in turn makes it less likely to hit an error and break the "refresh loop" (save -> expire -> refresh -> save -> etc...).

What I think FusionCache can bring to the table is a different way of thinking about it, and, if you will, a more "liberating" one: if you just set a lower duration (even 10-30 sec instead of 10 min) and enable fail-safe + soft timeouts, you will basically have a constantly updated cache without ever being slowed down by a factory that is refreshing an expired value and suddenly hangs or is super slow.
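As a rough sketch of that setup, the call below uses a 30-second duration with fail-safe and a soft timeout enabled. It assumes the `ZiggyCreatures.FusionCache` package is referenced; `LoadGreetingFromDb` and all the durations are hypothetical, and the option method signatures may differ between library versions.

```csharp
using System;
using ZiggyCreatures.Caching.Fusion;

var cache = new FusionCache(new FusionCacheOptions());

// Hypothetical stand-in for a slow or flaky database call.
static string LoadGreetingFromDb() => "hello";

var value = cache.GetOrSet<string>(
    "greeting",
    _ => LoadGreetingFromDb(),
    options => options
        // Short duration: each node refreshes often, so stale data is short-lived.
        .SetDuration(TimeSpan.FromSeconds(30))
        // Fail-safe: if a refresh fails, an expired value can be reused (here for up to 1 hour).
        .SetFailSafe(true, TimeSpan.FromHours(1))
        // Soft timeout: if a refresh is slow and a stale value exists, return the stale value.
        .SetFactoryTimeouts(TimeSpan.FromMilliseconds(100))
);

Console.WriteLine(value);
```

With a distributed cache configured, a refresh on any node would first look there, so a value written by another node gets picked up at the next local expiration.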

This is one of those situations where, in theory, the notification feature is absolutely correct and important (and, as I said, I agree and I'm working on it), but in practice there's a different, very simple approach that, most of the time and for a lot of people and scenarios, is simpler and gets the job done.

jodydonetti commented 3 years ago

Another feature I'm thinking about, which would be useful in your situation, is to automatically shorten the memory cache duration if there's also a distributed one.

The rationale is that we would keep calling SetAsync() with a duration of 10 min, and the data would be saved in the distributed cache with a logical duration of those 10 min. In the memory cache, however, the duration would be much lower (configurable, maybe 10 sec or 10% of the specified duration, or something like that), so the entry would expire more frequently in each node's local cache while still being kept for 10 min in the distributed one.

Of course, if a node sets a new value, it goes immediately to the distributed cache, ready to be read after the (now much shorter) expiration in each node's local cache.
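A minimal sketch of how such a shortened duration could be derived. This is purely illustrative: `DurationHelper`, `MemoryDurationFor`, the 10% default factor, and the 1-second floor are all hypothetical and not part of FusionCache.

```csharp
using System;

static class DurationHelper
{
    // Hypothetical helper: derives the per-node memory cache duration as a
    // fraction of the logical (distributed) duration.
    public static TimeSpan MemoryDurationFor(TimeSpan logicalDuration, double factor = 0.1)
    {
        if (factor <= 0 || factor > 1)
            throw new ArgumentOutOfRangeException(nameof(factor));

        var scaled = TimeSpan.FromTicks((long)Math.Round(logicalDuration.Ticks * factor));

        // Assumed floor of 1 second, so short entries are not evicted almost immediately.
        var floor = TimeSpan.FromSeconds(1);
        if (scaled < floor) scaled = floor;

        // The local copy should never outlive the logical duration.
        if (scaled > logicalDuration) scaled = logicalDuration;

        return scaled;
    }
}
```

For the 10-minute entry in the scenario, `DurationHelper.MemoryDurationFor(TimeSpan.FromMinutes(10))` gives 1 minute, so a node would serve a stale value for at most about a minute after another node's update.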

jodydonetti commented 3 years ago

So to recap:

- Right now you can try shortening the duration and enabling fail-safe + soft timeouts: it's not a perfect solution, but in my experience a low duration gets the job done most of the time.
- I may add a feature to automatically shorten memory cache durations when there's a distributed cache, but I haven't decided yet: I don't want to add extra stuff if it's not needed and, if I do add it, I'd like to design it the right way.
- I'm working on the notifications feature but, again, getting the design and perf tuning right takes time, and I don't have an ETA right now.

Hope this helps.

bbrandt commented 3 years ago

Thanks for the detailed response and sharing information about future plans!

There are no "easy" answers when it comes to caching, but fail-safe and soft timeouts seem like really great features to have. Subscriptions sound like a very challenging problem with many time-consuming edge cases, like building your own distributed database.

Letting the user configure a factor to shorten the memory cache duration when there's a distributed cache makes sense, but I can't think of a way to make it ultra-clear that .SetDuration() would then be setting the distributed cache duration instead of the memory cache duration. Maybe if there were a separate .SetDistributedDuration(...) that the user would call in this case, it would be clear that .SetDistributedDuration(...) sets the distributed cache's duration and that the memory cache duration is set automatically. Then I guess if someone wanted ultimate power, they could call both:

product = cache.GetOrSet<Product>(
    "product:123",
    _ => GetProductFromDb(123),
    options => options
        .SetDuration(TimeSpan.FromSeconds(10))
        .SetDistributedDuration(TimeSpan.FromMinutes(10))
);

But, like you mentioned, it may be unnecessary clutter in the API, which also means extra complexity and more places for bugs to hide.

Thanks for the insights!

jodydonetti commented 3 years ago

Hi @bbrandt, just wanted to give you a quick update about this: see #11

ZiggyCreatures / FusionCache

Question: Are Memory Cache Updates Shared Between Multiple Nodes? #6
