jodydonetti commented 2 years ago

The Idea

The next version of FusionCache will have an important new component: a backplane.

In the design phase and while discussing it with the community (@jasenf and @sanllanta in particular) a question arose: would it be possible to use just a memory cache + a backplane, without having a distributed cache?

We'll use this issue to discuss the problems with this approach, tentative ideas around it and potential solutions.

So the question is: is this possible?

Short Answer

~~No.~~ Well, maybe. With some limitations, but maybe. See this thread.

Longer Answer

This idea in fact seems like a nice one!

In a multi-node scenario we would like to use only the memory cache on each node + the backplane for cache synchronization, without having to use a shared distributed cache.

Technically you can in fact setup a FusionCache instance without a distributed cache but with a backplane.

~~But don't do it ⛔.~~

But Why?

You see, the problem with this approach is that it will continually evict cache entries, all the time, on all nodes basically overloading your datasource (eg: the database).

This is because every time a cache is set or removed, it will automatically send eviction notifications on the backplane, which in turn will evict local caches on the other nodes, which in turn - the next time someone asks for that same cache entry on those other nodes - will set the cache, sending notifications and so on, going on like that forever.

Example

To better illustrate this scenario imagine a multi-node setup with 3 nodes (N1, N2, N3), each with a memory cache, initially empty:

somebody on N1 calls GetOrSet for "product/123"
nothing is there, so it calls the factory to grab the product from the database
it then saves the product in the memory cache (on N1) for 5 min, and notify everybody about the change
N2 and N3 receive the notification, and evict their local cache for "product/123"
then somebody on N2 calls GetOrSet for "product/123"
nothing is there, so it calls the factory to grab the product from the database
it then saves the product in the memory cache (on N2) for 5 min, and notify everybody about the change
N1 and N3 receive the notification, and evict their local cache for "product/123"
and so on...

As you can see this basically means that every time somebody directly SET a cache entry (eg: when calling the Set method) or call a GetOrSet (logically a GET + a SET) the entry will be evicted, rendering the entire thing useless.

A different approach

One idea we may think about is to send notifications only after a Remove call or a Set call, and not when calling GetOrSet: the problem now is that, apart from being not logical (a GetOrSet is a GET + SET and the SET part is logically the same as the one in a Set method call), it would also end up NOT keeping all the caches synchronized.

Example

Why? Let us follow this scenario:

somebody on N1 calls GetOrSet for "product/123"
nothing is there, so it calls the factory to grab the product from the database, as it is right now
it then saves the product in the memory cache (on N1) for 5 min, without notifying everybody about the change (since it is not a Set method call)
the product is updated, outside of the app(s) that are using FusionCache (eg: a background service, another app in another programming language, etc...)
then somebody on N2 calls GetOrSet for "product/123"
nothing is there, so it calls the factory to grab the product from the database, as it is right now
it then saves the product in the memory cache (on N2) for 5 min, without notifying everybody about the change (since it is not a Set method call)
now, for around 5 min, N1 and N3 will see different versions of "product/123"

As you can see there's no way to escape this, at least that I'm aware of.

A different approach (reprise)

Finally, in theory we may say that if we establish that ALL changes to the data are done via a piece of code that uses FusionCache, and ALL of those changes to the database are ALWAYS followed by a direct Set call and we ONLY consider the direct Set calls (+ the Remove ones) to send notifications then yeah, maybe it should work.

Well... yes, maybe, in theory that would be the case, but it would also be a very a brittle system, IMHO.

So What?

Anyway, I'm absolutely open to new ideas or point of views.

If you have a brilliant proposition that works and is not brittle please let me know so we may be able to work something out!

tcsaddul commented 2 years ago

Based on my experience on using a real-time backplane, there is no need to send notifications in GetOrAdd. I only send notifications for Update and Delete methods.

jodydonetti commented 2 years ago

Hi @tcsaddul , thanks for chipping in!

I would like to understand more about your experience: case in point, the example I made above in the "A different approach" part, how would that work out in your case?

ps: I've in fact not mentioned Remove operations, but of course those are included in the notifications. I've now updated the issue to explicitly include those.

jasenf commented 2 years ago

Hi --

We already use Redis a simple message broker for cache invalidation across nodes, it works wonderfully. The cache just has toi support some basic operations that are on the implmentor to use properly.

@tcsaddul Only messaging on Update or Delete at the cache level won't work, because a data update may occur but the contents may not be in the local cache, so an Add would take place rather than an Update. However, in another node, that data may already be there. If you don't notify on an Add the data on other nodes could be stale.

@jodydonetti I believe what you are saying resembles what I would consider the proper way to implement, but it's confusing because we tend to think of Cache transactions separately from data operations. Mostly in the way cache implements merged GetSet operations.

We use a similar cache library that has full support for Redis pub/sub as just a backplane/messaging broker to invalidate other nodes. It only sends notifications to other nodes on local cache Updates or Deletes. But that means it suffered from the scenario I mentioned above to @tcsaddul. We figure this was still a 95% solution if we could solve that 1 issue. Ultimately it comes down to implementors knowing when a data Update or Delete has occurred (in the database). We just added support to the cache library for telling it when content was being Updated. Internally this_cache.Update(key) became just a command to invoke invalidation across all the nodes.

Introducing this into our code base was super easy, as I would expect it to be for most, as most CRUD transactions are isolated in some repository layer. For us in fact, we integrated this concept into the wonderful EFCoreSecondLevelCacheInterceptor project which was awesome because the cache became integrated deeper, right in to DbContext. Since EfCore knows when Add/Update/Deletes are occurring already, tying this all together as interceptors was mostly painless.

For us the biggest issue was the cache partitioning (as I mentioned in the other post that you closed ot). But cache invalidation across nodes without a secondary storage (i.e. using Redis as a messenger not a data source) was a godsend. We were simply caching too much data and the content was too big. Pushing all the data up to different cache tiers was a performance nightmare. But pushing around little invalidation messages has been working elegantly.

jodydonetti commented 2 years ago

@tcsaddul Only messaging on Update or Delete at the cache level won't work, because a data update may occur but the contents may not be in the local cache, so an Add would take place rather than an Update. However, in another node, that data may already be there. If you don't notify on an Add the data on other nodes could be stale.

Just to clarify: in FusionCache there are no Add vs Update methods, there's only a Set method.

This design decision has been taken because, in my experience at least, what you want to do is to "set" something in the cache, not caring if it was there or not. Also, being FusionCache a hybrid cache (so either single or multi-level), to know if something "is in the cache" you would have to ask both local (memroy cache) and remote (distributed cache), and that is an expensive operation. Typically when having different Add / Update you end up having an AddOrUpdate method anyway, to basically do a Set, so I decided to just have a Set and be good with it.

In closing I think what @tcsaddul was referring to was what you would think about as Add + Update + Delete.

jasenf commented 2 years ago

We may be over complicating the problem space. It can be summarized by the typical design decision of cache: being agnostic of the type of transaction causing the cache addition or update.

As we are seeing a distributed cache needs to know when to invalidate updates. The only way to do this is explicitly from the implementator. Either add a optional TransactionType to each cache operation

.getOrSet(key, value, [add, update, delete])

Or just give the implementor the ability to invalidate their own keys when they do a data update.

The cache should then just publish an invalidation to other nodes on updates/deletes/invalidate

Why is caching still so difficult? Haha

jodydonetti commented 2 years ago

Ah! This is an interesting angle I haven't thought about.

It would be something like having the ability to specify the "mode" in which the backplane should work: let's say "active" vs "passive".

Right now the backplane I've implemented does 2 things, togheter:

it listens for local "change" events (set + remove, including a set during a GetOrSet call, and including adding stuff, which in FusionCache is still a "set" operation) and reacts to those by pushing notifications out for the other nodes
it listens for remote notifications comin in from other nodes and reacts to them by doing a local eviction

But if we define an "active" vs "passive" mode things may look like this:

ACTIVE MODE: listen to local change events and push notifications out + listen for remote notifications and evict locally
PASSIVE MODE: only listen for remote notifications and evict locally

To support sending of notifications in passive mode I would have to add a new core method to the cache, something like NotifyChange(string key) or something, so you can explicitly call it only when you want.

So in passive mode basically you would be "on your own" about notifying everybody about any change happened, but at least it would let you get the scenario you mention (memory + backplane without dist. cache) covered.

Would something like this make sense to you and cover your needs?

jodydonetti commented 2 years ago

ps: "active" VS "passive" is the first thing that popped up in my mind, but it may be modelled as 2 bool flags, like bool ListenForLocalEvents + bool ListenForRemoteEvents or something like that.

jasenf commented 2 years ago

Just my two cents:

If this were my library, I wouldn't change all that much. Simply knowing if there is a backplane enabled should be all the configuration that is needed. From then:

Invalidate all other nodes when an item is manually removed from the implementor. _cache.DeleteEntry()
Invalidate all other nodes when the cache is internally replacing an existing item (i.e. since the cache implicitly knows its an update)

Functionally this is all that's needed. Whenever the implementor needs to do a data UPDATE they could just call _cache.DeleteItem(). For the sake of consistency, I would probably create a facade method like _cache.UpdateItem() which replaced the local cache item and caused other node invalidation.

I could be totally wrong, but I think this covers everything. Unfortunately I don't see or think there's a way to this automactially since a distributed cache needs to know specifically about Delete and Update operations.

tcsaddul commented 2 years ago

I agree with @jasenf. This is the reason we notify only on Update and Delete. Here are the reasons:

Each node have their own local cache
With GetOrSet (GetOrAdd), if the key does not exist, it is assumed that it will retrieve the latest data from source when it is needed.
An Update operation must notify the other nodes of the updated data so that all nodes have the latest data. Note that we don't use Delete here because it will trigger reloading of data from the source for the other nodes which is expensive.
A Delete operation is normally used when the Update does not have all the data needed to be updated which is why we just simply Delete and notify the others to also Delete so that they will refresh their data when needed.

tcsaddul commented 2 years ago

I would like to emphasize in 3) that the Update operation will replicate the "Value" to the other nodes which means we don't notify the other nodes for them to Delete their key though this can also be done. The idea here is that the notification event contains the latest data (as a byte array - we use MessagePack here to serialize/deserialize) which means the latest data is replicated to the other nodes so that the other nodes need not refresh their data anymore from the source.

jasenf commented 2 years ago

At best data replication should be optional. In most cases it is not optimal or preferred. Serialization and deserialization of content is very heavy. Invalidation and forcing nodes to go back to data source is typical.

Imagine a cache implementation used for dotnet response caching. We are holding the entire page in the cache.

tcsaddul commented 2 years ago

When we know that the data is considered large (contains List or Dictionary), we issue the "Delete" operation or the "Delete" notification instead as stated in 4). I guess we better give the option to the programmer to override the default behavior. Replicating the data will minimize data refresh from source and also provide ready/faster access to the data. Moreover, sending the notification internally contains some data already (headers, Id, etc.) so adding less than 500 bytes should have very little impact.

Here is a sample from our code:

   public static async Task UpdateObject<V>(TableIndex objectIndex, uint id, V value)
    {
        ObjectCache.AddOrUpdate<V>(new LookupObject
        {
            Id = id,
            ObjectIndex = objectIndex,
        }, value);

        if (_withCacheClientHub)   // Send the Delete notification to the other nodes
            await _cacheClientHub.DeleteObject(objectIndex, id);
    }

jodydonetti commented 2 years ago

Thanks all for your inputs, I think I have to clarify a couple of things, since all of us have different backgrounds, use a different lingo and whatnot.

In FusionCache there are 2 actions to change the data in the cache: SET and REMOVE, that's all.

There's no Addvs Update, only SET, which conceptually is similar to a dict["key"] = value when using a dictionary. The data is already there? It's overwritten. Not there? It's added. This is important because a cache is by definition potentially transient/volatile: data may disappear due to various reasons based on different factors, different implementations, etc. Eg: for a distributed cache there are things like the Redis/Memcached server restarting, for a memory cache it can be a cold start or there can be a memory pressure logic and some entries have been evicted, and so on.

Regarding the GetOrSet method, I saw some @jasenf had some confusion about it, so I want to clarify. Pretend it's not there, for a moment. How would you get something from the cache and, if not there, put it in the cache for later reuse? I assume something like this (pseudo code):

VALUE = GET FROM CACHE
IF (VALUE NOT THERE) {
  VALUE = GET FROM DATABASE
  SET VALUE INTO CACHE
}
RETURN VALUE

That's it.

The problem with doing it manually is that it's not optimized for highly concurrent scenarios and can lead to problems like Cache Stampede. To avoid that there's the GetOrSet method that would coordinate the "GET FROM DATABASE + SET VALUE INTO CACHE" part atomically. That's all. If I'm missing something please tell me.

On my verbosity 🙄

Finally, we all use different terms for different things, like "set" / "update" / "change" with possibly different meanings. For example the phrase "on an update" may mean "the data is changed in the database" or "the data is mutated in the local cache" (which as I said above is not something we can actually know) or "the data is changed on another node" or "I changed the data in the cache" (a Set method call in FusionCache) and so on. The topic is already complicated, the scenarios are multiple and different and I don't really have a solution to this communication problem if not to just tell you I may not always get exactly what you mean and, in my case, I try to be as explicit as possible, sometimes probably ending up being too verbose or seemingly pedantic, but I do this to try to avoid this confusion. Best.

jodydonetti commented 2 years ago

@tcsaddul

I would like to emphasize in 3) that the Update operation will replicate the "Value" to the other nodes which means we don't notify the other nodes for them to Delete their key though this can also be done. The idea here is that the notification event contains the latest data (as a byte array - we use MessagePack here to serialize/deserialize) which means the latest data is replicated to the other nodes so that the other nodes need not refresh their data anymore from the source.

I'm missing something here: if you have only a memory cache, with live object instances, and receive a byte array from another node, how do you know which clr type to use for the de-serialization?

jodydonetti commented 2 years ago

@jasenf

At best data replication should be optional. In most cases it is not optimal or preferred.

Totally agree.

Serialization and deserialization of content is very heavy. Invalidation and forcing nodes to go back to data source is typical.

Yep, also because a cache entry may not be needed on all nodes, so it would be useless.

jodydonetti commented 2 years ago

So, a quick recap of which features to ideally support (I don't know if possible, at least in the v1):

automatic eviction notifications (only key) to other nodes, triggered on SET/REMOVE actions (the current design)
manual eviction notifications (only key) to other nodes
manual replication notifications (key + data) to other nodes

And what this would boil down to:

you want all automatically managed? you must have a distributed cache
you don't want a distributed cache? you must send notifications manually

Does this make sense? Does it cover your scenarios?

Thanks!

jasenf commented 2 years ago

Hi,

The problem is that we are conflating what we want the cache to do from the point of view of someone who's implementing it, and also discussing the internals at the same time, so vocabulary gets munged.

@jodydonetti questions: On your first bullet pint, when you say "SET" do you mean only when an internal replacement of cached data happens? if yes, then we are on the same page. If you mean eviction to happen whenever content is simply added to the cache, then I think we are not quite on the same page.

I would personally stay away from point #3, I've never seen a cache implementation try and attempt that. It's going to be a nightmare. Better to just have a secondary cache (Redis) than pushing that content around or trying to orchestrate those pushes.

Also, not quite sure I understand your last two bullets.

jodydonetti commented 2 years ago

The problem is that we are conflating what we want the cache to do from the point of view of someone who's implementing it, and also discussing the internals at the same time, so vocabulary gets munged.

Agree. At the same time though, to explain why a feature is complicated or not possible, I need to explain how it works internally. Not always easy, I agree 😅

@jodydonetti questions: On your first bullet pint, when you say "SET" do you mean only when an internal replacement of cached data happens? if yes, then we are on the same page. If you mean eviction to happen whenever content is simply added to the cache, then I think we are not quite on the same page.

As said previously, there's no Add vs Update, so a SET covers both and is intended as what happens when you "tell the cache you want the value Y for the cache key X", independently of the fact it was already there in the cache or not, or that it was there but with a different value.

I would personally stay away from point #3, I've never seen a cache implementation try and attempt that. It's going to be a nightmare. Better to just have a secondary cache (Redis) than pushing that content around or trying to orchestrate those pushes.

Yes, me too, totally. But I was collecting the desired requirements and @tcsaddul seemed to have this need so I've listed it. But yeah, I would avoid it probably.

Also, not quite sure I understand your last two bullets.

I tried to summarize the reasoning. Expanding it more:

you want all automatically managed? you must have a distributed cache: this means that you just put stuff in the cache (via Get/GetOrSet) or remove stuff from the cache (via Remove), and the synchronization with the other nodes happens automatically, you don't have to explicitely say anything. But for this to work, a distributed cache is needed
you don't want a distributed cache? you must send notifications manually: this means that if you don't want to have a distributed cache, automatic synchronization cannot work well (for the various reasons highlighted above), so you must manually send eviction notifications, because without an external shared state (the distributed cache) only you will know when something is effectively changed and the other nodes need to eventually load new data from the database

Now though I have a question for you @jasenf , regarding this:

@jodydonetti questions: On your first bullet pint, when you say "SET" do you mean only when an internal replacement of cached data happens? if yes, then we are on the same page. If you mean eviction to happen whenever content is simply added to the cache, then I think we are not quite on the same page.

My question is: when putting something in the cache for the key "foo", why it is so important for you to know if something was already there or not (so the difference between replacing and adding something to the cache, like you said)?

I don't understand this point, and I have a hunch that you would use that information on one node to kind of deduce the state of the cache on the other nodes (which cannot possibly work).

jasenf commented 2 years ago

As said previously, there's no Add vs Update, so a SET covers both and is intended as what happens when you "tell the cache you want the value Y for the cache key X", independently of the fact it was already there in the cache or not, or that it was there but with a different value.

Yes, I understand, But from the implementors perspective we can't have a global multi-node cache invalidation simply because a new value is placed into the cache. I was just saying that, behind the scenes, the implementation needs to not push a distributed invalidation on add, only on replacement of an item you are already caching (all transparent to the implementor because we never really know if it's an add or update). If you are saying that the cache implementation can't discern this, then it probably should never do the invalidation broadcast ever unless the implementor requests it.

I'm sorry for not being clear, I the implementor doesn't care about whether it's an add or an update. But I do care that I am not causing a cascading invalidation. All I am saying is that, behind the scenes, within the cache provider, if you are replacing an entry, then you implicitly know that the data in the cache is stale, and you can cause the multi-node invalidation to kick off, without "my" knowledge of it. But if you are just adding an entry to the local cache, you shouldn't make the assumption that other local caches on other nodes need to be invalidated.

jodydonetti commented 2 years ago

Yes, I understand, But from the implementors perspective we can't have a global multi-node cache invalidation simply because a new value is placed into the cache.

Agree

I was just saying that, behind the scenes, the implementation needs to not push a distributed invalidation on add only on replacement of an item you are already caching

This is exactly why I asked you my previous question, this:

My question is: when putting something in the cache for the key "foo", why it is so important for you to know if something was already there or not (so the difference between replacing and adding something to the cache, like you said)? I don't understand this point, and I have a hunch that you would use that information on one node to kind of deduce the state of the cache on the other nodes (which cannot possibly work).

As I read it you are implying (and correct me if I'm wrong) that if there's an "add" on node N1, we don't need to tell the other nodes of it, but if there's a "replace" then we need to. But this is not the case, because you are deducing the state of the memory caches on the other nodes based on the local state on a single node, and that's not how it works when going distributed.

I hope that with a practical example I can explain myself better.

Say we have 3 nodes (N1, N2 and N3) each with a local memory cache, without a distributed cache and with a backplane. All the memory caches start empty:

N1 asks the cache for "foo" -> it's not there -> get from database the value 10 -> save it in the cache for 10 min -> NO notification
N2 asks the cache for "foo" -> it's not there -> get from database the value 20 (because it is since changed) -> save it in the cache for 10 min -> NO notification
now N1 and N2 are not aligned anymore, at least for 10 min

See what I am saying?

And this would not only happen at every start with an empty cache, but every first time that an entry is asked on each node. Also, it will happen every time an entry is asked on each node, even if it has been already asked but is since expired, because when an entry is expired is like it was never there.

So, basically, it will happen continually.

If you are saying that the cache implementation can't discern this then it probably should never do the invalidation broadcast ever unless the implementor requests it.

Exactly! And that is what I meant when I said:

you don't want a distributed cache? you must send notifications manually

Just to be extra clear: technically the cache, internally, MAY know if it is adding or replacing, BUT it does not matter. Why? Because looking at my example above, I do not think it means what you think it means (anche I can finally use this meme 😂)

![image](https://user-images.githubusercontent.com/1010086/145044803-90573508-58a1-42c6-853c-3686bb5069ac.png)

I'm sorry for not being clear

No worries! I haven't probably be clear myself, too. Again, it's a complicated issue in a distributed environment involving mutable data semi-shared with different nodes. Totally not easy to communicate intent. Also, me not being a native speaker probably doesn't help.

I the implementor doesn't care about whether it's an add or an update. But I do care that I am not causing a cascading invalidation.

Agree, and that is why I came to the conclusion that, in a scenario without a distributed cache, the only way for it to work well without cascading invalidation or not synchronized data, would be for the implementor to manually send notifications, without anything automatic.

What do you think? Did I miss something?

jasenf commented 2 years ago

I don't think cache invalidation across nodes should happen on a local addition. Assuming updates and deletes were all synchronized appropriately in the first place, there is no need to assume any data in other nodes is out of sync. Cascading invalidations every time an entry is added to the local cache would be very inefficient and is only required if we are assuming the data is stale in other nodes. But there's no reason to make that assumption.

If all inter-node invalidation happens from a manual call by the implementor however, I guess this is a moot point :-)

jodydonetti commented 2 years ago

Assuming updates and deletes were all synchronized appropriately in the first place there is no need to assume any data in other nodes is out of sync.

Of course I don't know about your personal experience, so your mileage may vary, but in distributed computing it almost never happens that everything is all synchronized appropriately and that is why you should have a defensive approach and always keep in mind the fallacies of distributed computing.

Cascading invalidations every time an entry is added to the local cache would be very inefficient

Agree, that is why I switched to the "manual notifications" option when without the distributed cache.

and is only required if we are assuming the data is stale in other nodes. But there's no reason to make that assumption.

That's how distributed systems work, everything is kinda unreliable, potentially stale, almost nothing is perfectly synchronized and you cannot assume almost anything. That is why there's a whole family of protocols and algorithms like split-brain oe leader election or gossip protocols. They are not related specifically to caching, but to distributed computing and shared distributed state in general.

If all inter-node invalidation happens from a manual call by the implementor however, I guess this is a moot point :-)

That is the only way I see it working honestly 🤷‍♂️

jodydonetti commented 2 years ago

Also, even assuming like you said that "updates and deletes were all synchronized appropriately in the first place", I really don't understand how my previous example could possible work.

I'm talking about this:

Say we have 3 nodes (N1, N2 and N3) each with a local memory cache, without a distributed cache and with a backplane. All the memory caches start empty:

N1 asks the cache for "foo" -> it's not there -> get from database the value 10 -> save it in the cache for 10 min -> NO notification

N2 asks the cache for "foo" -> it's not there -> get from database the value 20 (because it is since changed) -> save it in the cache for 10 min -> NO notification

now N1 and N2 are not aligned anymore, at least for 10 min

Can you please help me with this? I honestly don't see any possible way for this to work based on your assumptions.

jasenf commented 2 years ago

No problem, it's a healthy conversation. For some context, this scenario I am discussing is currently exactly how our current implementation works. We notify/invalidate nodes on updates and removal only.

Your second bullet item is where the scenario doesn't seem complete. The "because it's changed" portion. If it was changed, someone notified N1 and it was invalided there.

We can have a library that acts defensively and assumes nothing is working reliably (in this case, invalidate everything even on Get) or we can have one where there's an assumption of things working properly.

For our production system i would rather have a library that assumes things are working and be able to resolve problems when they are not.. rather than incurring the performance, cost, and overhead of the defensive model.

Imagine what happens every time I restart one of my servers, all the initial loads would just invalidate everyone across the board as all our data is warmed up.

jodydonetti commented 2 years ago

No problem, it's a healthy conversation.

I agree, and thanks for your time.

Your second bullet item is where the scenario doesn't seem complete. The "because it's changed" portion. If it was changed, someone notified N1 and it was invalided there.

Oh, I think I finally got the important part. So in this scenario you are basically excluding the possibility of any "untracked change", where with "untracked change" I mean a change made by something (an app, service, cron job, tool, etc) not directly connected to the same backplane for notifications. In this case yes, my previous example would work (because as you said at step 2 the node N1 would receive a notification).

But I would like to tell you why I (unconsciously) excluded that possibility from start, maybe we can reason about it.

If you only think about specific cache entries that can be "tracked" then yeah, it makes sense. But, at least in my case, a lot of times I have stuff in my cache that changes based on external, untrackable factors: for those cases it's not a problem if the data is stale (using a cache basically means accepting that) but at least the data should be the same on all nodes.

A practical example - let's say about products - can be a set of cache entries with:

the list of the last 10 products
or the list of products in the category 1
or the result of a user search on products by text "paper"

When a new product is added to the system or an existing one is updated or deleted, even if you send a notification for the cache key "product/123" (just an example) all the other related cache entries (which depend on products) will not be notified. So, the cache entries for "products/latest/10" or "products/bycategory/1" or "products/search/bytext/paper" will not receive a notification, and as soon as they will be accessed via different nodes they will start to get out of sync. And you cannot send notifications for those extra cache keys too, because you can't possibly know which searches have been done by which users, or which filters by category have been appliead, with which pagination, and so on. It would be too much.

How would you handle such a situation? Or maybe your context is different and you don't cache in this way.

jasenf commented 2 years ago

Take a look at the ingenious way https://github.com/VahidN/EFCoreSecondLevelCacheInterceptor handles this issue..

it's so elegant it's almost criminal.

He ties the cache directly in to EFCore, so he knows on every transaction what table is being updated.

So he keeps a secondary manifest entry for each table that contains all other cache entries for that table. Then whenever a change is made on a table that contains an element from a different key, he knows which other elements to invalidate.

I can't even explain it as elegantly as he built it. Anyway, just my two cents: those are problems for the cache implementor to deal with.

jodydonetti commented 2 years ago

Take a look at the ingenious way https://github.com/VahidN/EFCoreSecondLevelCacheInterceptor handles this issue.. it's so elegant it's almost criminal. [...] Anyway, just my two cents: those are problems for the cache implementor to deal with.

Yes, very elegant! And in this case since the EFCore interceptor uses the cache, it is in practice an implementor itself, based on how you see it, right?

So, now that I feel we cleared our misunderstandings, does my previous summary feels right to you? I mean this:

you want all automatically managed? you must have a distributed cache: this means that you just put stuff in the cache (via Get/GetOrSet) or remove stuff from the cache (via Remove), and the synchronization with the other nodes happens automatically, you don't have to explicitly say anything more. But for this to work, a distributed cache is needed

you don't want a distributed cache? you must send notifications manually: this means that if you don't want to have a distributed cache, automatic synchronization cannot work well so you must manually send eviction notifications, because without an external shared state (the distributed cache) only you will know when something is effectively changed and the other nodes need to eventually load new data from the database

If you agree with this, intended as we explained to each other, I think we are on the same page.

jodydonetti commented 2 years ago

Also, featurewise, this means having 2 extra features than what is my current design:

the ability to MANUALLY send notifications, via something like a NotifyEviction method (not the final name)
the ability to DISABLE automatic notifications on SET/REMOVE events, so that you can send them manually when you know that's the case

These 2 things shoudl cover your use case. Do you agree?

jasenf commented 2 years ago

Yes, very elegant! And in this case since the EFCore interceptor uses the cache, it is in practice an implementor itself, based on how you see it, right?

correct.

So, now that I feel we cleared our misunderstandings, does my previous summary feels right to you? I mean this:

you want all automatically managed? you must have a distributed cache: this means that you just put stuff in the cache (via Get/GetOrSet) or remove stuff from the cache (via Remove), and the synchronization with the other nodes happens automatically, you don't have to explicitly say anything more. But for this to work, a distributed cache is needed

you don't want a distributed cache? you must send notifications manually: this means that if you don't want to have a distributed cache, automatic synchronization cannot work well so you must manually send eviction notifications, because without an external shared state (the distributed cache) only you will know when something is effectively changed and the other nodes need to eventually load new data from the database

That would be sufficient. Ultimately it would be great to have a cache implmenention/interface that was a bit more aware of things so we wouldn't have to do all this work, but this gets the job done (I think -- haha)

If you agree with this, intended as we explained to each other, I think we are on the same page.

jodydonetti commented 2 years ago

That would be sufficient.

Awesome 🎉 I can now proceed with designing and implementing these features. Will update as soon as there's something ready.

Ultimately it would be great to have a cache implmenention/interface that was a bit more aware of things so we wouldn't have to do all this work, but this gets the job done (I think -- haha)

I'm interested: if you have something in mind, even just a rough proposal for the api surface area so I can start thinking about it. Honestly with all the different use cases, limitations, and exceptions to the rule we highlighted, right now I fail to see a better version of this that does not require manual notifications and that does not cause cascading updates or stuff like that, but I'm all ears if something pops up!

Thanks again: it is always intriguing to see different contexts and scenarios, and it has been a very interesting and fruitful conversation.

jodydonetti commented 2 years ago

Hi there, I'm happy to say that I've finally been able to complete the design and implementation of the backplane feature.

And yes, you can disable automatic notifications per-operation for maximum control, on top of being able to manually send them whenever you want: this means you can have memory-only caches + backplane, without a distributed cache 😉🎉

Please take a look here, try it out and let me know what you think so I can move forward with the final version.

Thanks everybody 🙏

📦 It's a pre-release!

Note that the Nuget package is marked as pre-release so please be sure to enable the related filter otherwise you would not see them:

jodydonetti commented 2 years ago

Meanwhile I published an (hopefully) even better alpha2 release.

jodydonetti commented 2 years ago

Hi all, just wanted to update you on the next version: I released right now the BETA2 for the next big release, which includes among other small things a big fix for the DI setup part.

This will probably be the last release before the official one.

fmendez89 commented 2 years ago

Thanks @jodydonetti for your hard work and commitment 😊 we are not on the point to implement a cache yet, but will definitely try fusion cache out.

jodydonetti commented 2 years ago

Thank you @fmendez89 I appreciate your kind words. I'm glad you are interested in trying FusionCache and I hope you'll like it, let me know if you need anything!

jodydonetti commented 2 years ago

Hi all, yesterday I released the BETA 3.

Unless some big problem comes up, this will be the very last release before the official one, which will be in the next few days 🎉

jodydonetti commented 2 years ago

And here we go: v0.9 is finally out 🥳🎉

The final backplane design includes support for being able to work with or without a distributed cache (see below here).

Thanks everybody for the involvement, and if you try it out let me know!

ZiggyCreatures / FusionCache

🙋‍♂️ Memory + Backplane, but no distributed cache? #36

The Idea

Short Answer

Longer Answer

But Why?

Example

A different approach

Example

A different approach (reprise)

So What?

On my verbosity 🙄

📦 It's a pre-release!