[BUG] Chaining call of RemoveAsync doesn't work

apavelm commented 5 months ago

To simplify the explanation, I'll describe the setup in abstract terms.

There are two models: Item and List. An Item is an object with an Id, something like:

{
  "id": 1,
  "name": "some value"
}

List is a collection of item IDs, like:

[1,2,3,4]

We cache both: the List as a whole and each Item by Id.

There are methods (simplifying):

GetList
GetById
CreateNew
UpdateById
DeleteById

GetList retrieves the data from the DB, caches it using GetOrSetAsync with "ListCacheKey", and returns the List.
GetById retrieves a single Item by id using GetOrSetAsync with "Item-{id}" as the cache key.
CreateNew creates a record in the DB and removes "ListCacheKey" from the cache.
UpdateById updates a record in the DB and removes "Item-{id}" from the cache.
DeleteById removes a record from the DB and removes both "Item-{id}" and "ListCacheKey" from the cache. NOTE: the issue is here.
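For readers following along, the five methods described above could be sketched roughly as follows. This is only an illustration of the description, not the actual code from the project: the `Item` record, the `IItemRepository` interface, and the `ItemService` class are hypothetical names introduced here, and the sketch assumes the FusionCache `GetOrSetAsync` overload that takes a key and a factory lambda.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using ZiggyCreatures.Caching.Fusion;

// Hypothetical model and data-access abstraction (not from the original post).
public record Item(int Id, string Name);

public interface IItemRepository
{
    Task<List<int>> GetListAsync();
    Task<Item> GetByIdAsync(int id);
    Task<int> CreateAsync(string name);
    Task UpdateAsync(Item item);
    Task DeleteAsync(int id);
}

// Hypothetical service mirroring the five methods described above.
public class ItemService
{
    private const string ListCacheKey = "ListCacheKey";

    private readonly IFusionCache _cache;
    private readonly IItemRepository _repo;

    public ItemService(IFusionCache cache, IItemRepository repo)
    {
        _cache = cache;
        _repo = repo;
    }

    // Caches the whole list of IDs under a single key.
    public async Task<List<int>> GetList() =>
        await _cache.GetOrSetAsync<List<int>>(ListCacheKey, _ => _repo.GetListAsync()).ConfigureAwait(false);

    // Caches each item under its own "Item-{id}" key.
    public async Task<Item> GetById(int id) =>
        await _cache.GetOrSetAsync<Item>($"Item-{id}", _ => _repo.GetByIdAsync(id)).ConfigureAwait(false);

    // A new record changes the list, so only the list entry is removed.
    public async Task<int> CreateNew(string name)
    {
        var id = await _repo.CreateAsync(name).ConfigureAwait(false);
        await _cache.RemoveAsync(ListCacheKey).ConfigureAwait(false);
        return id;
    }

    // An update changes one item, so only that item's entry is removed.
    public async Task UpdateById(Item item)
    {
        await _repo.UpdateAsync(item).ConfigureAwait(false);
        await _cache.RemoveAsync($"Item-{item.Id}").ConfigureAwait(false);
    }

    // A delete affects both the list and the item, so both entries are removed.
    public async Task DeleteById(int id)
    {
        await _repo.DeleteAsync(id).ConfigureAwait(false);
        await _cache.RemoveAsync(ListCacheKey).ConfigureAwait(false);
        await _cache.RemoveAsync($"Item-{id}").ConfigureAwait(false);
    }
}
```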

Also, Redis is used as a backplane, and all sync operations are async (I suppose this is important).

Issue description: The chain of calls to reproduce is GetList -> CreateNew -> GetList -> DeleteById -> GetList. The last call to GetList returns all items, including the deleted one.

Inside DeleteById, after removing the record from the DB, we call:

await _cache.RemoveAsync("ListCacheKey").ConfigureAwait(false);
await _cache.RemoveAsync($"Item-{id}").ConfigureAwait(false);

I used the debugger to check, and I was able to confirm the issue. The keys match, so it doesn't look like a stupid typo or something like that. I suspect the async operations, but I would prefer not to re-enable it, if possible.

@jodydonetti Please advise how to fix this. Thank you in advance.

henriqueholtz commented 5 months ago

@apavelm Why was it closed without any update? Could you explain, please?

apavelm commented 5 months ago

@henriqueholtz I'm not sure what it was, perhaps something abnormal. I'm still trying to understand, but at some point it suddenly started working as expected.

jodydonetti commented 5 months ago

Hi @apavelm and thanks for using FusionCache!

This is really strange, and even more so that it started working after some time. I would like to investigate more...

A couple of questions:

you talked about a Redis backplane, can I suppose you are also using a Redis distributed cache?
are you in a multi-node environment? (this is important)

What I think may have happened (but without more info I'm just spitballing here) is that there have been some transient problems related to your Redis instance, maybe a temporary slowdown or something, which in turn may have created a temporary out-of-sync issue.

For example:

various previous operations...
Delete executed on node N1, with these internal details:
- delete in memory cache on N1 🟢 OK
- delete in distributed cache 🟢 OK
- backplane notification 🟢 OK
GetAll executed on node N2, before the notification from N1 has arrived

In this case, even if everything went fine, check the options AllowBackgroundDistributedCacheOperations and AllowBackgroundBackplaneOperations. If one or both of them are true, what is described above can happen, meaning a backplane notification can take a little more time to reach N2 and a GetAll on N2 may have already been executed.
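As a minimal sketch of how those two options could be checked or pinned, assuming the cache is built by hand via `FusionCacheOptions.DefaultEntryOptions` (the distributed cache and backplane setup is omitted here, and the variable name `cache` is just illustrative):

```csharp
using ZiggyCreatures.Caching.Fusion;

var cache = new FusionCache(new FusionCacheOptions
{
    DefaultEntryOptions = new FusionCacheEntryOptions
    {
        // Do not run the distributed cache operation in the background:
        // the call completes only after that operation has finished.
        AllowBackgroundDistributedCacheOperations = false,

        // Do not send the backplane notification in the background:
        // the call completes only after the notification has been sent.
        AllowBackgroundBackplaneOperations = false,
    }
});
```

Note that, as far as the scenario above goes, this should only narrow the window: sending the notification from N1 is awaited, but N2 still has to receive and process it, so a GetAll on N2 issued right after the delete may still hit a stale entry.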

Another example:

various previous operations...
Delete executed on node N1, with these internal details:
- delete in memory cache on N1 🟢 OK
- delete in distributed cache 🟢 OK
- backplane notification 🔴 FAIL (because of a transient Redis error, a network error, etc)
GetAll executed on node N2, but the notification from N1 had an error

Something to keep in mind in this case is that luckily Auto-Recovery automatically handles this (and more), and will retry later to make everything in-sync again. Having said that, it takes a little time to do that, so an immediate GetAll on N2 will not see the update immediately.

If instead you are working with one node, or are otherwise sure all the calls ended up on the very same node, then... well, this is something I never ever observed 🤔

Hope this helps, let me know!

ZiggyCreatures / FusionCache

[BUG] Chaining call of RemoveAsync doesn't work #261