ZiggyCreatures / FusionCache

FusionCache is an easy to use, fast and robust hybrid cache with advanced resiliency features.
MIT License
1.71k stars 90 forks source link

[Question] MemoryCache with cloned objects on Get, atomic update of cache #262

Closed kzkzg closed 2 months ago

kzkzg commented 3 months ago

Hi,

  1. I'm using default IMemoryCache, which keeps reference to object, so when i get value from cache and change something with it, this change apply to whole cache. Is it possible that memorycache return deepcloned objects on Get? I guess solution could be to pass custom implementation of IMemoryCache which deepclone objects on read. Do you know some other implementations of IMemoryCachen which can be used to achive that? Or mayby there is other way to do that?

  2. I have simple code, get from cache, update value, and Set to cache again, everything is in Redis Redlock block

    
    using var redLock = await LockHelper.AquireLockAsync(key)...
    var result = await _fusionCache.TryGetAsync....
    result = ProcessResult(result)...
    await _fusionCache.SetAsync(key, result...
is it better way to do that? Mayby some specialized method which could be used like this:

_fusionCache.UpateAsync(key, ProccessResult)

DanielStout5 commented 3 months ago

I'm interested in that first item as well (having the MemoryCache items be cloned, but not the DistributedCache ones)

I see the same request here: https://github.com/ZiggyCreatures/FusionCache/issues/194

That used the same idea you suggested (custom IMemoryCache with cloning) but I actually don't think that's a good idea with the implementation shown, because what FusionCache stores in the IMemoryCache is not just the object you return but a wrapping FusionCacheMemoryEntry, and the Metadata field (and possibly others) on that wrapper are modified by FusionCache without re-setting them in the MemoryCache, e.g. here: entry.Metadata.LogicalExpiration = DateTimeOffset.UtcNow.AddMilliseconds(-10);

If our custom MemoryCache clones all returned values, then that entry was cloned upon retrieving from the cache, and the metadata changed by FusionCache won't actually be persisted, and it won't expire at the expected time.

I think we could get around this by having a custom handling for IFusionCacheMemoryEntry where only the Value property would actually get cloned, but I still don't love that solution since the value would be cloned even if it's not actually getting returned to the caller (i.e. FusionCache wants to change the Metadata). Also, that interface and its implementation are both internal so we would have to use reflection to do that customized cloning.

It seems like ideally this would be something built into FusionCache. I think it would make sense to re-use the existing IFusionCacheSerializer interface to perform cloning, so it behaves just like the DistributedCache.

E.g. a new method on FusionCache:

/// <summary>Clone all values retrieved from the <see cref="IMemoryCache"/> via the provided serializer.</summary>
public IFusionCache EnableMemoryCacheEntryCloning(IFusionCacheSerializer serializer)

And if that serializer is set, whenever a value is going to be returned from the MemoryCache by FusionCache (calls to entry.GetValue<T>) it would go through the serializer to get a clone:

return serializer.Deserialize<T>(serializer.Serialize(entry.GetValue<T>))

jodydonetti commented 3 months ago

Hi @kzkzg and thanks for using FusionCache.

  1. I'm using default IMemoryCache, which keeps reference to object, so when i get value from cache and change something with it, this change apply to whole cache.

Something important to keep in mind is that when you get something from a cache you cannot modify it:

Something I always suggest and that served me well for decades is to be explicit when caching is involved. If you are only reading data then it can be transparent and "magical", think the "Q" in CQRS for example. If you are reading data to then modify it and save it, then never use caching. If you have some sort of unified datalayer (think repository pattern etc) with the usual Get methods then make it explicit if the method allows cached data or not, so that the caller can specify that based on what it needs to do. It can even be just something like a bool allowCachedData param, even optional with a default value of true so 90% of the queries will take advantage of caching, while allowing the remaining 10% to avoid sync or shared data issues.

Is it possible that memorycache return deepcloned objects on Get? I guess solution could be to pass custom implementation of IMemoryCache which deepclone objects on read.

This is exactly the solution another FusionCache user used, see here.

Do you know some other implementations of IMemoryCachen which can be used to achive that? Or mayby there is other way to do that?

Somebody else asked me for the auto-clone feature, but I never actually planne for it, for a couple of reasons.

First, knowing it will create a clone for every single get operation gives me shivers 😅. Jokes aside, since it would be an opt-in, I may accept that.

Second, how would the clone be done? ICloneable as we all know is basically deprecated and not recommended (see here). Another idea may be to just use the already existing serializer used for the distributed cache: what do you think about it? I mean it basically already does that, even though it's not a back-to-back serialize/deserialize for every single Get call, which... see point 1.

Any opinions? I'm open to ideas.

  1. I have simple code, get from cache, update value, and Set to cache again, everything is in Redis Redlock block
using var redLock = await LockHelper.AquireLockAsync(key)...
var result = await _fusionCache.TryGetAsync....
result = ProcessResult(result)...
await _fusionCache.SetAsync(key, result...

is it better way to do that? Mayby some specialized method which could be used like this:

_fusionCache.UpateAsync(key, ProccessResult)

No, not currently, but I'm open to ideas.

One thing to notice is that I'm already playing with a native redlock-like feature, to allow for cache stampede protection in a multi-node environment. The thing gets pretty complicated pretty fast when you throw in expected features (at least for FusionCache users) like Soft/Hard Timeouts, Auto-Recovery and so on because I need to be able to clearly differentiate between waiting for a distributed lock because another node is executing something, and waiting for a distributed lock because Redis is down or super slow right now...

I have an experimental branch with support for an IDistributedLocker (on top of the already existing IMemoryLocker) and let me tell you it ain't easy 😅

Hope this helps, let me know!

jodydonetti commented 3 months ago

Hi @DanielStout5 read my answer to @kzkzg and let me know what you think, I'm interested!

I see the same request here: #194

That used the same idea you suggested (custom IMemoryCache with cloning) but I actually don't think that's a good idea with the implementation shown, because what FusionCache stores in the IMemoryCache is not just the object you return but a wrapping FusionCacheMemoryEntry, and the Metadata field (and possibly others) on that wrapper are modified by FusionCache without re-setting them in the MemoryCache, e.g. here: entry.Metadata.LogicalExpiration = DateTimeOffset.UtcNow.AddMilliseconds(-10);

If our custom MemoryCache clones all returned values, then that entry was cloned upon retrieving from the cache, and the metadata changed by FusionCache won't actually be persisted, and it won't expire at the expected time.

Ah, dang it, you may be right 🤔

I think we could get around this by having a custom handling for IFusionCacheMemoryEntry where only the Value property would actually get cloned, but I still don't love that solution since the value would be cloned even if it's not actually getting returned to the caller (i.e. FusionCache wants to change the Metadata). Also, that interface and its implementation are both internal so we would have to use reflection to do that customized cloning.

It seems like ideally this would be something built into FusionCache. I think it would make sense to re-use the existing IFusionCacheSerializer interface to perform cloning, so it behaves just like the DistributedCache.

Let me think more about it, maybe it really is time I add the auto-cloning feature after all...

E.g. a new method on FusionCache:

/// <summary>Clone all values retrieved from the <see cref="IMemoryCache"/> via the provided serializer.</summary>
public IFusionCache EnableMemoryCacheEntryCloning(IFusionCacheSerializer serializer)

And if that serializer is set, whenever a value is going to be returned from the MemoryCache by FusionCache (calls to entry.GetValue<T>) it would go through the serializer to get a clone:

return serializer.Deserialize<T>(serializer.Serialize(entry.GetValue<T>))

Yes, this should in fact work.

As I said to @kzkzg the constant serialize+deserialize really gives me shivers 😅 but it would be a feature disabled by default, and if you opt-in, you have taken a careful decision so... yeah, it can make sense afterall. I may even be able to optimize it a little bit, with some internal tricks... I'm thinking about it...

Also it can be controllable via an entry option, so you can have a very granular control for every call, but also set it once in the DefaultEntryOptions so you only have to set it once.

Let me know what you think while I marinate this idea a little bit...

Meanwhile, thanks for chipping in!

DanielStout5 commented 3 months ago

Sounds reasonable to me! Making it configurable per call could be useful in cases where some entities are immutable and others are mutable and so must be cloned.

jodydonetti commented 3 months ago

Hi @kzkzg , what do you think?

kzkzg commented 2 months ago

Hi @jodydonetti , sorry for late answer. Configurable cloning feature per call would be great, really waiting for that.

For now I'm simply serialize and derialize anyway using json, but in high traffic sometimes serialization doesn't finish in time, and the value of some collection in the cache changes, and I get an exception. Sometimes retrying after a short delay helps, but sometimes it doesn't.

As Daniel mentioned, mayby use existing serializers, and allow passing in a custom implementation if needed.

jodydonetti commented 2 months ago

Hi all, Auto-Cloning is coming in v1.3.0 which will be out... I think later today 😬

Will update later.

jodydonetti commented 2 months ago

Hi all, v1.3.0 is out 🥳