ZiggyCreatures / FusionCache

FusionCache is an easy to use, fast and robust hybrid cache with advanced resiliency features.
MIT License

Question: Factory soft timeouts and fail-safe values #268

Open R0boC0p opened 2 weeks ago

R0boC0p commented 2 weeks ago

Hi, I have read the docs a few times by now, but there are still some things that aren't clear to me:

opt.Duration = TimeSpan.FromMinutes(5);
opt.FactorySoftTimeout = TimeSpan.FromMilliseconds(10);
opt.IsFailSafeEnabled = true;
opt.FailSafeMaxDuration = TimeSpan.FromMinutes(15);

opt.DistributedCacheDuration = TimeSpan.FromMinutes(15);
opt.DistributedCacheSoftTimeout = TimeSpan.FromMilliseconds(60);
opt.DistributedCacheFailSafeMaxDuration = TimeSpan.FromHours(4);

Scenario:

There has been a factory call. Memory and Distributed cache are set. The memory cache expires and there is a call to GetOrSetAsync(), the factory exceeds the 10ms. Value in distr. cache is avail and not expired.

  1. Does it return the fail-safe value from the memory cache, or does it bubble up to L2 (the distributed cache)?
  2. If it bubbles up to L2 and, again, exceeds the distributed soft timeout of 60 ms, does it return the fail-safe value?
  3. Is the distributed cache there solely to ease node start-ups and take load away from the DB?
  4. If fail-safe and L2 are both enabled, where does the fallback for L2 come from? From the memory cache layer?
  5. Is the fallback value of the memory layer also the fallback of the distributed layer? I am asking because I want to get the memory footprint straight (double memory usage per item, as a fallback value for both the memory and the distributed cache).
  6. Assuming both soft timeouts are hit and it returns a fail-safe value: why would I ever set a soft timeout longer than 1 ms? That would give me the shortest waiting time while still ensuring an update. Can it be set to TimeSpan.Zero?

What is my goal? I want the shortest possible waiting time for the factories. I do not care if I get an outdated value.

Many thanks R0b

jodydonetti commented 2 weeks ago

Hi @R0boC0p ,

> Scenario:
>
> There has been a factory call. Memory and Distributed cache are set. The memory cache expires and there is a call to GetOrSetAsync(), the factory exceeds the 10ms. Value in distr. cache is avail and not expired.

The flow is different: if the memory cache is expired but the distributed cache is not, the factory would not be called.

To better understand, this is the (simplified) flow with the steps in order:
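
As a rough illustration of the lookup order discussed in this thread (memory cache first, then the distributed cache, then the factory), here is a minimal sketch. It is a simplified, hypothetical model and not FusionCache's actual code: `Entry` and `LookupSketch.Get` are made-up names, copying the distributed entry into memory is an assumption, and fail-safe, timeouts and locking are deliberately left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache entry: a value plus its logical expiration.
public record Entry(string Value, DateTimeOffset ExpiresAt)
{
    public bool IsFresh(DateTimeOffset now) => now < ExpiresAt;
}

public static class LookupSketch
{
    // Simplified model of the lookup order only (no fail-safe, no timeouts, no locks).
    public static string Get(
        string key,
        Dictionary<string, Entry> memory,
        Dictionary<string, Entry> distributed,
        Func<string> factory,
        DateTimeOffset now,
        TimeSpan duration)
    {
        // 1. Memory cache: fastest, checked first.
        if (memory.TryGetValue(key, out var fromMemory) && fromMemory.IsFresh(now))
            return fromMemory.Value;

        // 2. Distributed cache: slower than memory, faster than the database.
        if (distributed.TryGetValue(key, out var fromDistributed) && fromDistributed.IsFresh(now))
        {
            memory[key] = fromDistributed; // assumption: memory is repopulated from L2
            return fromDistributed.Value;  // the factory is NOT called in this case
        }

        // 3. Factory: only reached when neither layer has a fresh value.
        var entry = new Entry(factory(), now + duration);
        memory[key] = entry;
        distributed[key] = entry;
        return entry.Value;
    }
}
```

With an expired memory entry and a fresh distributed entry, `Get` returns the distributed value and never invokes `factory`, which matches the scenario described above.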

> What is my goal? I want the shortest possible waiting time for the factories. I do not care if I get an outdated value.

If you want to wait as little as possible, I would suggest enabling Eager Refresh: this way, a non-blocking background factory runs even before a value expires (watch out for setting the value too low).
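
A minimal sketch of what enabling eager refresh on a single call could look like, using the same entry options that appear elsewhere in this thread. The cache key, the `0.8f` threshold, the durations and the `LoadStockFromDatabaseAsync` helper are illustrative assumptions, not recommendations.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using ZiggyCreatures.Caching.Fusion;

public static class EagerRefreshSketch
{
    public static async Task<int> GetStockAsync(IFusionCache cache, int productId)
    {
        var options = new FusionCacheEntryOptions
        {
            Duration = TimeSpan.FromMinutes(5),
            // Assumed value: start a background refresh once 80% of Duration has passed.
            EagerRefreshThreshold = 0.8f,
            // Keep a stale value around as a fallback.
            IsFailSafeEnabled = true,
            FailSafeMaxDuration = TimeSpan.FromMinutes(15),
            // Only matters for a normal (blocking) refresh cycle.
            FactorySoftTimeout = TimeSpan.FromMilliseconds(10),
        };

        return await cache.GetOrSetAsync<int>(
            $"stock:{productId}",
            async (ctx, ct) => await LoadStockFromDatabaseAsync(productId, ct),
            options: options);
    }

    // Hypothetical stand-in for the real database call.
    private static async Task<int> LoadStockFromDatabaseAsync(int productId, CancellationToken ct)
    {
        await Task.Delay(50, ct);
        return 42;
    }
}
```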

Please let me know if it worked.

Hope this helps!

R0boC0p commented 2 weeks ago

Hi, thank you for your response. I did read the manual thoroughly, and I am aware of the distributed background refresh and the eager refresh. My questions were more about fail-safe.

These are the actual options I am using:

  .WithDefaultEntryOptions(opt =>
  {
      opt.Duration = TimeSpan.FromMinutes(2);
      opt.JitterMaxDuration = TimeSpan.FromMinutes(1);
      opt.FactorySoftTimeout = TimeSpan.FromMilliseconds(10);
      opt.FactoryHardTimeout = TimeSpan.FromSeconds(10);
      opt.EagerRefreshThreshold = .967f;
      opt.FailSafeMaxDuration = TimeSpan.FromMinutes(15);
      opt.IsFailSafeEnabled = true;
      opt.AllowBackgroundBackplaneOperations = true;
      opt.AllowTimedOutFactoryBackgroundCompletion = true;

      opt.DistributedCacheSoftTimeout = TimeSpan.FromMilliseconds(50);
      opt.DistributedCacheHardTimeout = TimeSpan.FromSeconds(10);
      opt.DistributedCacheFailSafeMaxDuration = TimeSpan.FromHours(1); // <-- this ends up as the TTL in Redis
      opt.AllowBackgroundDistributedCacheOperations = true;

      opt.FailSafeThrottleDuration = TimeSpan.FromSeconds(30);
  })

I am wondering where the fail-safe value is taken from when both the memory and the distributed values are expired. The TTL is taken from DistributedCacheFailSafeMaxDuration, so I am wondering how the fail-safe value is determined. Is it taken from the memory cache, or does the workflow you describe kick in ("check the distributed cache: second, slower than memory, faster than the database, incurs network + (de)serialization")?

How would I ever benefit from the mem-cache fail-safe value if I use a distributed cache?

From what you say, the issue I see is that DistributedCacheFailSafeMaxDuration prevents the factory from being called for 1 hour, so when the memory cache refreshes, it gets an old value plus the overhead of the distributed cache flow?

Many thanks

R0boC0p commented 2 weeks ago

What I see in your code is that if fail-safe is enabled and there is a cache entry available (presumably an expired one is OK), the distributed cache is never checked.

https://github.com/ZiggyCreatures/FusionCache/blob/main/src/ZiggyCreatures.FusionCache/FusionCache_Async.cs#L74-110

if (memoryLockObj is null && options.IsFailSafeEnabled && memoryEntry is not null)

I am struggling to understand where the fail-safe value comes from if it reaches the distr. cache path below.

> If you want to wait as little as possible, I would suggest enabling Eager Refresh: this way, a non-blocking background factory runs even before a value expires (watch out for setting the value too low).

OK, I think I have been a bit slow here. Eager refresh should prevent the fail-safe scenarios altogether. Looking at CompleteBackgroundFactory, are there no soft timeouts that apply when executing the eager refresh?

Thanks again

jodydonetti commented 2 weeks ago

> What I see in your code is that if fail-safe is enabled and there is a cache entry available (presumably an expired one is OK), the distributed cache is never checked.
>
> https://github.com/ZiggyCreatures/FusionCache/blob/main/src/ZiggyCreatures.FusionCache/FusionCache_Async.cs#L74-110

Hi @R0boC0p , not really: what you described is true ONLY if the lock has not been acquired, as the comment says:

// IF THE MEMORY LOCK HAS NOT BEEN ACQUIRED
// + THERE IS A FALLBACK ENTRY
// + FAIL-SAFE IS ENABLED
// --> USE IT (WITHOUT SAVING IT, SINCE THE ALREADY RUNNING FACTORY WILL DO IT ANYWAY)

This is a very special case that can happen ONLY when a LockTimeout has been specified (by default it is not) AND the lock has not been acquired: someone else already holds the lock and, while their factory is still running, the specified lock timeout kicks in for another thread. In this case it is better to just return what has been found in the memory cache, even if expired, but ONLY if fail-safe is enabled.
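
For reference, a small configuration sketch of the combination that makes this special case possible: fail-safe enabled together with an explicit lock timeout. The 100 ms value and the durations are arbitrary, illustrative assumptions.

```csharp
using System;
using ZiggyCreatures.Caching.Fusion;

var options = new FusionCacheEntryOptions
{
    Duration = TimeSpan.FromMinutes(2),

    // Fail-safe must be enabled for an expired memory entry to be returned.
    IsFailSafeEnabled = true,
    FailSafeMaxDuration = TimeSpan.FromMinutes(15),

    // Not specified by default. With it set, a caller that cannot acquire
    // the lock within this time stops waiting for the running factory.
    LockTimeout = TimeSpan.FromMilliseconds(100),
};
```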

Does this make sense?

> OK, I think I have been a bit slow here. Eager refresh should prevent the fail-safe scenarios altogether.

Not completely: if no request comes in AFTER the eager refresh threshold and BEFORE the expiration, the next request will start a normal refresh cycle (so, no eager refresh).
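
To get a feeling for how small that window can be, here is a quick back-of-the-envelope calculation with the values posted above (Duration of 2 minutes, EagerRefreshThreshold of .967). It assumes the threshold is a fraction of Duration and ignores JitterMaxDuration, so treat the numbers as approximate.

```csharp
using System;

var duration = TimeSpan.FromMinutes(2);
const double threshold = 0.967;

// Point in the entry's lifetime after which a request can trigger an eager refresh.
var eagerStartsAt = TimeSpan.FromSeconds(duration.TotalSeconds * threshold);

// Time left between that point and the normal expiration.
var window = duration - eagerStartsAt;

Console.WriteLine($"Eager refresh possible after about {eagerStartsAt.TotalSeconds:F2} s");
Console.WriteLine($"Window before expiration: about {window.TotalSeconds:F2} s");
```

With these settings the window is roughly 4 seconds out of 120: a request has to arrive in that short interval for the eager refresh to start, otherwise the next request goes through a normal refresh cycle.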

> Looking at CompleteBackgroundFactory, are there no soft timeouts that apply when executing the eager refresh?

Exactly, and that is because the factory is already being executed in the background and is not blocking anything.

> Thanks again

Thanks to you for chipping in!