ZiggyCreatures / FusionCache

FusionCache is an easy to use, fast and robust hybrid cache with advanced resiliency features.
MIT License

Syncing different cloud instances #210

Closed apavelm closed 4 months ago

apavelm commented 6 months ago

This is not a bug report. It is more about a complex scenario, syncing caches, that is not covered in the documentation.

Let's say that in an ASP.NET Core API I have the following configuration:


var redisConnString = configuration.GetConnectionString("RedisConnString");
// ADD SERVICES: REDIS
if (!string.IsNullOrWhiteSpace(redisConnString))
{
    // ADD SERVICES: REDIS DISTRIBUTED CACHE
    services.AddStackExchangeRedisCache(options =>
    {
        options.Configuration = redisConnString;
    });

    // ADD SERVICES: JSON SERIALIZER
    services.AddFusionCacheSystemTextJsonSerializer(SystemTextJsonExtension.DefaultOptions);

    // ADD SERVICES: REDIS BACKPLANE
    services.AddFusionCacheStackExchangeRedisBackplane(options =>
    {
        options.Configuration = redisConnString;
    });
}
else
{
    // just a backup that shouldn't be the case
    services.AddDistributedMemoryCache(o => o.ExpirationScanFrequency = TimeSpan.FromSeconds(30));
}

services.AddFusionCache();

foreach (...list of cache instances....)
{
...
var cacheInstanceName = "..some name from the loop..."; // the list includes "mainCache"
var entryOptions = new {}; // cache options for the corresponding cacheInstanceName
...
    services.AddFusionCache(cacheInstanceName)
        .WithDefaultEntryOptions();
}

The API can run as several instances in a cluster (Azure App Service) to balance the load. All cache instances must be shared between all API instances. I believe the configuration above is suitable for that. But the question is: how fast is the syncing?

I also have an Azure Functions App. It must share exactly one cache instance with the APIs; let's name it "mainCache". Sync speed must be close to instant (within technical limitations, of course).

Could you please point me to the relevant section of the documentation, or help me configure FusionCache on the API and Function App sides to reach this goal?

Please advise.

Thanks.

@jodydonetti It would be very appreciated if you could help.

apavelm commented 6 months ago

Just to highlight the main-line questions:

jodydonetti commented 6 months ago

Hi @apavelm and thanks for using FusionCache!

Sorry for the delay but I was at the MVP Global Summit and was not able to look into this. I'll look into this asap and will let you know.

Thanks!

jodydonetti commented 5 months ago

Wow, in the end I forgot to answer this, sorry about it 🥲

Now regarding your request: first, I think there's a small typo here:

foreach (...list of cache instances....)
{
...
var cacheInstanceName = "..some name from the loop..."; // the list includes "mainCache"
var entryOptions = new {}; // cache options for the corresponding cacheInstanceName
...
    services.AddFusionCache(cacheInstanceName)
        .WithDefaultEntryOptions();
}

where it should be:

foreach (...list of cache instances....)
{
...
var cacheInstanceName = "..some name from the loop..."; // the list includes "mainCache"
var entryOptions = new {}; // cache options for the corresponding cacheInstanceName
...
    services.AddFusionCache(cacheInstanceName)
        .WithDefaultEntryOptions(entryOptions); // PASS THE ENTRY OPTIONS CREATED ABOVE
}

Second: is there a reason for also registering the MemoryDistributedCache? FusionCache can be used even as just an L1 cache (meaning only the first level, in memory) without having to change the call sites at all. This means that you can work with it transparently, either with an L2 (second level, distributed cache) or without it.

I'll continue with the code below with this case in mind, but you can easily change it to get back to your version.

Third: if the different cache instances all point to the same Redis instance but they store different pieces of data with maybe the same cache key, you may get cache key collisions (the cache key "foo" used in cacheA and cacheB will have different memory levels, but the same distributed level). To avoid this you can simply enable cache key prefix by calling WithCacheKeyPrefix() on the builder: this will use the cache name as a prefix (plus some separator), or you can even specify a custom prefix by using WithCacheKeyPrefix("MyPrefix").
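As a minimal sketch of the prefix option described above: the cache names "cacheA" and "cacheB" are made up for illustration, and no distributed cache is wired in here, so this only shows where the calls go on the builder.

```csharp
using Microsoft.Extensions.DependencyInjection;
using ZiggyCreatures.Caching.Fusion;

var services = new ServiceCollection();

// "cacheA" and "cacheB" are hypothetical names used only for this sketch.

// Default prefix: the cache name (plus a separator) is prepended to every key.
services.AddFusionCache("cacheA")
    .WithCacheKeyPrefix();

// Custom prefix: the given string is used instead of the cache name.
services.AddFusionCache("cacheB")
    .WithCacheKeyPrefix("MyPrefix");

using var serviceProvider = services.BuildServiceProvider();
var cacheProvider = serviceProvider.GetRequiredService<IFusionCacheProvider>();

// With prefixes enabled, the key "foo" used on each cache should map to
// a different key in a shared distributed level.
var cacheA = cacheProvider.GetCache("cacheA");
var cacheB = cacheProvider.GetCache("cacheB");
```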

Fourth: by just registering the distributed cache/serializer/backplane at the top you are not actually using them. What you can do is either register them once like you did, but then tell each FusionCache instance to use the registered one, like this:

services.AddFusionCache()
  .TryWithRegisteredDistributedCache();

foreach (...list of cache instances....)
{
...
var cacheInstanceName = "..some name from the loop..."; // the list includes "mainCache"
var entryOptions = new {}; // cache options for the corresponding cacheInstanceName
...
services.AddFusionCache(cacheInstanceName)
  .WithDefaultEntryOptions(entryOptions)
  .WithRegisteredDistributedCache()
  .WithCacheKeyPrefix();
}

or you can simply configure them directly for each FusionCache instance like this:

var redisConnString = configuration.GetConnectionString("RedisConnString");

var b = services.AddFusionCache();
if (!string.IsNullOrWhiteSpace(redisConnString)) {
  b.WithDistributedCache(
    new RedisCache(Options.Create(new RedisCacheOptions() { Configuration = redisConnString })),
    new FusionCacheSystemTextJsonSerializer()
  )
  .WithCacheKeyPrefix()
  .WithBackplane(
    new RedisBackplane(new RedisBackplaneOptions() { Configuration = redisConnString })
  );
}

foreach (...list of cache instances....)
{
  ...
  var cacheInstanceName = "..some name from the loop..."; // the list includes "mainCache"
  var entryOptions = new {}; // cache options for the corresponding cacheInstanceName
  ...
  b = services.AddFusionCache(cacheInstanceName)
    .WithDefaultEntryOptions(entryOptions);

  if (!string.IsNullOrWhiteSpace(redisConnString)) {
    b.WithDistributedCache(
      new RedisCache(Options.Create(new RedisCacheOptions() { Configuration = redisConnString })),
      new FusionCacheSystemTextJsonSerializer()
    )
    .WithCacheKeyPrefix()
    .WithBackplane(
      new RedisBackplane(new RedisBackplaneOptions() { Configuration = redisConnString })
    );
  }
}

Then, if you declare an ext method like this:

public static IFusionCacheBuilder MaybeWithRedis(this IFusionCacheBuilder builder, string? connString)
{
  // No Redis configured: leave the builder untouched (memory-only cache)
  if (string.IsNullOrWhiteSpace(connString))
    return builder;

  return builder.WithDistributedCache(
      new RedisCache(Options.Create(new RedisCacheOptions() { Configuration = connString })),
      new FusionCacheSystemTextJsonSerializer()
    )
    .WithCacheKeyPrefix()
    .WithBackplane(
      new RedisBackplane(new RedisBackplaneOptions() { Configuration = connString })
    );
}

You can have a (imho) more readable setup like this:

var redisConnString = configuration.GetConnectionString("RedisConnString");

services.AddFusionCache()
  .MaybeWithRedis(redisConnString);

foreach (...list of cache instances....)
{
  ...
  var cacheInstanceName = "..some name from the loop..."; // the list includes "mainCache"
  var entryOptions = new {}; // cache options for the corresponding cacheInstanceName
  ...
  services.AddFusionCache(cacheInstanceName)
    .WithDefaultEntryOptions(entryOptions)
    .MaybeWithRedis(redisConnString);
}

Regarding how to sync only one of the named caches from a specific app/service: in that single app, just configure the one you need and not the others (but I'm not sure if I understood what you are asking).
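A minimal sketch of that idea for the Functions App side, reusing the constructor calls from the snippets above. The helper name AddMainCacheOnly is hypothetical, and it is an assumption here that the API registers "mainCache" with the same cache name, key prefix and Redis connection, so that both sides address the same distributed entries and the same backplane.

```csharp
using Microsoft.Extensions.Caching.StackExchangeRedis;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;
using ZiggyCreatures.Caching.Fusion;
using ZiggyCreatures.Caching.Fusion.Backplane.StackExchangeRedis;
using ZiggyCreatures.Caching.Fusion.Serialization.SystemTextJson;

public static class FunctionAppCachingExtensions
{
    // Hypothetical helper: registers ONLY the "mainCache" named cache,
    // leaving out every other named cache the API uses.
    public static IServiceCollection AddMainCacheOnly(
        this IServiceCollection services,
        string redisConnString)
    {
        services.AddFusionCache("mainCache")
            // Assumption: the API enables the same prefix for "mainCache"
            .WithCacheKeyPrefix()
            .WithDistributedCache(
                new RedisCache(Options.Create(new RedisCacheOptions { Configuration = redisConnString })),
                new FusionCacheSystemTextJsonSerializer()
            )
            .WithBackplane(
                new RedisBackplane(new RedisBackplaneOptions { Configuration = redisConnString })
            );

        return services;
    }
}
```

In the Functions App startup this would be called once, for example as `services.AddMainCacheOnly(redisConnString)`, and the cache resolved through `IFusionCacheProvider.GetCache("mainCache")`.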

Regarding the performance: FusionCache uses the underlying technology you picked, which in this case means Redis for the distributed cache, System.Text.Json as the serializer, and Redis again as the backplane. The perf hit should be microscopic, meaning 90%+ of the CPU time is spent on those pieces and not on the extra work done by FusionCache itself, although you'll get fail-safe, soft/hard timeouts, etc. on top of the basic features. For semi-realtime sync, the Redis backplane uses the Pub/Sub mechanism of Redis itself, so the speed depends on that which, at least in my experience, is pretty darn fast (we are talking about a couple of milliseconds).

Of course though, if you deploy an app service in northeurope, the database in eastus and the Redis cache in japaneast you may have some slowness, it goes without saying 😅

Hope this helps, please let me know if I missed something.

apavelm commented 5 months ago

Hi @jodydonetti, thank you very much for the information. I believe at least part of this post should go into the documentation.

Since March I have changed the code a little, so now it looks like this:

public static void AddCaching(this IServiceCollection services, IConfiguration configuration)
{
    var redisConnString = configuration.GetConnectionString("RedisConnString");

    services.AddStackExchangeRedisCache(options =>
    {
        options.Configuration = redisConnString;
    });

    services.AddFusionCacheSystemTextJsonSerializer(SystemTextJsonExtension.GetCopyOfDefaultJsonSerializerOptions());

    services.AddFusionCacheStackExchangeRedisBackplane(options =>
    {
        options.Configuration = redisConnString;
    });

    services.AddCachingConfiguration(configuration);
}

public static void AddCachingConfiguration(this IServiceCollection services, IConfiguration configuration)
{
    var cachingSection = configuration.GetSection("Caching");
    var cachingSettings = cachingSection.Get<CachingSettings>();

    var init = new CacheSectionSettings()
    {
        CacheDuration = TimeSpan.FromMinutes(10),
        FailSafeMaxDuration = TimeSpan.FromMinutes(60),
        Jittering = TimeSpan.FromSeconds(10),
        FactorySoftTimeout = TimeSpan.FromMilliseconds(100),
        FactoryHardTimeout = TimeSpan.FromMilliseconds(1500)
    };

    var defaultSettings = cachingSettings.Values.GetValueOrDefault("Default", init);

    var list = FastEnum.GetValues<CachePartition>();
    foreach (CachePartition cachePartition in list)
    {
        var key = cachePartition.GetEnumMemberValue() ?? "Default";
        var settingsOptions = cachingSettings.Values.GetValueOrDefault(key, defaultSettings);

        var entryOptions = PrepareOptions(settingsOptions);

        services.RegisterCacheInstance(cachePartition, entryOptions);
    }
}

public static void RegisterCacheInstance(this IServiceCollection services, CachePartition cachePartition,
    FusionCacheEntryOptions options)
{
    services.AddFusionCache(); // this line is required even if it is not used

    var key = cachePartition.GetEnumMemberValue() ?? "Default";
    services.AddFusionCache(key).WithDefaultEntryOptions(options);
}

private static FusionCacheEntryOptions PrepareOptions(CacheSectionSettings settings)
{
    var result = new FusionCacheEntryOptions();

    if (settings.CacheDuration.HasValue)
    {
        result.SetDuration(settings.CacheDuration.Value);
    }

    if (settings.EagerRefresh.HasValue)
    {
        result.SetEagerRefresh(settings.EagerRefresh.Value);
    }

    if (settings.Jittering.HasValue)
    {
        result.SetJittering(settings.Jittering.Value);
    }

    if (settings.FailSafeMaxDuration.HasValue || settings.FailSafeThrottleDuration.HasValue)
    {
        result.SetFailSafe(true, settings.FailSafeMaxDuration, settings.FailSafeThrottleDuration);
    }
    else
    {
        result.SetFailSafe(false);
    }

    if (settings.FactorySoftTimeout.HasValue || settings.FactoryHardTimeout.HasValue)
    {
        result.SetFactoryTimeouts(settings.FactorySoftTimeout, settings.FactoryHardTimeout);
    }

    return result;
}

I noticed an interesting thing: Default instance (without custom name = "FusionCache" by default) is required.

Thanks for mentioning WithCacheKeyPrefix; it could be useful for sharing Redis between different environments, if I understood the idea. Please correct me if I'm wrong: we can have several environments, each with its own cluster of several application server instances, and each server instance can have several instances of FusionCache, all sharing one Redis just by using the key prefix.

For our purposes, MemoryCache is OK, but we need to sync the changes between all application instances. As an example, take Application 1 (App1), which has 2 or more instances running behind a load balancer. What we need is to make sure that the FusionCache "partitions" are synced instantly: C1 on Instance1 with C1 on Instance2, and the same for C2. So it looks like this (simplified):

                                    App1
                                  /      \
                          Instance1  Instance2
                        /       \        /      \
                      C1      C2        C1      C2

"is there a reason why also registering the MemoryDistributedCache?" Of course not; it was removed, since Redis became mandatory and no such backup is needed.

I'm sorry, I'm a little confused. In your sample you suggest using the "fluent" methods .WithRegisteredDistributedCache() and .WithBackplane(), while in my code I'm using extension methods on IServiceCollection. Please explain the difference and which way is preferable (I took mine from the sample in the documentation). So I assume that, according to the semantics, the approach I use registers Redis as the backplane and the serializer globally, doesn't it?

Or do these lines just make it possible to use Redis as a backplane and STJ as a serializer, while I additionally need to specify them for each FusionCache "partition"?

services.AddFusionCacheSystemTextJsonSerializer(SystemTextJsonExtension.GetCopyOfDefaultJsonSerializerOptions());

services.AddFusionCacheStackExchangeRedisBackplane(options =>
{
    options.Configuration = redisConnString;
});

This part is not clear to me, I'd appreciate any comment on this.

Also, the last question (I hope) about the syncing process.

We have 2 instances of the application (A1-1, A1-2) with cache that is syncing.

Step 0.
A1-1: [cacheKey1, cacheKey2, cacheKey3]
A1-2: [cacheKey1, cacheKey2, cacheKey3]

Step 1a
a new key added to cache on A1-2:
A1-1: [cacheKey1, cacheKey2, cacheKey3, cacheKey4]
A1-2: [cacheKey1, cacheKey2, cacheKey3]

Step 1b
Cache synced
A1-1: [cacheKey1, cacheKey2, cacheKey3, cacheKey4]
A1-2: [cacheKey1, cacheKey2, cacheKey3, cacheKey4]

Step 2
One instance (A1-2) restarted
A1-1: [cacheKey1, cacheKey2, cacheKey3, cacheKey4]
A1-2: []

**Q1: is it correct always? Or A1-2 cache will be fully synced against A1-1 when using Redis as L2 cache? Or any other option?**

Step 3
cacheKey2 was placed on A1-2
A1-1: [cacheKey1, cacheKey2, cacheKey3, cacheKey4]
A1-2: [cacheKey2]

**Q2: will the cache key be refreshed on A1-1 here?**

thank you

jodydonetti commented 4 months ago

Hi @apavelm

I noticed an interesting thing: Default instance (without custom name = "FusionCache" by default) is required.

No, it shouldn't be. Why do you say this? Do you have a minimal repro?

Just to double-check, and to be sure this does not become a problem in the future, I just added this test:

[Fact]
public void CanUseNamedCachesWithoutDefaultCache()
{
    var services = new ServiceCollection();

    services.AddFusionCache("Foo");
    services.AddFusionCache("Bar");

    using var serviceProvider = services.BuildServiceProvider();

    var cacheProvider = serviceProvider.GetRequiredService<IFusionCacheProvider>();

    var fooCache = cacheProvider.GetCache("Foo");
    var barCache = cacheProvider.GetCache("Bar");

    Assert.NotNull(fooCache);
    Assert.NotNull(barCache);
}

It passed first try.

Thanks for mentioning WithCacheKeyPrefix; it could be useful for sharing Redis between different environments, if I understood the idea.

It can be used to share the same Redis instance (which may be costly) between different named caches, different environments, etc. Basically, by setting a custom prefix you can be sure there will be no cache key collisions.

I'm sorry, I'm a little confused. In your sample you suggest using the "fluent" methods .WithRegisteredDistributedCache() and .WithBackplane(), while in my code I'm using extension methods on IServiceCollection. Please explain the difference

The WithRegisteredXyz() methods will look for the related registered component in the DI container, whereas with the specific WithXyz() methods you will provide the instance or the factory.
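To make the two styles concrete, here is a minimal sketch that puts them side by side. The connection string is a placeholder and the cache names "Foo" and "Bar" are only illustrative; it is also an assumption that the registered serializer has to be opted into explicitly with WithRegisteredSerializer(), so treat the exact combination of calls as something to verify against the documentation.

```csharp
using Microsoft.Extensions.Caching.StackExchangeRedis;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;
using ZiggyCreatures.Caching.Fusion;
using ZiggyCreatures.Caching.Fusion.Backplane.StackExchangeRedis;
using ZiggyCreatures.Caching.Fusion.Serialization.SystemTextJson;

var services = new ServiceCollection();
var redisConnString = "localhost:6379"; // placeholder value for the sketch

// Components registered once in the DI container. On their own these
// registrations do not attach anything to a FusionCache instance.
services.AddStackExchangeRedisCache(o => o.Configuration = redisConnString);
services.AddFusionCacheSystemTextJsonSerializer();
services.AddFusionCacheStackExchangeRedisBackplane(o => o.Configuration = redisConnString);

// Style 1: WithRegisteredXyz() looks the components up in the DI container.
services.AddFusionCache("Foo")
    .WithRegisteredSerializer()
    .WithRegisteredDistributedCache()
    .WithRegisteredBackplane();

// Style 2: WithXyz() receives the instances directly, ignoring what is registered.
services.AddFusionCache("Bar")
    .WithDistributedCache(
        new RedisCache(Options.Create(new RedisCacheOptions { Configuration = redisConnString })),
        new FusionCacheSystemTextJsonSerializer()
    )
    .WithBackplane(
        new RedisBackplane(new RedisBackplaneOptions { Configuration = redisConnString })
    );
```

Style 1 keeps the connection details in one place, while Style 2 lets each named cache point at different components; which one reads better depends on the setup.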

which way is preferable

It depends, there's no "better" in this case, it's really up to you.

Also, the last question (I hope) about the syncing process.


We have 2 instances of the application (A1-1, A1-2) with cache that is syncing.

Step 0.
A1-1: [cacheKey1, cacheKey2, cacheKey3]
A1-2: [cacheKey1, cacheKey2, cacheKey3]

Step 1a
a new key added to cache on A1-2:
A1-1: [cacheKey1, cacheKey2, cacheKey3, cacheKey4]
A1-2: [cacheKey1, cacheKey2, cacheKey3]

Step 1b
Cache synced
A1-1: [cacheKey1, cacheKey2, cacheKey3, cacheKey4]
A1-2: [cacheKey1, cacheKey2, cacheKey3, cacheKey4]

Step 2
One instance (A1-2) restarted
A1-1: [cacheKey1, cacheKey2, cacheKey3, cacheKey4]
A1-2: []

**Q1: is it correct always? Or A1-2 cache will be fully synced against A1-1 when using Redis as L2 cache? Or any other option?**

I think you made a mistake in Step 1a: based on what you wrote I think A1-2 should be the cache with cacheKey4, right?

Apart from this: think of each local memory cache as being on its own. Each will hold some data based on when it started and the requests it has received since then. If you add a distributed cache (L2), at every cache operation FusionCache will:

What I'm saying is don't try to picture it as a whole, because that's not how it works: it will get/set data at every operation as necessary.
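From the calling side, "get/set data at every operation as necessary" could look like the minimal sketch below. It uses a memory-only cache for brevity, and LoadFromDatabase is a hypothetical stand-in for the real data source. The assumption is that a freshly restarted instance starts with an empty memory level and simply fills it key by key as requests arrive, consulting the distributed level first when one is configured, rather than copying the whole content of another node.

```csharp
using System;
using ZiggyCreatures.Caching.Fusion;

// Memory-only cache for the sketch; in the scenario above it would also
// have a distributed cache and a backplane configured.
var cache = new FusionCache(new FusionCacheOptions());

// After a restart the local memory level is empty. This call looks for
// "cacheKey1" in the levels that are available and runs the factory
// only if the value is found nowhere.
var value = cache.GetOrSet<string>(
    "cacheKey1",
    _ => LoadFromDatabase("cacheKey1"),
    TimeSpan.FromMinutes(10)
);

Console.WriteLine(value);

// Hypothetical data access, standing in for the real source of truth.
static string LoadFromDatabase(string key) => $"value-for-{key}";
```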

Step 3
cacheKey2 was placed on A1-2
A1-1: [cacheKey1, cacheKey2, cacheKey3, cacheKey4]
A1-2: [cacheKey2]

Q2: will the cache key be refreshed on A1-1 here?

In general yes.

In more detail, here's what happens when Set() is called on A1-2 (supposing there's a distributed cache and a backplane):

Makes sense?

Hope this helps.