[FEATURE] 📛 Named Caches

jodydonetti commented 1 year ago

Scenario

Say we want to cache different types of data with different settings. We can already do this with FusionCache, since for any cache entry we can specify all the options we want, so we can change them based on the object type we are about to cache.

Easy peasy.

Problem

It would be nice though if we could be able to actually "split" different types of data into different caches or, even more, to have different configuration for the cache in a more advanced way, like for different types of data we would like to use different distributed caches, different serializers, different plugins, different backplane, etc.

A recent request by the community member @aKzenT basically asked for this, on top of other requests in the last year to have a concept similar to cache regions.

Again, this is already possible today by simply creating multiple cache instances, and since app caches are almost by definition singletons, we may simply declare a static class and put some FusionCache instances there, something like this:

public static class MyCaches {
  public static FusionCache FooCache { get; set; }
  public static FusionCache BarCache { get; set; }
}

Then, later on in our code, we can refer to them like this:

public void Mymethod() {
  var data = MyCaches.FooCache.GetOrDefault<int>("42");
}

All of this would work, of course, but if we then throw DI (Dependency Injection) into the mix it becomes harder to do that, at least natively.

So wouldn't it be nice to have native support, directly with FusionCache without extra code, to support multiple named caches with full DI support?

Solution

A similar problem has been already solved by Microsoft regarding HTTP clients, and specifically via the Named Clients approach.

Since that pattern is pretty well known and it seems to work well, FusionCache can take the same path so that FusionCache users can feel at home in the .NET ecosystem, makes sense?

Design proposal

A new overload of the AddFusionCache(...) method will be created, with a string cacheName param as the first one.

public void ConfigureServices(IServiceCollection services)
{
  services.AddFusionCache("Products", opt => { ... });
  services.AddFusionCache("Orders", opt => { ... });
}

This will also use the named options approach, in case that's needed.

Also, a new IFusionCacheProvider interface will be created, which will have a method GetCache(string cacheName) that will return a FusionCache instance by name (paired with the cache name we registered it with).

This interface then may be used in our classes' constructor (instead of directly using an IFusionCache param) and the used to ask for a cache by name, like so:

And later on:

public class HomeController : Controller
{
  IFusionCacheProvider _cacheProvider;

  public HomeController(IFusionCacheProvider cacheProvider)
  {
    _cacheProvider = cacheProvider;
  }

  public IActionResult Product(int id)
  {
    _cacheProvider.GetCache("Products").GetOrDefault<Product>($"product:{id}");
    // ...
  }
}

Any feedback would be appreciated!

jodydonetti commented 1 year ago

I also just pushed the feature branch, in case someone want to play with it.

It contains the whole implementation, along with some nice extra extension methods and all the tests ready to check.

If anyone is interested in this, I'd like opinions.

Thanks!

aKzenT commented 1 year ago

I took a look at the feature branch code. The implementation is almost identical to what I have implemented myself to achieve this, so I'm quite happy with the design. Some minor remarks and suggestions:

You are using IOptionsSnapshot to get the named options. My understanding is that IOptionsSnapshot is a scoped service and it is best practice to avoid singleton services (FusionCache) to have dependencies on scoped services (shorter lifetime). I have used IOptionsMonitor instead, which is singleton.
I think "IFusionCacheResolver" would be a better name than "IFusionCacheProvider" in the context of DI, but maybe that is just a matter of taste.
I really dislike the "useDistributedCacheIfAvailable" flag as it just silently fails if it is missing a dependency like the serializer and then you are running your app suddenly without a distributed cache. It's true that there is a log message, but that is easy to miss. It happened to me, when I first started using FusionCache and I can imagine I'm not the only one. I realize you inherited that flags from the existing method, but now might be a good time to change that design.

For my own version I have implemented a builder pattern for the registration that allows a syntax like:

serviceCollection
    .AddFusionCacheGroup("CMS")
    .EnableDistributedCache(new FusionCacheNewtonsoftJsonSerializer(
        new JsonSerializerSettings
        {
            PreserveReferencesHandling = PreserveReferencesHandling.Objects,
        }))
    .EnableDistributedCacheFlush(TimeSpan.FromHours(8));

By returning an instance of a simple helper class that looks like this:

public class FusionCacheGroupServiceConfigurationHelper
{
    /// <summary>
    /// Enables registration of additional setup actions through extension methods.
    /// </summary>
    public Action<IServiceProvider, ZiggyCreatures.Caching.Fusion.FusionCache> SetupCacheAction { get; set; } = (_, _) => { };
}

Extension methods can then be used to setup the specific cache instance:

    public static FusionCacheGroupServiceConfigurationHelper EnableDistributedCache(this FusionCacheGroupServiceConfigurationHelper helper, Func<IServiceProvider, IFusionCacheSerializer> setupSerializer)
    {
        helper.SetupCacheAction += (provider, cache) => cache.SetupDistributedCache(provider.GetRequiredService<IDistributedCache>(), setupSerializer(provider));

        return helper;
    }

I think this approach is a bit more user friendly than the setup action in the method as the latter requires the users to identify and to setup the dependencies for themselves. In other words: For the service collection itself you already have helper methods like AddFusionCacheNewtonsoftJsonSerializer and there should be an equivalent for this for a named cache. IIRC HttpClientFactory allows something similar.

Great work! Glad to see this being implemented.

jodydonetti commented 1 year ago

I took a look at the feature branch code. The implementation is almost identical to what I have implemented myself to achieve this, so I'm quite happy with the design.

Cool 😎

Some minor remarks and suggestions:

You are using IOptionsSnapshot to get the named options. My understanding is that IOptionsSnapshot is a scoped service and it is best practice to avoid singleton services (FusionCache) to have dependencies on scoped services (shorter lifetime). I have used IOptionsMonitor instead, which is singleton.

Dang it, you're right, my bad! That was my first implementation attempt.

I'll change that.

I really dislike the "useDistributedCacheIfAvailable" flag as it just silently fails if it is missing a dependency like the serializer and then you are running your app suddenly without a distributed cache.

On one hand I get your point, on the other hand though: 1) it would not "suddenly" disappear, since that would require a change in the distributed cache and/or in the serializer for it to disappear, which is not something you do constantly. I would more likely be there while you set it up first, or not, and if not the log may help you understand why. But again, I understand your point of view, and I'll think about it 2) without the useDistributedCacheIfAvailable param, FusionCache has to either not automatically pick up a distributed cache, or always pick it up, including the so called MemoryDistributedCache which is automatically injected into an mvc project and which is not actually a distributed cache

It's true that there is a log message, but that is easy to miss. It happened to me, when I first started using FusionCache and I can imagine I'm not the only one.

I thought about other ways to handle that, like an exception, but that seemed a little excessive. I'll think about it again, and see if I can come up with a better way.

I realize you inherited that flags from the existing method, but now might be a good time to change that design.

For my own version I have implemented a builder pattern for the registration that allows a syntax like:

serviceCollection
    .AddFusionCacheGroup("CMS")
    .EnableDistributedCache(new FusionCacheNewtonsoftJsonSerializer(
        new JsonSerializerSettings
        {
            PreserveReferencesHandling = PreserveReferencesHandling.Objects,
        }))
    .EnableDistributedCacheFlush(TimeSpan.FromHours(8));

By returning an instance of a simple helper class that looks like this:

public class FusionCacheGroupServiceConfigurationHelper
{
    /// <summary>
    /// Enables registration of additional setup actions through extension methods.
    /// </summary>
    public Action<IServiceProvider, ZiggyCreatures.Caching.Fusion.FusionCache> SetupCacheAction { get; set; } = (_, _) => { };
}

Extension methods can then be used to setup the specific cache instance:

    public static FusionCacheGroupServiceConfigurationHelper EnableDistributedCache(this FusionCacheGroupServiceConfigurationHelper helper, Func<IServiceProvider, IFusionCacheSerializer> setupSerializer)
    {
        helper.SetupCacheAction += (provider, cache) => cache.SetupDistributedCache(provider.GetRequiredService<IDistributedCache>(), setupSerializer(provider));

        return helper;
    }

Honestly? Apart from the "group" naming, I really like the idea of a builder pattern, and wanted to play with it for some time because the current ext method signature is not very extensible and evolvable. This may be a good time to play with such approach, thanks for pointing that out!

I'll play with it and prepare an issue to keep track of the design and impl.

Great work! Glad to see this being implemented.

Yes, I'm happy too, it was in the backlog for too much time now 🎉

aKzenT commented 1 year ago

Hey, the "group" naming was just copied from my own implementation.

Regarding the useDistributedCacheIfAvailable, I would argue that the target group of people that have IDistributedCache (with something other than MemoryDistributedCache) but do not want to use this distributed cache in the FusionCache is likely quite small. Throwing an error here helps save time for the people that want to use distributed cache but forgot to register a serializer. And even for the first group of people it is probably better to throw an exception, because they don't want to use distributed cache but forgot to make that explicit by setting the parameter to false.

So my suggestion would be: Make using distributed cache the default. Throw an exception, when the serializer is missing or MemoryDistributedCache is being used and then allow an override using parameters or the new method builder pattern to allow usage without distributed cache.

jodydonetti commented 1 year ago

Hi @aKzenT , I just pushed an update to the same feature branch, which now also contains the builder pattern support.

I'm tracking the design and development in another issue here just to keep things clean, let me know what you think.

Thanks!

jasenf commented 1 year ago

Would this implementation work as a tenant-specific cache?

So if we had 10,000 active customers and wanted each to have their own isolated cache, this would be an efficient way to pull it off?

jodydonetti commented 1 year ago

Hi @jasenf , sorry for the delay I missed this.

Would this implementation work as a tenant-specific cache? So if we had 10,000 active customers and wanted each to have their own isolated cache, this would be an efficient way to pull it off?

Theoretically yes, BUT since I assume the tenants would not be all known at startup (and they hopefully grow over time while the app is running) I would take a different approach, since trying to register them all at startup with something like AddFusionCache($"tenant-{tenantId}", ...) would definitely be a bad approach.

Instead I rather go with something like this:

public class TenantCacheProvider
{
    private readonly Dictionary<string, IFusionCache> _caches = new Dictionary<string, IFusionCache>();
    private readonly object _mutex = new object();

    public IFusionCache CreateCacheForTenant(string tenantId)
    {
        var options = new FusionCacheOptions()
        {
            CacheName = $"Tenant_{tenantId}"
            // YOUR OPTIONS HERE
        };

        var cache = new FusionCache(options);

        // YOUR SETUP HERE
        cache.SetupBackplane(...);

        return cache;
    }

    public IFusionCache GetCacheForTenant(string tenantId)
    {
        IFusionCache cache;

        if (_caches.TryGetValue(tenantId, out cache))
            return cache;

        lock (_mutex)
        {
            if (_caches.TryGetValue(tenantId, out cache))
                return cache;

            cache = CreateCacheForTenant(tenantId);

            _caches[tenantId] = cache;
        }

        return cache;
    }
}

You can then register this into the DI container, or create an ITenantCacheProvider interface if you like, etc. You may also add a ctor with some dependencies to be injected via DI and so on. Regarding the FusionCache instantiation and setup, that may be some logic injected too, maybe via a lambda or a factory of your own or something,

Hope this helps.

jasenf commented 1 year ago

Correct me if I'm wrong but this approach doesnt really work in a distributed cache because its still not handling any kind of real partitioning within the cache store (ie. like Redis). It looks like it kinda works for in-memory though I'm not sure having 10,000 cache instances makes a lot of sense.

Ultimately, we are still using EasyCaching because it has support for massive key deletes in a single call. We prepend all of our key values with the tenant ID and whenever something changes in a tenant we just invalidate that tenant using Delete($"{tenantId}_*") It works great.

It would be nicer if EasyCache or FusionCache were simply dealing with this on their own though. I understand your thought that the Cache shouldn't have anything to do with key generation but I'm not 100% sure I'm always in agreement (Databases and libraries like EntityFramework) happily deal with primary key generation and management). They also let you build Filters that are applied to all incoming queries so a WHERE TenantId=xx is added to all SQL.

I think it would be nice if a cache store let us define "tenants" (or whatever you want to call it), and it dealt with each tenant officially via parameters like Set(tenantId, key, value) then just prepended tenantId to each key. Behind the scene implementation of this isolation could be left to the store. InMemory would be some Dictionary of course, but Redis could just be a simple prepending of keys. Then a call like DeleteBatch() would allow us to invalidate a full tenant.

jodydonetti commented 1 year ago

Hi @jasenf ,

Correct me if I'm wrong but this approach doesnt really work in a distributed cache because its still not handling any kind of real partitioning within the cache store (ie. like Redis).

Well first, if I remember correctly, you were the one with the scenario of memory + backplane (without distributed), so I tried to create the example based on that and, so, no distributed cache.

Second: yes, it does, if you correctly configure the distributed cache to use the specific way of really splitting data of your desired distributed cache impl. For example in the Redis version there's the concept of a database index (no Redis cluster), in the SqlServer version there are schema name and table name, and so on. So by using those yes, you can totally get actual separation of data per each tenant.

It looks like it kinda works for in-memory though I'm not sure having 10,000 cache instances makes a lot of sense.

You asked for "each to have their own isolated cache" and I gave you that 🤷‍♂️

It would be nicer if EasyCache or FusionCache were simply dealing with this on their own though. I understand your thought that the Cache shouldn't have anything to do with key generation but I'm not 100% sure I'm always in agreement (Databases and libraries like EntityFramework) happily deal with primary key generation and management). They also let you build Filters that are applied to all incoming queries so a WHERE TenantId=xx is added to all SQL.

Regarding this whole cache prefix thing, take a look at this comment of mine from today, maybe you'll be interested.

Ultimately, we are still using EasyCaching because it has support for massive key deletes in a single call. We prepend all of our key values with the tenant ID and whenever something changes in a tenant we just invalidate that tenant using Delete($"{tenantId}_*") It works great.

I understand, and I would like massive methods too, but we've been there before: FusionCache has been designed to use the 2 common caching abstractions in .NET, IMemoryCache and IDistributedCache.

Because of this design decision, any available impl of those abstractions can be easily used without anyone having to explicitly create an implementation of a, say, custom IFusionCacheLayer or something like that.

Of course this has advantages - like the one mentioned above - but also disadvantages, like being limited to the features defined in those abstractions.

So, if there's no "delete by prefix" in those abstractions, that is a show stopper.

I think it would be nice if a cache store let us define "tenants" (or whatever you want to call it), and it dealt with each tenant officially via parameters like Set(tenantId, key, value) then just prepended tenantId to each key. Behind the scene implementation of this isolation could be left to the store. InMemory would be some Dictionary of course, but Redis could just be a simple prepending of keys. Then a call like DeleteBatch() would allow us to invalidate a full tenant.

Apart from the DeleteBatch() call you mention, the other parts are already possible with the current named caches + builder pattern + cache key prefix design. You can use the approach I mentioned a couple of comments above to handle different named caches, one per each tenant. All of these instances of FusionCache can be configured to use, for example, the same shared IMemoryCache and IDistributedCache instances and use some tenant-related cache key prefix to avoid collisions. Or, if you like, different cache instances as I mentioned above. Or a mix of these approaches. Of course, all of this would be true by simply creating cache keys that already include the tenant id, but I understand the appeal of defining a tenant/prefix once and have the cache deal with it.

Anyway, as said in this comment, FusionCache V2 may change that design decision. It's early and I'm almost ready for the V1, but it is already being experimented with.

jodydonetti commented 1 year ago

Hi all, I just released a v0.20.0-preview1 🎉

jodydonetti commented 1 year ago

Hi all, I just release v0.20.0-preview2 🎉

Unless some problems will came up this week, the next weekend I'll release the final v0.20.0.

jodydonetti commented 1 year ago

Hi everyone, I've finally released the new v0.20.0 🎉

ZiggyCreatures / FusionCache