dotnet / extensions

This repository contains a suite of libraries that provide facilities commonly needed when creating production-ready applications.

[API Proposal]: Cache Synchronization for Hybrid Caching in Multi-Node Environments #5517

Open IbrahimMNada opened 2 days ago

IbrahimMNada commented 2 days ago

Background and motivation

In a multi-replica environment utilizing hybrid caching (in-memory and out-of-process), cache desynchronization between nodes can occur because there is no built-in mechanism to synchronize in-memory caches across nodes behind a load balancer. This results in inconsistent cache states, reducing the reliability of the system.

This proposal addresses the problem by introducing an event-driven mechanism to ensure cache synchronization across nodes.

Problem Context

Hybrid caching involves two main components:

  1. Out-of-process cache: This ensures a single source of truth, making cache invalidation simple and effective across nodes.
  2. In-memory cache: While useful for quick access, it poses challenges in multi-node environments due to the lack of cross-node communication when caches are reset.

When a cache is reset in one node, other nodes do not get notified, leading to cache desynchronization across the system.

Problem Statement

The current hybrid caching model does not offer a built-in mechanism to notify all nodes about an in-memory cache reset, resulting in inconsistent cache states between nodes in a multi-node environment.

API Proposal

Proposed Solution

Overview

Introduce a Publisher-Subscriber model using webhooks, event queues, or other notification mechanisms to propagate cache reset events to all nodes using the hybrid cache. This model will allow one node (the Publisher) to notify other nodes (Subscribers) when a cache reset happens, ensuring synchronization of the in-memory cache across all nodes.

Key Features

  1. Webhook/callback mechanism: Each node registers as a Subscriber to receive notifications when cache resets happen. The node initiating the reset acts as the Publisher.

  2. Retry strategy: In case of a failure in notifying a node, the system retries the notification, ensuring robustness in cache synchronization.

  3. Multi-provider support: While webhooks are the default, the design allows support for other messaging systems like event queues, SignalR, etc.
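
To make the retry behaviour concrete, here is a minimal sketch of the publishing side. Everything in it is an assumption for illustration: `CacheResetPublisher`, `NotifyAllAsync`, and the `send` delegate are hypothetical names and not part of HybridCache or of the API proposed below. The transport is injected as a delegate so the same loop could sit on top of a webhook call, a queue, or a SignalR hub.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch; none of these types exist in the HybridCache library.
public sealed class CacheResetPublisher
{
    private readonly IReadOnlyList<Uri> _subscribers;
    // Transport abstraction (assumption): returns true when the subscriber acknowledged.
    private readonly Func<Uri, string, CancellationToken, Task<bool>> _send;
    private readonly int _maxAttempts;
    private readonly TimeSpan _baseDelay;

    public CacheResetPublisher(
        IReadOnlyList<Uri> subscribers,
        Func<Uri, string, CancellationToken, Task<bool>> send,
        int maxAttempts = 3,
        TimeSpan? baseDelay = null)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        _subscribers = subscribers ?? throw new ArgumentNullException(nameof(subscribers));
        _send = send ?? throw new ArgumentNullException(nameof(send));
        _maxAttempts = maxAttempts;
        _baseDelay = baseDelay ?? TimeSpan.FromMilliseconds(200);
    }

    // Notifies all subscribers concurrently and returns those that never acknowledged.
    public async Task<IReadOnlyList<Uri>> NotifyAllAsync(string cacheKey, CancellationToken ct = default)
    {
        var tasks = new List<Task<(Uri Uri, bool Ok)>>();
        foreach (var uri in _subscribers)
        {
            tasks.Add(NotifyOneAsync(uri, cacheKey, ct));
        }

        var results = await Task.WhenAll(tasks);
        var failed = new List<Uri>();
        foreach (var result in results)
        {
            if (!result.Ok) failed.Add(result.Uri);
        }
        return failed;
    }

    private async Task<(Uri Uri, bool Ok)> NotifyOneAsync(Uri uri, string cacheKey, CancellationToken ct)
    {
        for (int attempt = 1; attempt <= _maxAttempts; attempt++)
        {
            try
            {
                if (await _send(uri, cacheKey, ct)) return (uri, true);
            }
            catch (Exception ex) when (ex is not OperationCanceledException)
            {
                // A transport error counts as a failed attempt; fall through and retry.
            }

            if (attempt < _maxAttempts)
            {
                // Exponential backoff: baseDelay, then 2x, then 4x, ...
                await Task.Delay(TimeSpan.FromTicks(_baseDelay.Ticks << (attempt - 1)), ct);
            }
        }
        return (uri, false);
    }
}
```

For a webhook provider, the `send` delegate would typically wrap an `HttpClient` POST and return whether the response status indicated success. The list of failed subscribers is returned rather than thrown so the caller can decide whether to log, re-queue, or drop the notification.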

API Changes

  1. Add a CacheResetNotification class:

    • Encapsulates the logic for broadcasting cache reset events to other nodes.
    
    public class CacheResetNotification
    {
       public void NotifyAllSubscribers(string cacheKey);
       public void RegisterSubscriber(IList<Uri> subscriberUris);
       public void UnregisterSubscriber(IList<Uri> subscriberUris);
    }

API Usage

services.AddHybridCache(options =>
{
    // Proposed API: UsePublisherSubscriberModel and AddWebhookSubscriber do not exist yet.
    options.UsePublisherSubscriberModel()
           .AddWebhookSubscriber(new List<Uri> { new Uri("https://node1/reset") });
});

Alternative Designs

Polling Mechanism: This was dismissed due to inefficiency and increased load on the nodes.

Risks

We need to consider network flooding or excessive requests between nodes. To mitigate this, we could define a sync period between nodes, a key-count threshold, or a similar limit.
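
One way to picture the threshold idea is a small batcher that coalesces reset keys, so a single notification carries many keys instead of one request per key. This is only a sketch under assumptions: `ResetKeyBatcher`, `Add`, and `Drain` are hypothetical names, and the periodic timer that would call `Drain` is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not part of HybridCache or of the proposed API.
public sealed class ResetKeyBatcher
{
    private readonly object _gate = new object();
    private readonly HashSet<string> _pending = new HashSet<string>(StringComparer.Ordinal);
    private readonly int _threshold;

    public ResetKeyBatcher(int threshold)
    {
        if (threshold < 1) throw new ArgumentOutOfRangeException(nameof(threshold));
        _threshold = threshold;
    }

    // Records a key. Returns the batch to publish once the key-count threshold
    // is reached; otherwise returns an empty collection.
    public IReadOnlyCollection<string> Add(string key)
    {
        lock (_gate)
        {
            _pending.Add(key); // repeated resets of the same key collapse into one entry
            return _pending.Count >= _threshold ? DrainLocked() : Array.Empty<string>();
        }
    }

    // Intended to be called on a timer (the "sync period") to flush whatever is pending.
    public IReadOnlyCollection<string> Drain()
    {
        lock (_gate)
        {
            return DrainLocked();
        }
    }

    // Copies and clears the pending set; the caller must hold _gate.
    private string[] DrainLocked()
    {
        var batch = new string[_pending.Count];
        _pending.CopyTo(batch);
        _pending.Clear();
        return batch;
    }
}
```

The trade-off is staleness: other nodes can serve an outdated in-memory entry until the threshold is hit or the next timed flush, so the sync period bounds how long desynchronization can last.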

rsalus commented 22 hours ago

This would be a nice feature, but I'd prefer to use something like Redis pub/sub directly assuming that's what we are using for the L2 cache.

IbrahimMNada commented 10 hours ago

But that would couple everyone to Redis.

We could offer it as an optional provider, but we need a default that requires no extra setup, which is why I suggested webhooks.

We could support multiple providers, including but not limited to: (1) Redis pub/sub, (2) event buses, and (3) a SignalR hub.

IbrahimMNada commented 10 hours ago

Hello, @mgravell

Could you please give us your invaluable input on this? To be honest, I'm eager to help.