BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

Does LiteLLM support Redis Sentinel? #4381

Closed Semihal closed 1 month ago

Semihal commented 4 months ago

Does LiteLLM support Redis Sentinel, or only the sharded version?

krrishdholakia commented 2 months ago

hey @Semihal @acuciureanu what's the difference between redis sentinel and redis cluster?

Adding support for redis cluster now - https://github.com/BerriAI/litellm/pull/5325

acuciureanu commented 2 months ago

Redis Sentinel

Redis Sentinel provides high availability for Redis deployments through a system of monitoring, notification, and automatic failover. This solution operates by continuously checking the health of master and replica nodes. Upon detecting a master failure, Sentinel initiates an automatic failover process, promoting a replica to master status.

Sentinel maintains a single-master architecture, so all write operations go to one node. This keeps the operational model simple, but the master can become a performance bottleneck as load grows. Replication to the replicas is asynchronous, so a write acknowledged shortly before a failover can still be lost. Data isn't sharded across multiple nodes, which limits the total dataset size to the capacity of a single server.

Sentinel excels in scenarios where reliable failover is crucial, but the data volume and throughput requirements can be met by a single master node. It's particularly well-suited for applications that prioritize a simpler operational model over horizontal scalability.
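
To make the failover flow more concrete, here is a minimal sketch of the decisions involved, not Sentinel's actual implementation. The `Replica` class and the function names are hypothetical. The rules modeled are the documented ones: a master is treated as objectively down once `quorum` Sentinels agree, a failover additionally needs votes from a majority of all Sentinels, and replica selection prefers lower `replica-priority` (0 means never promote), then a higher replication offset, then the lexicographically smaller run ID. Real Sentinel also filters out replicas that have been disconnected for too long, which this sketch omits.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    name: str
    priority: int      # replica-priority; 0 means "never promote"
    repl_offset: int   # how much of the master's replication stream was processed
    run_id: str


def is_objectively_down(down_votes: int, quorum: int) -> bool:
    """The master is flagged as objectively down once `quorum` Sentinels report it down."""
    return down_votes >= quorum


def failover_authorized(votes_for_leader: int, total_sentinels: int) -> bool:
    """Starting a failover needs votes from a majority of all known Sentinels."""
    return votes_for_leader > total_sentinels // 2


def pick_replica(replicas: List[Replica]) -> Optional[Replica]:
    """Prefer lower priority, then higher replication offset, then smaller run ID."""
    candidates = [r for r in replicas if r.priority != 0]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


replicas = [
    Replica("replica-a", priority=100, repl_offset=5000, run_id="b1"),
    Replica("replica-b", priority=100, repl_offset=7000, run_id="a9"),
    Replica("replica-c", priority=0, repl_offset=9000, run_id="a1"),  # excluded
]

# Three Sentinels, quorum of 2: two report the master down and two vote for the leader.
if is_objectively_down(2, quorum=2) and failover_authorized(2, total_sentinels=3):
    promoted = pick_replica(replicas)
    print(promoted.name)  # replica-b
```

Note that the quorum only controls failure detection; the majority requirement is why a failover cannot proceed when most Sentinels are unreachable.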

Redis Cluster

Redis Cluster is engineered for horizontal scalability, employing a distributed architecture that automatically partitions data across multiple nodes. This design allows Redis Cluster to handle significantly larger datasets and higher throughput compared to a single-instance Redis deployment.

The cluster uses a hash slot mechanism to distribute keys across nodes. Every key maps to one of 16384 hash slots (CRC16 of the key, modulo 16384), and each slot is assigned to exactly one master node at a time. This approach enables the cluster to scale out by adding more nodes and moving hash slots to them.
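
As an illustration of the slot mapping, the sketch below computes a key's hash slot the way the Redis Cluster specification describes it: CRC16 (XMODEM variant) of the key, modulo 16384, with hash tags (`{...}`) restricting which part of the key is hashed. The function name `key_slot` is my own; the sketch relies on the standard library's `binascii.crc_hqx`, which with an initial value of 0 computes that CRC variant.

```python
import binascii

HASH_SLOTS = 16384


def key_slot(key: str) -> int:
    """Return the Redis Cluster hash slot for `key`."""
    data = key.encode("utf-8")
    # Hash tag rule: if the key contains "{...}" with at least one character
    # between the first "{" and the first "}" after it, hash only that part.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


print(key_slot("foo"))
# Keys sharing a hash tag land in the same slot, so multi-key commands can use them together.
print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))  # True
```

On a real cluster, `CLUSTER KEYSLOT <key>` reports the same value, which is a quick way to check a client-side calculation.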

Redis Cluster integrates high availability features directly into its architecture. Each master node can have multiple replica nodes, and the cluster can automatically fail over to a replica if a master fails. This eliminates the need for an external monitoring system like Sentinel.

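The distributed nature of Redis Cluster makes it ideal for applications dealing with large-scale data and requiring high write throughput. However, it does not guarantee strong consistency: each key lives on a single master, and that master replicates to its replicas asynchronously, so a write that was acknowledged to the client can be lost if the master fails before the replica receives it.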
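
The window in which an acknowledged write can be lost is easier to see in a toy model. The sketch below is purely illustrative (the `Node` and `Shard` classes are hypothetical and involve no real Redis connection): the master acknowledges a write immediately, replication happens later, and a failover in between drops whatever has not been replicated yet. The same asynchronous replication applies to a Sentinel-managed master and its replicas.

```python
class Node:
    def __init__(self):
        self.data = {}


class Shard:
    """One master and one replica; replication is applied lazily to mimic async behaviour."""

    def __init__(self):
        self.master = Node()
        self.replica = Node()
        self.backlog = []  # writes acknowledged to the client but not yet replicated

    def write(self, key, value):
        self.master.data[key] = value
        self.backlog.append((key, value))
        return "OK"  # acknowledged before the replica has the write

    def replicate(self):
        """Apply all pending writes to the replica."""
        for key, value in self.backlog:
            self.replica.data[key] = value
        self.backlog.clear()

    def failover(self):
        """Promote the replica; anything still in the backlog is lost."""
        self.master, self.replica = self.replica, Node()
        self.backlog.clear()


shard = Shard()
shard.write("a", "1")
shard.replicate()      # "a" reaches the replica
shard.write("b", "2")  # acknowledged, but not replicated yet
shard.failover()       # master dies before replication catches up
print(shard.master.data)  # {'a': '1'}
```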

Key Technical Differences

  1. Architecture Complexity: Sentinel operates on top of standard Redis instances, adding a layer for monitoring and failover. Cluster, on the other hand, fundamentally changes Redis's architecture, requiring cluster-aware Redis nodes and clients.

  2. Scalability Mechanism: Sentinel doesn't provide data sharding, so write capacity and dataset size scale only vertically with the master node (reads can be offloaded to replicas). Cluster implements horizontal scalability through data sharding across multiple master nodes.

  3. Client-Side Logic: With Sentinel, clients ask the sentinel nodes for the address of the current master, then connect directly to that Redis node. Cluster requires clients to understand the cluster topology: they map each key to its hash slot, cache which node owns each slot, and follow MOVED and ASK redirections when that cached map is out of date.

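  4. Data Consistency Model: Neither setup guarantees strong consistency, because both rely on asynchronous replication. With Sentinel's single master, reads and writes against the master are consistent during normal operation, but writes acknowledged just before a failover can be lost. Cluster has the same limitation per shard, and clients may additionally be redirected between nodes during resharding or failover events.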

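  5. Network Partition Resilience: Both need a majority to carry out a failover. In Cluster, masters on the minority side of a partition stop accepting writes after the node timeout, while the majority side promotes replicas and keeps serving; by default, however, the cluster stops serving queries if any hash slot is left uncovered. With Sentinel, an isolated old master can keep accepting writes that are discarded when it rejoins as a replica, unless options such as min-replicas-to-write are configured to limit this.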

  6. Resource Overhead: Cluster nodes carry additional memory overhead for maintaining cluster state and key distribution information. Sentinel has minimal overhead on Redis nodes themselves, with the sentinel processes being relatively lightweight.

  7. Replication Architecture: Both support replication, but Cluster integrates it more deeply. In Cluster, replicas are integral to the system's scalability and availability model, while in Sentinel, they primarily serve as failover candidates.

  8. Use Case Alignment: Sentinel is often the choice for applications that need high availability while their dataset and write load fit on a single master. Cluster is preferable for use cases demanding massive scalability, high write throughput, or handling datasets that exceed single-node capacity.
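
The client-side logic from item 3 can be sketched with a toy, in-memory model. Everything here is hypothetical (`FakeNode`, `ToyClusterClient`, and the four-slot key mapping are stand-ins, not a real client library): the client starts with an empty slot cache, sends a command to an arbitrary node, and learns the slot's owner from the MOVED-style error it gets back.

```python
class MovedError(Exception):
    """Mimics a MOVED reply: the slot lives on another node."""

    def __init__(self, slot, owner):
        super().__init__(f"MOVED {slot} {owner}")
        self.slot = slot
        self.owner = owner


class FakeNode:
    """Stand-in for a cluster node that owns a set of slots."""

    def __init__(self, name, slots, owners):
        self.name = name
        self.slots = set(slots)
        self.owners = owners  # authoritative slot -> node name map
        self.store = {}

    def set(self, slot, key, value):
        if slot not in self.slots:
            raise MovedError(slot, self.owners[slot])
        self.store[key] = value
        return "OK"


class ToyClusterClient:
    def __init__(self, nodes, slot_of):
        self.nodes = nodes      # node name -> FakeNode
        self.slot_of = slot_of  # function mapping key -> slot
        self.slot_cache = {}    # slot -> node name, learned lazily
        self.redirects = 0

    def set(self, key, value):
        slot = self.slot_of(key)
        # Use the cached owner if known, otherwise try the first node.
        target = self.slot_cache.get(slot, next(iter(self.nodes)))
        try:
            return self.nodes[target].set(slot, key, value)
        except MovedError as err:
            self.redirects += 1
            self.slot_cache[err.slot] = err.owner  # remember the new owner
            return self.nodes[err.owner].set(slot, key, value)


# Toy topology: 4 slots instead of 16384, split across two nodes.
owners = {0: "node-a", 1: "node-a", 2: "node-b", 3: "node-b"}
nodes = {
    "node-a": FakeNode("node-a", [0, 1], owners),
    "node-b": FakeNode("node-b", [2, 3], owners),
}
client = ToyClusterClient(nodes, slot_of=lambda key: len(key) % 4)

client.set("abc", "1")   # slot 3: first try hits node-a, gets redirected to node-b
client.set("xyz", "2")   # slot 3 again: served from the cache, no redirect
print(client.redirects)  # 1
```

Real cluster clients typically go further and refresh their whole slot map after a redirection instead of updating a single slot, and they also handle ASK redirections during slot migration, which this sketch leaves out.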