tapaswenipathak / linux-kernel-stats

linux kernel stats (Publication [Journal, Magazine]). This repository has code files.
MIT License

Summary - Auto-Sharding for Datacenter Applications #105

Closed PankajGoswami11 closed 1 year ago

PankajGoswami11 commented 1 year ago

https://www.usenix.org/system/files/conference/osdi16/osdi16-adya.pdf

PankajGoswami11 commented 1 year ago

Slicer: Auto-Sharding for Datacenter Applications

Introduction: Slicer is a general-purpose sharding service developed at Google. It aims to provide an easy-to-use sharding solution for large-scale applications. It dynamically monitors load hotspots and server health to distribute work evenly across a set of servers while maintaining high availability and minimizing churn. Slicer separates concerns into a data plane that forwards requests and a control plane that makes load-balancing decisions, allowing it to combine the consistency and optimization quality of centralized sharding with the scalability and low latency of systems that make decisions locally. Slicer's API has been adopted by many Google applications and is used for purposes such as resource allocation, write coalescence, and increasing cache efficiency. It currently handles up to 7 million requests per second of production traffic and has been shown to use 63% fewer resources than static sharding.

Slicer Overview and API: Slicer distributes the workload of an application across a set of tasks within a datacenter. The unit of sharding is a key chosen by the application. Slicer has three components: a centralized Slicer Service, a library linked into application clients (the Clerk), and a library linked into application server tasks (the Slicelet). The Slicer Service generates an assignment mapping key ranges (slices) to tasks and distributes it to its subscribers (Clerks and Slicelets). The Clerk directs client requests for a key to the assigned task, and the Slicelet lets a task learn when it is assigned or relieved of a slice. The Slicer Service monitors load and task availability and generates new assignments to keep all keys available. Application code interacts with the Slicer Service only through the Clerk and Slicelet libraries. The Service is written in Java, and the libraries are available in C++, Java, and Go.
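A minimal sketch (in Go, one of the languages the client libraries ship in) of how an assignment might be represented as a mapping from slices to task addresses. The type and field names are illustrative assumptions, not the actual Slicer data model.

```go
package slicer

// Slice is a half-open range of 63-bit slice keys; this representation of a
// key range is an assumption for illustration.
type Slice struct {
	Start uint64 // inclusive
	End   uint64 // exclusive
}

// Assignment maps slices to the addresses of the tasks that serve them.
// With key redundancy, a slice may map to more than one task address.
type Assignment struct {
	Generation int64              // version number of this assignment
	Slices     map[Slice][]string // slice -> assigned task addresses
}
```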

Sharding Model: Slicer manages state placement by assigning application keys, which may be fine-grained or coarse-grained, to task replicas. Keys are hashed into a 63-bit slice-key space, which makes the workload representation independent of the number of keys and lets new keys be created without affecting the rest of the system. The cost of hashing is lost locality: contiguous application keys are scattered across the slice-key space. Some applications require all requests for the same key to be served by the same task; for these, Slicer offers a consistency guarantee. Others, with weaker consistency requirements, can use key redundancy, which allows each slice to be assigned to multiple tasks. Slicer enforces a minimum redundancy to protect availability and automatically increases replication for hot slices.
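A hedged sketch of how an application key could be hashed into the 63-bit slice-key space. The summary only says keys are hashed into 63-bit slice keys, so the particular hash function (FNV-1a here) and the masking step are assumptions.

```go
package slicer

import "hash/fnv"

// SliceKey hashes an arbitrary application key into a 63-bit slice-key
// space. FNV-1a is an illustrative choice; the production hash function is
// not specified in this summary.
func SliceKey(appKey string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(appKey))
	return h.Sum64() >> 1 // drop one bit so the result fits in 63 bits
}
```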

Slicelet Interface: As described above, the Slicer Service generates assignments of key ranges to tasks and distributes them to the Clerk and the Slicelet. The application server task interacts with the Slicer Service via the Slicelet API, which exposes the task's current slice assignments and allows it to report custom load metrics. The Clerk API maps keys to the addresses of the assigned tasks and integrates with Stubby and the GFE. Stubby is Google's RPC system; with Slicer integration it accepts an additional slice-key argument and directs the RPC to the selected task using Slicer's assignment. The GFE (Google Front End) is an HTTP proxy that routes requests to internal tasks; its Slicer integration extracts slice keys from request features.
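The summary names two Slicelet operations, getSliceKeyHandle and isAssignedContinuously (timed in the evaluation below). Below is a hedged Go rendering of what the Slicelet and Clerk surfaces might look like; the method sets and signatures (including ReportLoad and Lookup) are assumptions for illustration, not the published API.

```go
package slicer

// SliceKeyHandle lets a server task check, after doing work for a key,
// whether the key stayed assigned to it for the whole duration of the request.
type SliceKeyHandle interface {
	IsAssignedContinuously() bool
}

// Slicelet is the server-side library: it tells a task which slices it owns
// and lets it report custom load metrics.
type Slicelet interface {
	GetSliceKeyHandle(key uint64) SliceKeyHandle
	ReportLoad(key uint64, load float64) // assumed shape of custom-metric reporting
}

// Clerk is the client-side library: it maps a key to the address(es) of the
// task(s) currently assigned to serve it.
type Clerk interface {
	Lookup(key uint64) []string
}
```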

Slicer is used to manage in-memory dynamic caches and state for many services: managing meeting schedules, caching user contacts and metadata, analyzing stored data, managing public API metrics and logs, assigning languages in a speech recognition system, and assigning DNS records. Slicer assigns keys to tasks and routes incoming requests to the task that holds the required state. Its asymmetric key replication spreads traffic for hot keys across multiple tasks, improving efficiency and avoiding bottlenecks. Users include Flywheel, a mobile-optimization HTTP proxy; Google's Cloud DNS service; and a pub/sub system called Client Push.

In-memory Store Applications: In-memory store applications keep data in an in-memory cache for fast access. Slicer assigns the data to tasks and routes incoming requests to the appropriate task, which reduces manual operational overhead and allows quick local routing decisions. Two examples are speech recognition and Google's Cloud DNS service: in speech recognition, Slicer assigns languages to tasks, while in Cloud DNS it assigns DNS records to tasks.
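A hedged sketch of the request path this describes for an in-memory store server: the task checks its Slicelet before serving from the local cache, so a request for a key that has moved elsewhere is rejected and retried via the Clerk. The server and cache shapes are illustrative; the interfaces repeat the assumed ones from the earlier sketch so this snippet stands on its own.

```go
package store

// SliceKeyHandle and Slicelet repeat the assumed interfaces from the earlier
// sketch so this example is self-contained.
type SliceKeyHandle interface{ IsAssignedContinuously() bool }

type Slicelet interface {
	GetSliceKeyHandle(key uint64) SliceKeyHandle
}

// Server is an in-memory store task holding data for its assigned slices.
type Server struct {
	slicelet Slicelet
	cache    map[uint64][]byte
}

// Get serves a key only if its slice stayed assigned to this task for the
// whole request; otherwise the caller retries via the Clerk, which will have
// picked up the newer assignment.
func (s *Server) Get(key uint64) ([]byte, bool) {
	handle := s.slicelet.GetSliceKeyHandle(key)
	val, ok := s.cache[key]
	if !ok || !handle.IsAssignedContinuously() {
		return nil, false
	}
	return val, true
}
```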

Slicer Service Implementation: Slicer's backend balances the high-quality, consistent sharding decisions of a centralized system with the scalability, low latency, and fault tolerance of local decisions. The core of the backend is the Assigner, which collects health, task-provisioning, and load signals to produce a strongly consistent assignment of work to tasks. The implementation is highly distributed, with client-side caching, Distributors, and Backup Distributors that provide a backstop against failures. The Assigner runs a sharding algorithm to generate assignments, which are written to optimistically consistent storage; only a single preferred Assigner generates an assignment for a job at a time. Assignments reach subscribers through a two-tier distribution tree: subscribers ask Distributors for assignments, and a Distributor asks the Assigner if it does not have one. Slicer is designed to keep routing requests despite failures and has a backup assignment-retrieval path for when components fail.
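A hedged sketch of the two-tier pull path described above: a subscriber asks a Distributor for a job's current assignment, and the Distributor falls back to the Assigner only on a cache miss. The names, caching policy, and signatures here are assumptions; Assignment stands in for the slice-to-task mapping from the earlier sketch.

```go
package slicer

import "sync"

// Assignment stands in for the generated slice-to-task mapping.
type Assignment struct {
	Generation int64
}

// Assigner is the centralized component that produces assignments.
type Assigner interface {
	CurrentAssignment(job string) Assignment
}

// Distributor caches assignments so subscriber traffic does not all land on
// the Assigner.
type Distributor struct {
	mu       sync.Mutex
	assigner Assigner
	cache    map[string]Assignment // job -> latest assignment seen
}

// Get returns the cached assignment for a job, pulling from the Assigner
// only when this Distributor has not yet seen one for that job.
func (d *Distributor) Get(job string) Assignment {
	d.mu.Lock()
	defer d.mu.Unlock()
	if a, ok := d.cache[job]; ok {
		return a
	}
	a := d.assigner.CurrentAssignment(job)
	d.cache[job] = a
	return a
}
```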

Load Balancing: The goal of load balancing is to spread the workload evenly among available tasks so the system can absorb unexpected traffic surges without being overwhelmed. Slicer's algorithm minimizes load imbalance, defined as the ratio of the maximum task load to the mean task load. Slicer uses per-key load metrics and a key-range representation to decide when resharding is necessary, and produces a new assignment in five phases: 1) reassign keys from failed tasks, 2) adjust key redundancy, 3) merge adjacent cold slices, 4) apply the moves with the highest weight, where a move's weight is the reduction in load imbalance divided by the key churn it causes, and 5) split hot slices. In each iteration the algorithm considers only the hottest task and picks the best move: a reassignment, an added redundant assignment, or a removal. The constants in the algorithm were chosen by observing existing applications, and the results are not very sensitive to them.
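A hedged sketch of the two quantities the paragraph above defines: imbalance as max task load over mean task load, and a move's weight as the imbalance reduction it buys divided by the key churn it causes. The data shapes and the zero-churn guard are assumptions; this is not the full five-phase algorithm.

```go
package slicer

// imbalance is the ratio of the maximum task load to the mean task load;
// 1.0 means perfectly even load.
func imbalance(taskLoad []float64) float64 {
	if len(taskLoad) == 0 {
		return 0
	}
	var maxLoad, sum float64
	for _, l := range taskLoad {
		if l > maxLoad {
			maxLoad = l
		}
		sum += l
	}
	mean := sum / float64(len(taskLoad))
	if mean == 0 {
		return 0
	}
	return maxLoad / mean
}

// moveWeight scores a candidate reassignment: the reduction in imbalance it
// achieves divided by the key churn it causes. Higher is better; the
// algorithm repeatedly applies the highest-weight move affecting the hottest
// task.
func moveWeight(imbalanceBefore, imbalanceAfter, keyChurn float64) float64 {
	if keyChurn == 0 {
		return 0 // assumed guard against division by zero for churn-free moves
	}
	return (imbalanceBefore - imbalanceAfter) / keyChurn
}
```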

Strong Consistency: Slicer combines Chubby locks with its own mechanisms to provide strong consistency guarantees for keys. A central lease manager is not used because it would require substantial resources and would be hard to scale to Slicer's level; instead, Slicer uses three Chubby locks per job to provide a scalable lease-per-key abstraction. The Assigner acquires the exclusive job lease in order to change assignments and writes the assignment generation number into the guard lease. Slicelets acquire the guard lease for reading to ensure consistency. During assignment changes, a bridge lease keeps unchanged slices available, reducing the period of application unavailability. The consistent-assignment mechanism enforces the policy of at most one Slicelet per key; a more complex policy is possible but would require coordination by the application.
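A hedged sketch of the check the guarantee above relies on: the Assigner writes a generation number under the guard lease, and a Slicelet only treats a key as safely owned while it holds the guard lease and its cached assignment's generation still matches. The lease type and its fields are stand-ins for illustration, not the Chubby API.

```go
package slicer

// guardLease is a stand-in for the Chubby-based guard lease described above;
// the field and method names are assumptions for illustration only.
type guardLease struct {
	held       bool  // whether this Slicelet currently holds the lease for reading
	generation int64 // assignment generation number written by the Assigner
}

// mayServe reports whether a task can safely serve a key it believes it owns:
// it must hold the guard lease, and the generation of the assignment it
// cached must match the generation written by the Assigner, otherwise a newer
// assignment may already have moved the key elsewhere.
func (g *guardLease) mayServe(cachedGeneration int64) bool {
	return g.held && g.generation == cachedGeneration
}
```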

Evaluation: In the evaluation workload, the clients' distribution shifts every 19 minutes to move the hottest load onto different keys. The median delay for clients to shift load to tasks until the max/mean load imbalance drops below 1.2 is reported to be 480 seconds, a consequence of the roughly 1-minute delay in Google's monitoring system and Slicer's 5-minute load-observation window; one observation window is often not enough for Slicer to fully restore balance unless the load shift happens very early in the window. Slicer's algorithm (weighted-move with redundancy) reduces load imbalance better than the alternatives, with lower key churn than consistent hashing but higher key churn than static sharding. Asymmetric replication provides significant load-balancing benefits at the cost of a small increase in key churn, because it creates more opportunities to address imbalance. In a synthetic, dynamically skewed workload with 25 clients driving 43 Kreq/s against 50 server tasks, 99.85% of requests were satisfied with bridging, compared to 99.19% without it. The reported time for the getSliceKeyHandle operation was 153 µs and for isAssignedContinuously was 94 µs.

Related Work: Slicer uses a centralized algorithm for load balancing, which provides better load balancing and consistency guarantees than Orleans and Ringpop, which rely on client-based consistent hashing. Service Fabric does not support dynamic sharding and is more invasive to applications than Slicer's small API.

Social Hash and Slicer both separate coordinated central decision making from distributed forwarding. However, Social Hash optimizes placement based on inter-key locality in social graphs, while Slicer operates at a finer granularity in space and time and supports a wide variety of applications with consistent assignment.

Conclusion: Slicer's production deployment has shown that it meets its load-balancing and availability goals. Real-world applications see max:mean load ratios of 1.3 to 2.8, which helps with peak-load capacity planning. Slicer balances load better than load-aware consistent hashing while creating less key churn. It provides high availability, routing customer requests correctly at least 99.98% of the time, making it a building block for highly available applications. It has been adopted by over 20 projects with varied use cases, demonstrating the generality of its API.

KavitaMeena23 commented 1 year ago

include section 3.2 in the summary

duttabhishek0 commented 1 year ago

@tapaswenipathak