Abstract (Summary) 🕵🏻♂️
When deploying microservices in production clusters, it is critical to automatically scale containers to improve cluster utilization and ensure service level agreements (SLA). Although reactive scaling approaches work well for monolithic architectures, they are not necessarily suitable for microservice frameworks due to the long delay caused by complex microservice call chains. In contrast, existing proactive approaches leverage end-to-end performance prediction for scaling, but cannot effectively handle microservice multiplexing and dynamic microservice dependencies.
In this paper, we present Madu, a proactive microservice auto-scaler that scales containers based on predictions for individual microservices. Madu learns workload uncertainty to handle the highly dynamic dependency between microservices. Additionally, Madu adopts OS-level metrics to optimize resource usage while maintaining good control over scaling overhead. Experiments on large-scale deployments of microservices in Alibaba clusters show that the overall prediction accuracy of Madu can reach as high as 92.3% on average, which is 13% higher than the state-of-the-art approaches. Furthermore, experiments running real-world microservice benchmarks in a local cluster of 20 servers show that Madu can reduce the overall resource usage by 1.7X compared to reactive solutions, while reducing end-to-end service latency by 50%.
Please tell me what can be learned from reading this paper! 🤔
introduction: Microservices tend to be over-provisioned.
related works: auto-scalers
proactive: predicts resource demand and allocates additional resources before they run short, e.g., DUBNN [NeurIPS '17], BNN [NeurIPS '19]
reactive: a post-hoc approach that adds resources only once usage has already fallen short, e.g., K8s HPA, Google Autopilot [EuroSys '20]
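The two strategies above can be contrasted with a minimal sketch. All function names, thresholds, and capacities here are my own illustrations, not from the paper; the reactive rule follows the documented Kubernetes HPA formula, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric):

```python
import math

def reactive_scale(current_replicas: int, current_util: float,
                   target_util: float) -> int:
    """Reactive: react to what is happening right now.
    Mirrors the Kubernetes HPA rule:
    desired = ceil(current * currentMetric / targetMetric)."""
    return max(1, math.ceil(current_replicas * current_util / target_util))

def proactive_scale(predicted_load: float, per_replica_capacity: float) -> int:
    """Proactive: size the deployment for the *predicted* demand,
    so replicas are ready before the load actually arrives."""
    return max(1, math.ceil(predicted_load / per_replica_capacity))

# Reactive: utilization already hit 90% against a 50% target -> scale up now.
print(reactive_scale(current_replicas=4, current_util=0.9, target_util=0.5))  # 8
# Proactive: a forecast of 3.2 "replicas worth" of load -> provision 4 ahead of time.
print(proactive_scale(predicted_load=3.2, per_replica_capacity=1.0))  # 4
```

The difference the paper cares about: the reactive rule only fires after utilization has already spiked, which is too late when a request must traverse a long microservice call chain.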
method: applies the concept of uncertainty when defining the workload; by predicting the workload's uncertainty, microservice performance can also be predicted linearly alongside it
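One way to read this idea as code (a sketch under my own assumptions, not Madu's actual model): if the predictor outputs both a mean and an uncertainty estimate (say, a standard deviation) for the upcoming workload, the scaler can provision for mean + k·σ, so bursty, hard-to-predict microservices automatically get extra headroom:

```python
import math

def provision(pred_mean: float, pred_std: float,
              per_replica_rps: float, k: float = 2.0) -> int:
    """Provision for the predicted mean plus k standard deviations of
    workload uncertainty: more uncertain services get more headroom."""
    expected_peak = pred_mean + k * pred_std
    return max(1, math.ceil(expected_peak / per_replica_rps))

# Two services with the same mean load but different uncertainty:
print(provision(pred_mean=1000, pred_std=50, per_replica_rps=200))   # 6
print(provision(pred_mean=1000, pred_std=300, per_replica_rps=200))  # 8
```

The hypothetical `k` plays the role of a safety factor: raising it trades resource efficiency for a lower chance of under-provisioning.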
experiment: evaluated on Alibaba microservice traces and benchmark tests -> Madu showed high accuracy on both small-scale and large-scale workloads, with the added advantage of low prediction latency
conclusion & discussion: the authors note as a limitation that Madu can achieve optimal efficiency for the system as a whole, but cannot simultaneously meet the desired service level agreement (SLA) targets.
What is this paper about? 👋
A study on how to scale containers efficiently through workload prediction
Are there any related articles or issues worth reading alongside this paper?
Please share the reference URLs! 🔗
Don't shorten them with markdown; just write the original links as-is!