sypark9646 / paper-logs

Twine: A Unified Cluster Management System for Shared Infrastructure #25

Open sypark9646 opened 1 year ago

sypark9646 commented 1 year ago

What is this paper about? 👋

Briefly describe what the paper covers! (Even one or two short lines is fine!)

Abstract 🕵🏻‍♂️

We present Twine, Facebook's cluster management system which has been running in production for the past decade. Twine has helped convert our infrastructure from a collection of siloed pools of customized machines dedicated to individual workloads, into a large-scale shared infrastructure with fungible hardware. Our goal of ubiquitous shared infrastructure leads us to some decisions counter to common practices. For instance, rather than deploying an isolated control plane per cluster, Twine scales a single control plane to manage one million machines across all data centers in a geographic region and transparently move jobs across clusters. Twine accommodates workload-specific customization in shared infrastructure, and this approach further departs from common practices. The TaskControl API allows an application to collaborate with Twine to handle container lifecycle events, e.g., restarting a ZooKeeper deployment's followers first and its leader last during a rolling upgrade. Host profiles capture hardware and OS settings that workloads can tune to improve performance and reliability; Twine dynamically allocates machines to workloads and switches host profiles accordingly. Finally, going against the conventional wisdom of prioritizing stacking workloads on big machines to increase utilization, we universally deploy power-efficient small machines outfitted with a single CPU and 64GB RAM to achieve higher performance per watt, and we leverage autoscaling to improve machine utilization.

We describe the design of Twine and share our experience in migrating Facebook's workloads onto shared infrastructure.
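
The abstract's TaskControl example (restart a ZooKeeper deployment's followers first and its leader last) can be illustrated with a small sketch. This is not Twine's actual API: `Task`, `restart_order`, `plan_batches`, and the `max_concurrent` parameter are hypothetical names chosen for illustration. It only shows the kind of ordering decision an application-side controller might hand back to the cluster manager during a rolling upgrade.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical view of one container in a deployment."""
    task_id: str
    role: str  # "leader" or "follower" (assumed labels)


def restart_order(tasks):
    """Order tasks so that followers restart before any leader.

    sorted() is stable, so tasks with the same role keep their
    original relative order.
    """
    return sorted(tasks, key=lambda t: t.role == "leader")


def plan_batches(tasks, max_concurrent=1):
    """Split the restart order into batches of at most max_concurrent tasks.

    Followers and leaders are never mixed in one batch, so the leader
    only restarts after every follower batch has been scheduled.
    """
    ordered = restart_order(tasks)
    followers = [t for t in ordered if t.role != "leader"]
    leaders = [t for t in ordered if t.role == "leader"]
    batches = []
    for group in (followers, leaders):
        for i in range(0, len(group), max_concurrent):
            batches.append(group[i:i + max_concurrent])
    return batches


if __name__ == "__main__":
    deployment = [
        Task("zk-0", "follower"),
        Task("zk-1", "leader"),
        Task("zk-2", "follower"),
    ]
    for batch in plan_batches(deployment, max_concurrent=2):
        print([t.task_id for t in batch])
```

Keeping the ordering logic in the application rather than in the scheduler reflects the collaboration the abstract describes: the cluster manager decides that containers need to restart, while the application decides which ones may go first.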

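Tell us what can be learned from reading this paper! 🤔

What knowledge can a reader gain from reading this paper carefully?

Are there any articles or issues worth reading alongside it?

If so, feel free to list them!

Please provide the reference URL! 🔗

Don't shorten it with markdown; just paste the original link as is!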