mental2008 / awesome-papers

Here are my personal paper reading notes (covering cloud computing, resource management, systems, machine learning, deep learning, and other interesting topics).
https://paper.lingyunyang.com/
MIT License

SOSP '21 | RAS: Continuously Optimized Region-Wide Datacenter Resource Allocation #18


mental2008 commented 2 years ago

Presented at SOSP '21. [Paper | Video]

Authors: Andrew Newell, Dimitrios Skarlatos, Jingyuan Fan, Pavan Kumar, Maxim Khutornenko, Mayank Pundir, Yirui Zhang, Mingjun Zhang, Yuanlai Liu, Linh Le, Brendon Daugherty, Apurva Samudra, Prashasti Baid, James Kneeland, Igor Kabiljo, Dmitry Shchukin, Andre Rodrigues, Scott Michelson, Ben Christensen, Kaushik Veeraraghavan, and Chunqiang Tang

Affiliations: Facebook, Carnegie Mellon University

mental2008 commented 2 years ago

Existing Problem

Capacity reservation is offered in both public clouds and on-premise infrastructure, but no prior work provides capacity reservation with SLO guarantees.

Solution

The authors describe Facebook's region-scale Resource Allowance System (RAS), which has been running in production for almost two years.

[Figure: the two-level architecture of RAS]

RAS is a new component of Twine #45 (Facebook’s 10-year-old cluster manager).