Capacity reservation is a common offering in public clouds and on-premise infrastructure, but no prior work provides capacity reservation with SLO guarantees.
The authors describe Facebook's region-scale Resource Allowance System (RAS), which has been running in production for almost two years.
The two-level architecture:
RAS is a new component of Twine #45 (Facebook's 10-year-old cluster manager).
Presented at SOSP '21. [ Paper | Video ]
Authors: Andrew Newell, Dimitrios Skarlatos, Jingyuan Fan, Pavan Kumar, Maxim Khutornenko, Mayank Pundir, Yirui Zhang, Mingjun Zhang, Yuanlai Liu, Linh Le, Brendon Daugherty, Apurva Samudra, Prashasti Baid, James Kneeland, Igor Kabiljo, Dmitry Shchukin, Andre Rodrigues, Scott Michelson, Ben Christensen, Kaushik Veeraraghavan, and Chunqiang Tang (Facebook, Carnegie Mellon University)