ganler / ResearchReading

General system research material (not limited to paper) reading notes.
GNU General Public License v3.0

[MLOS Seminar] Systems and ML: Opportunities and Challenges for Symbiotic Research #33

Closed by ganler 4 years ago

ganler commented 4 years ago

Shiv's talk: https://www.youtube.com/watch?v=t-ClkgN2RY0&feature=youtu.be

Seminar: https://remziarpacidusseau.wixsite.com/mlos

ganler commented 4 years ago

Shiv's Talk

Paper: Too Many Knobs to Tune? Towards Faster Database Tuning by Pre-selecting Important Knobs

Authors: Konstantinos Kanellis, Ramnatthan Alagappan, and Shivaram Venkataraman

https://www.usenix.org/system/files/hotstorage20_paper_kanellis.pdf

ML for DB configuration tuning:

ML Systems: DL workload scheduling

Themis: Fair scheduling.

The key features of DL workloads:

Sharing incentive (SI):

When N apps share one cluster, no app should perform worse than it would with a private 1/N share of the resources.

Interface: the scheduler gets $\rho$ estimates (the fairness metric) from each app via its Agent.

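Metric: finish-time fairness $\rho = T_{sh} / T_{id}$, where $T_{sh}$ is the app's finish time in the shared cluster and $T_{id}$ is its finish time in an independent 1/N share of the cluster. The sharing incentive holds when $\rho \le 1$.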
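A minimal sketch of the metric, assuming the two finish times are already known (in the talk they come from Agent estimates). The function names `finish_time_fairness` and `satisfies_sharing_incentive` and the sample numbers are hypothetical, chosen only for illustration:

```python
from typing import Iterable


def finish_time_fairness(t_shared: float, t_independent: float) -> float:
    """Return rho = T_sh / T_id; values above 1 mean sharing hurts the app."""
    if t_independent <= 0:
        raise ValueError("t_independent must be positive")
    return t_shared / t_independent


def satisfies_sharing_incentive(rhos: Iterable[float]) -> bool:
    """Sharing incentive holds when even the worst-off app has rho <= 1."""
    return max(rhos) <= 1.0


# Hypothetical finish times in hours: (shared cluster, private 1/N share).
rhos = [finish_time_fairness(8.0, 10.0), finish_time_fairness(12.0, 10.0)]
```

Here the second app finishes later in the shared cluster than it would alone, so the check fails for this pair.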

Strawman mechanism: I did not fully understand how it actually operates at this step. I should probably look at the paper.

Observations: the average running time is 3.7 hours, but most apps run either about 5x longer or about 5x shorter.

Other systems:

DRF: allocates resources on task completion to the job with the minimum metric (no preemption). Short tasks may wait a long time behind long-running jobs.

Tiresias: the metric is GPUs allocated × time; resources go to the jobs with the minimum metric. Drawback: ignores locality.
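The "GPUs allocated × time" metric noted for Tiresias can be sketched as follows. This is a simplified illustration of picking the job with the minimum metric, not Tiresias's actual implementation; the `Job` class, `pick_next` function, and sample jobs are all hypothetical:

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Job:
    name: str
    gpus: int          # GPUs the job has been allocated
    hours_run: float   # time the job has held those GPUs

    @property
    def attained_service(self) -> float:
        """Metric from the notes: GPUs allocated multiplied by time."""
        return self.gpus * self.hours_run


def pick_next(jobs: Sequence[Job]) -> Job:
    """Choose the job with the minimum metric to receive resources next."""
    return min(jobs, key=lambda job: job.attained_service)


# Hypothetical jobs with metrics 8.0, 3.0, and 10.0 GPU-hours.
jobs = [Job("a", 4, 2.0), Job("b", 1, 3.0), Job("c", 2, 5.0)]
next_job = pick_next(jobs)
```

Note that the selection looks only at the metric, which matches the drawback recorded above: nothing in it accounts for where the GPUs are placed.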