Learning Scheduling Algorithms for Data Processing Clusters - Githubissues

dyweb / papers-notebook

:page_facing_up: :cn: :page_with_curl: Papers Notebook: paper reading notes (Distributed System, Virtualization, Machine Learning)

https://github.com/dyweb/papers-notebook/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+-label%3ATODO-%E6%9C%AA%E8%AF%BB

Apache License 2.0


Learning Scheduling Algorithms for Data Processing Clusters #181

Open gaocegege opened 5 years ago

gaocegege commented 5 years ago

https://web.mit.edu/decima/content/sigcomm-2019.pdf https://web.mit.edu/decima/ https://github.com/hongzimao/decima-sim

Uses reinforcement learning + GNN to schedule DAG jobs.

gaocegege commented 5 years ago

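The paper starts by analyzing why applying reinforcement learning to job scheduling is hard. First, the job state is complex: a single DAG job may consist of many stages, and encoding that information for a neural network is difficult. Second, the action space is large, because there are many possible scheduling actions. Finally, job arrivals have to be simulated with some randomness, which is also very hard for reinforcement learning to handle.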

gaocegege commented 5 years ago

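For the first problem, the paper models the state with a GCNN. I don't understand this part well yet, so I'll set it aside and reread it once I know more about GCNNs.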

gaocegege commented 5 years ago

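For the second problem, the authors want to avoid both an overly large action space and overly long action sequences, so they take a hybrid approach. Each action has two dimensions: the first is the stage to schedule next, and the second is an upper limit on the number of executors that stage may use. This trades off the size of the action space against the length of the action sequence.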
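To make the tradeoff concrete, here is a minimal sketch of a two-dimensional action. Everything in it is an assumption for illustration, not code from the paper or from decima-sim: the names `Action`, `sample_action`, `hybrid_action_count` and `joint_assignment_count` are hypothetical, and the "joint assignment" count simply assumes that every executor is mapped to one stage in a single step.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    stage_id: int        # which runnable stage to schedule next
    executor_limit: int  # upper limit on executors for that stage


def sample_action(stage_weights, limit_weights, rng):
    """Sample a (stage, limit) pair from two categorical distributions.

    Both arguments are dicts mapping a choice to a non-negative weight.
    In a real agent these weights would come from the policy network;
    here they are plain numbers (an assumption of this sketch).
    """
    stages = list(stage_weights)
    limits = list(limit_weights)
    stage = rng.choices(stages, weights=[stage_weights[s] for s in stages])[0]
    limit = rng.choices(limits, weights=[limit_weights[l] for l in limits])[0]
    return Action(stage_id=stage, executor_limit=limit)


def hybrid_action_count(num_stages, num_executors):
    """Choices per decision when an action is (stage, limit in 1..num_executors)."""
    return num_stages * num_executors


def joint_assignment_count(num_stages, num_executors):
    """Choices when every executor is assigned to a stage in one single action."""
    return num_stages ** num_executors
```

With 10 stages and 50 executors, the hybrid action has 500 choices per decision, while a single joint assignment would have 10^50. Scheduling one executor per action would keep the space small, but would need one decision per executor, which is the long-sequence problem the note mentions.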

gaocegege commented 5 years ago

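The third problem in turn causes two problems (a bit convoluted, I know). The first is that scheduling is very poor at the start of training, so jobs can sit in the backlog for a long time, which makes it hard to make fast progress in training. To address this, the authors implement something similar to an early-termination strategy for episodes. To keep the agent from learning the termination pattern and exploiting it, for example by tending to postpone hard jobs, the termination time has to be random; the authors sample it from an exponential distribution.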
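The random termination can be sketched roughly as follows. This is only an illustration under my own assumptions: `sample_episode_horizon` and `truncate_arrivals` are hypothetical helpers, and the mean horizon is a free parameter rather than a value taken from the paper.

```python
import random


def sample_episode_horizon(mean_horizon, rng):
    """Draw an episode termination time from an exponential distribution.

    random.expovariate takes the rate (1 / mean), so the expected
    horizon equals mean_horizon.
    """
    return rng.expovariate(1.0 / mean_horizon)


def truncate_arrivals(arrival_times, horizon):
    """Keep only the jobs that arrive before the episode is cut off."""
    return [t for t in arrival_times if t <= horizon]


if __name__ == "__main__":
    rng = random.Random(42)
    arrivals = [5.0, 20.0, 60.0, 150.0, 400.0]  # made-up arrival times
    horizon = sample_episode_horizon(100.0, rng)
    print(horizon, truncate_arrivals(arrivals, horizon))
```

Because the exponential distribution is memoryless, the time already elapsed in an episode says nothing about how much time is left, so there is no fixed deadline for the agent to game.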

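The second is that the job arrival pattern heavily affects performance. The authors use a recent piece of work [1] to avoid this. I haven't read it and don't really understand it, so I still need to study it later.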
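As a rough intuition for what [1] is about, going by its title and my limited understanding: the return of an episode depends a lot on how heavy the arrival sequence happened to be, so comparing returns across different arrival sequences is noisy. One possible way to remove that noise is to compare each rollout only against other rollouts that saw the same arrival sequence. The sketch below assumes exactly that; the function name, the input format and the numbers are all hypothetical, and it should not be read as the method of [1].

```python
from collections import defaultdict
from statistics import mean


def input_dependent_advantages(rollouts):
    """Subtract a per-arrival-sequence baseline from each rollout's return.

    rollouts is a list of (arrival_sequence_id, total_return) pairs.
    The baseline for a rollout is the mean return over all rollouts
    that share its arrival sequence (an assumption of this sketch).
    """
    returns_by_sequence = defaultdict(list)
    for sequence_id, total_return in rollouts:
        returns_by_sequence[sequence_id].append(total_return)

    baselines = {
        sequence_id: mean(returns)
        for sequence_id, returns in returns_by_sequence.items()
    }
    return [
        total_return - baselines[sequence_id]
        for sequence_id, total_return in rollouts
    ]
```

For example, with two rollouts on a light arrival sequence returning -10 and -14 and two on a heavy one returning -100 and -90, the advantages are 2, -2, -5 and 5: the rollout with return -90 is credited as good even though its raw return is far below both light-load rollouts.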

[1] Hongzi Mao, Shaileshh Bojja Venkatakrishnan, Malte Schwarzkopf, and Mohammad Alizadeh. 2019. Variance Reduction for Reinforcement Learning in Input-Driven Environments. Proceedings of the 7th International Conference on Learning Representations (ICLR) (2019).