hongzimao / decima-sim

Learning Scheduling Algorithms for Data Processing Clusters
https://web.mit.edu/decima/

Is scheduling taken as a Markovian Process which allows Reinforcement Learning to be used? #9

Closed kauray closed 4 years ago

kauray commented 4 years ago

Decima uses a reinforcement learning framework. As the appendix says of reinforcement learning, 'The state transitions and rewards are stochastic and assumed to be a Markov process'. However, the Introduction states, 'Decima uses existing monitoring information and past workload logs to automatically learn sophisticated scheduling policies'.

I have two questions: Is scheduling modeled as a Markov process, and is that what allows reinforcement learning to be used? Does the neural-network-based design have something to do with it?

hongzimao commented 4 years ago

Thanks for your interest! Yes, we formulate the scheduling problem as a Markov decision process (MDP). Section 5.2 has more details on how we design the scheduling events and scheduling actions to construct an MDP that is easy for the learning agent to train on.
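As a rough illustration of what "scheduling as an MDP" can mean, here is a toy sketch. It is not Decima's simulator and not the event/action design from Section 5.2. The names `ToySchedulingMDP`, `run_episode`, `shortest_first`, and `longest_first` are hypothetical, and the setup (a single executor, unit-length tasks, a reward of minus the number of unfinished jobs per time step) is an assumption chosen to keep the example small. The point is only that the next state and reward depend on the current state and the chosen action, nothing else.

```python
from dataclasses import dataclass


@dataclass
class ToySchedulingMDP:
    """Hypothetical toy environment: one executor, unit-length tasks."""
    remaining: list  # remaining task count per job; this is the whole state

    def state(self):
        return tuple(self.remaining)

    def actions(self):
        # An action picks which unfinished job gets the executor next.
        return [j for j, r in enumerate(self.remaining) if r > 0]

    def step(self, job):
        # The transition depends only on the current state and the action.
        unfinished = sum(1 for r in self.remaining if r > 0)
        self.remaining[job] -= 1
        reward = -unfinished  # assumed reward: every unfinished job waits one step
        done = all(r == 0 for r in self.remaining)
        return self.state(), reward, done


def run_episode(jobs, policy):
    """Roll out one episode and return the total (undiscounted) reward."""
    env = ToySchedulingMDP(list(jobs))
    total, done = 0, False
    while not done:
        job = policy(env.state(), env.actions())
        _, reward, done = env.step(job)
        total += reward
    return total


def shortest_first(state, actions):
    return min(actions, key=lambda j: state[j])


def longest_first(state, actions):
    return max(actions, key=lambda j: state[j])


print(run_episode([3, 1], shortest_first))
print(run_episode([3, 1], longest_first))
```

In this toy setup the total reward equals minus the sum of job completion times, so a policy that earns more reward is one that finishes jobs sooner on average. A learning agent would replace the hand-written policies with one trained from rollouts like these.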

The neural network (NN) design is for processing the state information. Section 5.1 describes how we use a graph neural network to process job information embedded in computation graphs of arbitrary shape and size. The NN and the MDP are two different things: you can think of the NN as an information-processing tool and the MDP as the problem formulation.
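To make the "information-processing tool" role concrete, here is a minimal sketch of a message-passing embedding over a job DAG. It is an assumption-laden illustration, not the architecture from Section 5.1: the function `embed_dag`, the scalar node features, the fixed weights `w_f` and `w_g`, and the use of `tanh` in place of learned transformations are all hypothetical. What it shows is that one function can summarize graphs of any shape and size, independently of how the MDP is defined.

```python
import math


def embed_dag(features, children, w_f=0.5, w_g=0.5):
    """Embed every node of a DAG from its own feature and its children's embeddings.

    features: dict mapping node -> float (hypothetical per-node feature)
    children: dict mapping node -> list of child nodes
    w_f, w_g: stand-ins for learned parameters (fixed here for illustration)
    """
    cache = {}

    def embed(v):
        if v not in cache:
            # Aggregate transformed child embeddings, then combine with own feature.
            msg = sum(math.tanh(w_f * embed(u)) for u in children.get(v, []))
            cache[v] = features[v] + math.tanh(w_g * msg)
        return cache[v]

    return {v: embed(v) for v in features}


# The same function handles graphs of different shapes and sizes.
chain = embed_dag({"a": 1.0, "b": 2.0}, {"a": ["b"]})
fork = embed_dag({"r": 0.0, "x": 1.0, "y": 1.0, "z": 1.0}, {"r": ["x", "y", "z"]})
print(chain)
print(fork)
```

A policy would then map these embeddings to scores over the available actions; swapping in a different embedding function would change how the state is summarized without changing the MDP itself.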

Hope this helps.