hongzimao / decima-sim

Learning Scheduling Algorithms for Data Processing Clusters
https://web.mit.edu/decima/
286 stars 90 forks

What about cross-server data transmission overhead? #29

Open hliangzhao opened 3 years ago

hliangzhao commented 3 years ago

Sorry to bother you again.

In my research area, each stage is scheduled onto some VM node. If its child stages are placed on different VM nodes, the cross-node data transmission overhead should be considered. Thus, minimizing the makespan can be divided into two subgoals: reducing the execution time and reducing the cross-node communication overhead.

But I found that Decima does not consider the transmission time of intermediate data between the parent and child stages of each job. Is this because the scheduling environment is Spark? Or are all the jobs running on the same "VM node"?

hongzimao commented 3 years ago

I agree that data locality is an important aspect to optimize. Our simulator doesn't capture it explicitly because the particular workload we ran on Spark did not show much difference: all VMs are in a single datacenter, where the high network throughput makes the locality issue minimal.

However, I would say it shouldn't be hard to add the transmission time to the simulator. You can apply a multiplier to the task run time based on which nodes the parent and child stages are placed on.
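A minimal sketch of the multiplier idea, not code from the simulator. The function names, the node identifiers, and the `cross_node_penalty` parameter are all hypothetical, and the linear slowdown model is only an assumption chosen for illustration:

```python
def transfer_multiplier(parent_nodes, child_node, cross_node_penalty=0.5):
    """Return a run-time multiplier (>= 1.0) for a task placed on child_node.

    parent_nodes: node ids where the task's parent stages ran.
    cross_node_penalty: hypothetical slowdown applied when every parent
        is remote; it is scaled by the fraction of remote parents.
    """
    if not parent_nodes:
        # A root stage has no intermediate data to fetch.
        return 1.0
    remote = sum(1 for node in parent_nodes if node != child_node)
    return 1.0 + cross_node_penalty * remote / len(parent_nodes)


def adjusted_duration(base_duration, parent_nodes, child_node,
                      cross_node_penalty=0.5):
    """Scale a task's base run time by the placement-dependent multiplier."""
    return base_duration * transfer_multiplier(
        parent_nodes, child_node, cross_node_penalty)


if __name__ == "__main__":
    # One of two parents is remote, so half of the penalty applies.
    print(adjusted_duration(10.0, ["vm0", "vm1"], "vm0"))
```

In practice the penalty would have to be calibrated against measured transfer times, and it could depend on the amount of intermediate data rather than only on the number of remote parents.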

Also, for RL, you might still want to optimize directly for the end objective rather than dividing the goal into sub-goals and optimizing them individually. It might be difficult to hand-tune the balance between execution time and cross-node communication overhead.

Hope this helps!
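To make the contrast concrete, here is a small sketch with hypothetical function names; it is not Decima's actual reward. The first function scores a schedule by its makespan alone, which already includes any transfer delay baked into the stage finish times. The second needs a hand-chosen `weight` to trade off the two sub-goals:

```python
def end_to_end_reward(stage_finish_times):
    """Negative makespan: the finish time of the last stage in the job."""
    return -max(stage_finish_times)


def decomposed_reward(exec_time, comm_time, weight):
    """Weighted sum of two sub-goals; `weight` must be tuned by hand."""
    return -(exec_time + weight * comm_time)


if __name__ == "__main__":
    # Finish times already reflect execution plus any transfer delay.
    print(end_to_end_reward([3.0, 7.5, 5.0]))
    # The same schedule scores differently depending on the chosen weight.
    print(decomposed_reward(6.0, 1.5, weight=1.0))
    print(decomposed_reward(6.0, 1.5, weight=4.0))
```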

hliangzhao commented 3 years ago

Thanks! This helps a lot!