camsas / firmament

The Firmament cluster scheduling platform
Apache License 2.0
415 stars 79 forks source link

How to repeat the experiments in OSDI 2016 paper using Firmament #61

Closed lilelr closed 6 years ago

lilelr commented 6 years ago

Hello, @ms705.

I read your OSDI 2016 paper “Firmament:fast, centralised cluster scheduling at scale”. I plan to replay the experiments in your paper, but I was confused by the following questions.

1 The first experiment I want to replay is the experiment described in Figure 7 in your paper. The experiment is to get average runtime for MCMF algorithms on clusters of different sizes subsampled from the Google trace. However, I cannot find the algorithm relaxation codes from the source code of Firmament. The CS2 software from the third-party directory seems to implement the Cost scaling algorithm. And the flowlessly software support Cycle cancelling and Successive shortest path algorithms. But the following codes in the “solver_dispatcher.cc” shows that maybe I could designate the relax algorithm to run firmament. snip20170922_1

I try to run the simulator with the following command “-flowlessly_algorithm=relax” and it failed. snip20170922_4

The warning log showed that:

snip20170922_2

So where is the codes of relaxation algorithm? And how to designate and distinguish the incremental cost scaling and Quincy’s cost scaling through command parameters? Another question is how to designate the cluster size when running the simulator of Firmament? I would appreciate much if you could give me the sample of executing commands when running a simulation as to how to replay the experiment of Figure 7.

2 The second experiment is of Figure 8. Is the simulated Google cluster from the Google trace(12,500 machines)? And how to push the simulated Google cluster closer to oversubscription like 98%?

3 The third experiment is of Figure 9 in your paper. It describes contention slows down the relaxation algorithm. The cluster you used is about 40 machines or is exactly the same simulated cluster as that in Figure 8? And how to create a single job with an increasing number of task ranging from 1000 to 5000?

ICGog commented 6 years ago

Hi @lilelr,

It's great to hear that you're trying to replicate our experiments. Hopefully, the following answers are going to help you.

1.a. Firmament uses the Flowlessly min-cost flow solver which implements the different min-cost flow algorithms. However, only some of the algorithms (cycle canceling and successive shortest path are publicly available), but we can make the others available to you with an academic license so that you can replicate our experiments.

1.b. All our experiment configuration files are located here . For figure 7 we used this type of configurations, where the --events_fraction=0.068 flag means that were only maintaining 6.8% of the events and machines, effectively scaling the 12,543-machine cluster to a cluster of 852 machines.

  1. In the Google trace there is no information about how many non-normalised resources machines have, so we took the liberty to model different types of machine topologies (i.e., we used machines with 16 PUs, 24PUs). Simultaneously, the Quincy policy is a slot-based policy, which means that we can control slot-utilisation by reducing or increasing the number of slots we allocate to machines. In our experiment, we allocated one slot per machine PU, we measure how many tasks were running at a given time in the cluster (around 160-180k), and we adjusted the number of PUs per machine to get a high cluster utilisation. For example, in this configuration file, we set machines to have 14PUs, which means the cluster has 14*12543=175,602slots and that there are going to be moments of oversubscription.

  2. In this experiment, we used the same simulated Google cluster, but instead we submitted synthetic jobs with n tasks. Here an example configuration. With this configuration, Firmament simulates a 12,543-machine cluster, with the workload from time 0 from the Google trace, and then submits a synthetic 30,000 tasks jobs to the cluster.

Let me know if you have additional questions.

lilelr commented 6 years ago

@ICGog, thank you very much for your kind answers. As you mentioned, I need the codes of fast_cost_scaling and relaxation algorithms to replicate your experiments. So how could I get a complete version of Flowlessly with all min-cost flow algorithms? If there is any application form of academic research I have to fill, please send it to my email lilelr@163.com.

If there are other routes to get a complete version of Flowlessy, please tell me. Thanks again.

lilelr commented 6 years ago

@ICGog , @ms705 , I used your experiment configuration files to replicate the experiment of Figure 7, the result as the following picture shows is different from that in your paper. snip20171102_2 Cycle canceling and successive shortest path are from The Flowlessly and Cost scaling 2 is from CS2 software from the third-party directory in Firmament, while relaxIV is the fourth version of relaxation algorithm written by C++.

Compared to Figure 7 in your paper, my result using cs2 seems better but the result of relaxIV is terrible. snip20171102_4 My questions is: Is the cs2.exe you provides in the source codes of Firmament available usable? And which version of relaxation algorithm do you use in Firmament? Could you provide the other min-cost flow algorithms for me via my email lilelr@163.com or ll.li@siat.ac.cn. I am a graduate student from the cloud computing lab of SIAT(Shenzhen Institute of Advanced Technology, Chinese Academy of Science). All my work belongs to academic research.

ms705 commented 6 years ago

Hi @lilelr -- sorry about the delay, we'll get back to you this week.

Note that our relaxation experiments didn't use Relax IV, but rather the custom Flowlessly relaxation implementation, which made several improvements over Relax IV. Furthermore, our cost-scaling implementation is custom (rather than based on CS2) and contains some improvements over it. For example, CS2 takes ~60s to solve the full-scale Google trace graph, while our implementation takes ~44 sec. We can make the code for this implementation available to you, as @ICGog mentioned.

lilelr commented 6 years ago

@ms705, @ICGog , thanks for your help. Have you sent the complete codes of Firmament to me via lilelr@163.com or ll.li@siat.ac.cn? Or could you tell me how I could get it?

ICGog commented 6 years ago

@lilelr Sorry for the delay! I've sent you an email with the source code. Let us know if you encounter any issues in reproducing the experiments.

ms705 commented 6 years ago

I'm assuming this is sorted, reopen if any questions remain.