mesos / storm

Storm on Mesos!
Apache License 2.0

Shuffle offers before turning them into worker slots. #220

Closed JessicaLHartog closed 6 years ago

JessicaLHartog commented 6 years ago

We've found that with few topologies running and limited resource demand, the iteration over the list of hosts with offers results in all worker slots being assigned to the same physical machine more often than not. This is less than ideal.

This change is to shuffle the offers before we make any slots on them, so as to distribute things more evenly.
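A minimal sketch of the idea, not the actual patch: the class `HostShuffleSketch`, the method `pickSlots`, and the host-to-ports map are hypothetical stand-ins for the scheduler's real data structures. It also assumes one slot is taken per host, which may not match what the scheduler does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch; names are illustrative and not taken from the storm-mesos codebase.
class HostShuffleSketch {

    // offersByHost: hostname -> ports offered on that host (a stand-in for offers grouped by host).
    // Returns up to slotsNeeded "host:port" strings, taking one port per host.
    static List<String> pickSlots(Map<String, List<Integer>> offersByHost,
                                  int slotsNeeded,
                                  Random rng) {
        // Copy the host names so the caller's map is left untouched.
        List<String> hosts = new ArrayList<>(offersByHost.keySet());

        // Randomize the order in which hosts are visited; without this,
        // the same leading hosts would be picked on every scheduling round.
        Collections.shuffle(hosts, rng);

        List<String> slots = new ArrayList<>();
        for (String host : hosts) {
            if (slots.size() >= slotsNeeded) {
                break;
            }
            List<Integer> ports = offersByHost.get(host);
            if (!ports.isEmpty()) {
                slots.add(host + ":" + ports.get(0));
            }
        }
        return slots;
    }
}
```

Calling `pickSlots(offersByHost, 2, new Random())` on a map of four hosts would return two slots on two distinct hosts, with the pair varying from round to round.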

JessicaLHartog commented 6 years ago

Before the change, the output of grep "Schedule the per-topology slots:" nimbus.log was:

2017-10-09T21:29:27.485+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker4:31010, worker2:31000]}
2017-10-09T21:30:07.709+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker4:31008, worker2:31000]}
2017-10-09T21:32:18.431+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker4:31007, worker2:31000]}
2017-10-09T21:34:29.183+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker4:31007, worker2:31002]}
2017-10-09T21:35:19.465+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker4:31008, worker2:31000]}
2017-10-09T21:37:10.093+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker4:31007, worker2:31000]}
2017-10-09T21:39:30.887+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker4:31010, worker2:31000]}
2017-10-09T21:40:21.338+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker4:31008, worker2:31000]}

Across 8 slot assignment rounds, assignments to worker pairs (x,y) occurred at the following frequencies:

| assignments | frequency |
|---|---|
| (1,2) | 0 |
| (1,3) | 0 |
| (1,4) | 0 |
| (2,3) | 0 |
| (2,4) | 8 |
| (3,4) | 0 |
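A tally like the one above could be produced from the grep output with something along these lines. This is only a sketch: the class `SlotPairTally` and its `tally` method are hypothetical, and it assumes every host is named `workerN` as in the log excerpt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper for counting which host pairs received slots in each round.
class SlotPairTally {

    // Assumes hosts appear as "workerN:port", as in the nimbus.log lines quoted here.
    private static final Pattern HOST = Pattern.compile("worker(\\d+):\\d+");

    // Returns a map such as {"(2,4)": 8}, keyed by the sorted worker numbers in each line.
    static Map<String, Integer> tally(List<String> logLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : logLines) {
            // Skip anything that is not a slot-scheduling line.
            if (!line.contains("Schedule the per-topology slots:")) {
                continue;
            }
            List<Integer> ids = new ArrayList<>();
            Matcher m = HOST.matcher(line);
            while (m.find()) {
                ids.add(Integer.parseInt(m.group(1)));
            }
            // Sort the worker numbers so (4,2) and (2,4) count as the same pair.
            String key = ids.stream()
                    .sorted()
                    .map(String::valueOf)
                    .collect(Collectors.joining(",", "(", ")"));
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

A line listing `worker4:31010, worker2:31000` is counted under the key `(2,4)`.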

After a re-deploy with the change, grepping for the same string in the same file:

2017-10-09T21:42:01.926+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker4:31007, worker2:31000]}
2017-10-09T21:43:02.382+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker3:31002, worker2:31000]}
2017-10-09T21:44:33.198+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker3:31005, worker1:31000]}
2017-10-09T21:45:13.678+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker2:31000, worker1:31000]}
2017-10-09T21:46:54.442+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker4:31007, worker3:31002]}
2017-10-09T21:47:34.729+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker2:31002, worker1:31000]}
2017-10-09T21:49:05.387+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker2:31000, worker1:31000]}
2017-10-09T21:49:55.716+0000 s.m.s.StormSchedulerImpl [INFO] Schedule the per-topology slots: {TestTopology, [worker4:31004, worker2:31005]}

Across 8 slot assignment rounds, assignments to worker pairs (x,y) occurred at the following frequencies:

| assignments | frequency |
|---|---|
| (1,2) | 3 |
| (1,3) | 1 |
| (1,4) | 0 |
| (2,3) | 1 |
| (2,4) | 2 |
| (3,4) | 1 |

erikdw commented 6 years ago

@JessicaLHartog : LGTM too. One minor note I'd like to record is that this doesn't "shuffle offers" directly; i.e., it shuffles groups of offers by their host. Just to be precise.

JessicaLHartog commented 6 years ago

@erikdw Yeah, it doesn't actually shuffle the offers themselves either; it just shuffles the order in which we access hosts before making slots on them. That feels like a mouthful, though. I can change the commit message if you want. Suggestions?

erikdw commented 6 years ago

Nah, I would just ship it, just something I thought of.

