mesos / storm

Storm on Mesos!
Apache License 2.0

Storm Rebalance Broken #226

Open JessicaLHartog opened 6 years ago

JessicaLHartog commented 6 years ago

Since #200 and #213 were merged, rebalancing a topology no longer does anything. When a rebalance happens, there are no offers from which slots can be made, unless other topologies also happen to need assignments at that moment.

This is a result of the way Nimbus handles the TopologiesMissingAssignments component. A quick rundown of what now happens is:

Notably, if there are other topologies needing assignments at the same time as the :do-rebalance is executed, then the rebalance should work as expected.
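
To make the failure mode concrete, here is a toy Java model of the behavior described above. It is only a sketch under assumptions, not the storm-mesos code: `OfferGateModel`, `offersSuppressed`, and `slotsForRebalance` are hypothetical names, and the only rule encoded is the one stated in this issue (offers are available only while some topology is missing assignments).

```java
import java.util.List;
import java.util.Set;

// Toy model only: hypothetical names, not the storm-mesos implementation.
final class OfferGateModel {
    // Assumption taken from this issue: offers are suppressed whenever
    // no topology is missing assignments.
    static boolean offersSuppressed(Set<String> topologiesMissingAssignments) {
        return topologiesMissingAssignments.isEmpty();
    }

    // A rebalance can only build slots from offers that are currently available,
    // so it gets nothing while offers are suppressed.
    static int slotsForRebalance(Set<String> topologiesMissingAssignments,
                                 List<String> incomingOffers) {
        return offersSuppressed(topologiesMissingAssignments) ? 0 : incomingOffers.size();
    }
}
```

In this model a rebalance issued while the set is empty sees zero usable offers, while the same rebalance issued alongside another topology that needs assignments sees all of them.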

Note that this refers only to the Storm UI "Rebalance" action and its associated command. I have not tested the type of rebalance mentioned in the Storm documentation:

## Reconfigure the topology "mytopology" to use 5 worker processes,
## the spout "blue-spout" to use 3 executors and
## the bolt "yellow-bolt" to use 10 executors.

$ storm rebalance mytopology -n 5 -e blue-spout=3 -e yellow-bolt=10

However, I fully expect that it hits the same logic in Nimbus and that the same behavior (or something similar) happens that way too.

JessicaLHartog commented 6 years ago

Possible solutions:

Write logic that scrapes ZK state to see whether any topologies are in the REBALANCING state and, if there are, stop suppressing Offers.

Positive(s):

Negative(s):
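
For the ZK-scraping option, a minimal sketch of the decision might look like the following. Everything here is hypothetical: `RebalanceAwareGate` and its `Status` enum are illustrative names, and the map stands in for whatever topology state would actually be read from ZK. How that state is read is deliberately left out.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the ZK-scraping option; no name here comes from storm-mesos.
final class RebalanceAwareGate {
    // Illustrative status values standing in for the state read from ZK.
    enum Status { ACTIVE, INACTIVE, REBALANCING, KILLED }

    // True if at least one topology is currently marked as rebalancing.
    static boolean anyRebalancing(Map<String, Status> statusByTopology) {
        return statusByTopology.containsValue(Status.REBALANCING);
    }

    // Suppress offers only when nothing needs assignment AND nothing is rebalancing.
    static boolean shouldSuppressOffers(Set<String> topologiesMissingAssignments,
                                        Map<String, Status> statusByTopology) {
        return topologiesMissingAssignments.isEmpty() && !anyRebalancing(statusByTopology);
    }
}
```

The only change relative to the current behavior is the extra `anyRebalancing` check, so the cost of this option is mostly in reading the state, not in the decision itself.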

Write logic to hold on to some number of unused Offers so that a rebalance has Offers to work with.

Positive(s):

Negative(s):
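
For the held-Offers option, one possible shape is a small bounded reserve. This is a sketch under assumptions: `OfferReserve` and its methods are hypothetical, offers are represented by their IDs only, and the capacity policy is a placeholder. The `rescind` method is there because Mesos can rescind an outstanding offer, so a held offer may become invalid before it is used.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of the held-Offers option; offers are modeled as ID strings.
final class OfferReserve {
    private final int capacity;
    private final Deque<String> held = new ArrayDeque<>();

    OfferReserve(int capacity) {
        if (capacity < 0) throw new IllegalArgumentException("capacity must be >= 0");
        this.capacity = capacity;
    }

    // Returns true if the unused offer was kept, false if the caller should decline it.
    synchronized boolean keepIfRoom(String offerId) {
        if (held.size() >= capacity) return false;
        held.addLast(offerId);
        return true;
    }

    // Drop an offer that has been rescinded so it is never handed to a rebalance.
    synchronized void rescind(String offerId) {
        held.remove(offerId);
    }

    // Hand every held offer to a rebalance and empty the reserve.
    synchronized List<String> drain() {
        List<String> out = new ArrayList<>(held);
        held.clear();
        return out;
    }
}
```

A rebalance would call `drain()` and build slots from whatever comes back; anything beyond the capacity is declined as before.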

Identify a way to release the _offersLock in the first round of scheduling where we have topologies that need assignment, revive and collect Offers, then use them.

Positive(s):

Negative(s):
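
For the lock-release option, one standard Java technique is to wait on a `Condition` tied to the lock, since waiting on it releases the lock until offers arrive. The sketch below assumes that approach; `OfferCollector`, `onOffer`, and `collect` are hypothetical names, `offersLock` only mirrors the role of `_offersLock`, and the step that asks Mesos to revive offers is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of the lock-release option; not the storm-mesos implementation.
final class OfferCollector {
    private final ReentrantLock offersLock = new ReentrantLock();
    private final Condition offersArrived = offersLock.newCondition();
    private final List<String> offers = new ArrayList<>();

    // Called from the thread that receives offers.
    void onOffer(String offerId) {
        offersLock.lock();
        try {
            offers.add(offerId);
            offersArrived.signalAll();
        } finally {
            offersLock.unlock();
        }
    }

    // Called from the scheduling thread after it has asked for offers to be revived.
    // awaitNanos releases offersLock while waiting, so onOffer is not blocked.
    List<String> collect(long timeoutMillis) {
        offersLock.lock();
        try {
            long remaining = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            try {
                while (offers.isEmpty() && remaining > 0) {
                    remaining = offersArrived.awaitNanos(remaining);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            }
            List<String> out = new ArrayList<>(offers);
            offers.clear();
            return out;
        } finally {
            offersLock.unlock();
        }
    }
}
```

The timeout bounds how long a scheduling round can stall when no offers arrive; in that case `collect` returns an empty list and the round proceeds as it does today.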