dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License
orleans rebalancing #4694

Open mohamedhammad opened 6 years ago

mohamedhammad commented 6 years ago

Load balancing and rebalancing of grains across silos is a very important topic. Please provide us with an in-depth look at it.

JillHeaden commented 6 years ago

HELP WANTED: can anyone provide me with this information? (Also, could someone please add the "Help Wanted" label to this issue?) @jason-bragg

satishviswanathan commented 6 years ago

Any documentation on this would be a great help.

xiazen commented 6 years ago

One load balancing mechanism we have is load shedding, which is currently based only on the CPU usage of the silo. It is configured through LoadSheddingOptions and is turned off by default. If a silo's CPU usage is higher than the limit configured in LoadSheddingOptions, that silo won't accept messages until its load drops to an acceptable level.
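To make the load shedding rule concrete, here is a minimal simulation in Python. It is not Orleans code: `LoadSheddingConfig`, `Silo`, `try_accept`, and the 95% default threshold are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LoadSheddingConfig:
    # Hypothetical stand-in for the options described above.
    enabled: bool = False    # load shedding is off by default
    cpu_limit: float = 95.0  # assumed threshold, in percent


@dataclass
class Silo:
    name: str
    config: LoadSheddingConfig
    cpu_percent: float = 0.0

    def try_accept(self, message: str) -> bool:
        """Return True if the silo accepts the message, False if it sheds it."""
        if self.config.enabled and self.cpu_percent > self.config.cpu_limit:
            return False  # overloaded: reject until CPU drops below the limit
        return True


if __name__ == "__main__":
    silo = Silo("silo-a", LoadSheddingConfig(enabled=True, cpu_limit=90.0))
    silo.cpu_percent = 97.0
    print(silo.try_accept("hello"))  # rejected while CPU is above the limit
    silo.cpu_percent = 40.0
    print(silo.try_accept("hello"))  # accepted again once load has dropped
```

Note that shedding only rejects incoming work on the overloaded silo; it does not move existing activations elsewhere.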

Another load balancing mechanism we have is the ActivationCountPlacement placement strategy on grains. To my understanding, if a grain type is tagged with this placement strategy, its placement is based on the activation count of each silo, meaning a new activation is placed on the least-loaded silo in terms of activation count.
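The idea of placing each new activation on the silo with the fewest activations can be sketched as follows. This is a simplified Python illustration of the rule as described in this comment, not the Orleans implementation (which may, for example, consider only a sample of silos); `place_activation` and the silo names are hypothetical.

```python
from __future__ import annotations


def place_activation(activation_counts: dict[str, int]) -> str:
    """Pick the silo with the fewest activations and record the new one.

    Ties are broken by silo name so the result is deterministic.
    """
    if not activation_counts:
        raise ValueError("no silos available")
    target = min(activation_counts, key=lambda s: (activation_counts[s], s))
    activation_counts[target] += 1
    return target


if __name__ == "__main__":
    counts = {"silo-a": 3, "silo-b": 1, "silo-c": 2}
    for _ in range(3):
        print(place_activation(counts), counts)
```

Because the decision is made only when an activation is created, the counts converge gradually as new grains are activated rather than through any movement of existing ones.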

These two are the mechanisms I can think of right now regarding load balancing. @jason-bragg may remember more.

serefarikan commented 3 years ago

I've been spending quite some time trying to figure out how, or whether, Orleans provides the rebalancing capability of Akka clusters, especially with persistence. Both Akka and Akka.net seem to offer rebalancing when a new node is added to the cluster, and supporting this with persistence is a key scale-out feature for shared-nothing architectures.

At the moment, I simply don't know whether I can use Orleans for dynamic/elastic scaling of a cluster with persisted state. The documentation mentions that Orleans can scale down when resources are not needed, but it doesn't say whether you can scale up, or how scaling interacts with backing storage.

Clarifying this would allow making informed choices between Orleans and alternatives.

ReubenBond commented 3 years ago

@serefarikan Thanks for the input. Balancing of load happens when a grain is activated. Orleans will not proactively move already-active grains to a new host, at least not today. Grains can decide to deactivate themselves so that they can be reactivated on newly-added hosts. Idle grains are deactivated by the runtime and they can be placed on new capacity when they are reactivated.
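As a rough illustration of that behaviour, the following Python simulation models a cluster where placement happens only at activation time. It is a toy model, not Orleans code: the `Cluster` class, its methods, and the least-loaded placement rule are assumptions made for the sketch.

```python
class Cluster:
    """Toy model: a grain is placed only when it is activated."""

    def __init__(self, silo_names):
        self.silos = {name: set() for name in silo_names}

    def add_silo(self, name):
        # New capacity starts empty; nothing is moved onto it proactively.
        self.silos[name] = set()

    def locate(self, grain_id):
        for name, grains in self.silos.items():
            if grain_id in grains:
                return name
        return None

    def call(self, grain_id):
        """Route a call, activating the grain on the least-loaded silo if needed."""
        host = self.locate(grain_id)
        if host is None:
            host = min(self.silos, key=lambda n: (len(self.silos[n]), n))
            self.silos[host].add(grain_id)
        return host

    def deactivate(self, grain_id):
        # Models a grain deactivating itself or being collected when idle.
        host = self.locate(grain_id)
        if host is not None:
            self.silos[host].remove(grain_id)


if __name__ == "__main__":
    cluster = Cluster(["silo-a"])
    for g in ("g1", "g2", "g3", "g4"):
        cluster.call(g)
    cluster.add_silo("silo-b")
    print(cluster.call("g1"))  # still on silo-a: active grains are not moved
    cluster.deactivate("g1")
    print(cluster.call("g1"))  # reactivated on the emptier silo-b
```

In this model, load spreads onto a new silo only as grains deactivate and are later reactivated, which matches the description above.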

Persisted state is not related: grains load their state from the backing store during activation. Critical state must be persisted eagerly (in any system), before returning a result to the caller; otherwise a host failure, which can happen at any time, would turn that returned value into a phantom, non-durable value. So persistence doesn't affect scale-out. Does it affect scale-out in Akka?
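The "persist before replying, reload on activation" pattern can be sketched like this. It is a conceptual Python example, not Orleans code: `CounterGrain` is hypothetical and a plain dictionary stands in for the backing store.

```python
class CounterGrain:
    """Toy grain: state is loaded on activation and written before each reply."""

    def __init__(self, grain_id, store):
        self.grain_id = grain_id
        self.store = store
        # Activation: load state from the backing store (default 0 if absent).
        self.count = store.get(grain_id, 0)

    def increment(self):
        new_count = self.count + 1
        # Persist first, so the value returned to the caller is already durable.
        self.store[self.grain_id] = new_count
        self.count = new_count
        return new_count


if __name__ == "__main__":
    store = {}  # stand-in for the backing store
    first = CounterGrain("counter-1", store)
    first.increment()
    first.increment()
    # A later activation (possibly on another host) reloads the same state.
    second = CounterGrain("counter-1", store)
    print(second.count)
```

Since every acknowledged update is already in the store, it does not matter which host the next activation lands on.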

serefarikan commented 3 years ago

@ReubenBond thanks for your response. Persistence affects scale-out in Akka in the following way (assuming I'm not wrong): Akka will rebalance actors among nodes when a new node is added to the cluster. During this process, messages arriving at an actor that is in the process of moving to a new node are buffered, and whether or not you lose data if something goes wrong at that point is what makes persistence relevant. My understanding is that persistence allows those 'pending' messages to be durable, but I have not yet been able to clarify whether they are then migrated automatically, or replayed/sent to the actor once it's happy in its new home. Nonetheless, the point is that active actors can be moved as part of scaling out in Akka, and I was trying to understand Orleans' behaviour in the same case. You did answer that question, but I wanted to clarify the background to my mentioning persistence.

Sorry for such a late clarification and thanks!