This is an Akka.Cluster behaviour and the short answer is "it depends".
When you set KeepMajority.Role, what happens when a split brain occurs is that only the cluster members that have that role are considered when the SBR tries to resolve the split. This means you will need at least 5 seed nodes for this to work properly in production, so that one side of any split can still hold a clear majority of role-bearing nodes.
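For reference, here is a minimal HOCON sketch of that configuration, assuming the standard Akka.NET split brain resolver settings (the role name "seed" is an example and must match the role your seed nodes actually carry):

```hocon
akka.cluster {
  # Use the built-in split brain resolver as the downing provider.
  downing-provider-class = "Akka.Cluster.SBR.SplitBrainResolverProvider, Akka.Cluster"

  split-brain-resolver {
    active-strategy = keep-majority

    # Only nodes carrying this role are counted when deciding
    # which side of a partition holds the majority.
    keep-majority.role = "seed"
  }
}
```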
But let's take some examples:
Cluster settings: KeepMajority.Role is set to "seed".

Scenario 1, the happy path: The cluster split into these parts:
Scenario 2, the not-so-happy path: The cluster split into these parts:
Scenario 3, the ugly path:
The only way to fix this problem is to remove the arbiter from inside the cluster, be it a Lighthouse instance or a fixed set of seed nodes. To do this, you will need Akka.Management.Cluster.Bootstrap in combination with Akka.Discovery, which uses an arbiter outside of the cluster and is available for Kubernetes, Azure, and AWS.
Note that Akka.Discovery.Config is not the answer: it still uses the cluster itself as the arbiter, which defeats the purpose.
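As a rough sketch of what that setup looks like (the Akka.NET port mirrors the JVM-style settings, so exact keys may vary by package version, and the service name and discovery method below are placeholders):

```hocon
akka.management {
  # HTTP endpoint used by bootstrap probing; 8558 is the conventional default port.
  http.port = 8558

  cluster.bootstrap.contact-point-discovery {
    # Placeholder: the logical name your discovery backend resolves to pods/instances.
    service-name = "my-akka-service"
    discovery-method = kubernetes-api
  }
}

# Pick the external discovery backend (kubernetes-api, azure, aws, ...).
akka.discovery.method = kubernetes-api
```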
OK, thanks for your input! Guess our only strategy atm is higher tolerance for heartbeats, to avoid unnecessary disconnects.
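For anyone landing here later, "higher tolerance for heartbeats" is usually expressed through the cluster failure detector settings; a sketch with illustrative (not recommended) values:

```hocon
akka.cluster.failure-detector {
  # Phi accrual threshold; the default is 8.0. Raising it makes the
  # detector slower to suspect a node that is starved for CPU.
  threshold = 12.0

  # How long heartbeats may pause before a node is considered
  # unreachable; the default is 3s.
  acceptable-heartbeat-pause = 10s
}
```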
Hi. Don't know if this is an Akka.Hosting issue or an Akka.Cluster one, but we have a problem with the default configuration:

During cluster churn (high CPU load on servers) our seed nodes are sometimes downed. This leads to them forming a new cluster, but a minority part. Everything restarted/started after this connects to this minority part, while the majority can remain stable for at least 24 hours (until IIS recycles), leading to a long-lived partition.

Would setting KeepMajority.Role to the seed-node role only take the seed nodes into account when resolving a partition? Would that be the correct way to configure the cluster?