akka / akka

Build highly concurrent, distributed, and resilient message-driven applications on the JVM
https://akka.io

Node never joins a cluster when a list of seed nodes consists exclusively of more than one "selfAddress" #25820

Open izeigerman opened 5 years ago

izeigerman commented 5 years ago

The issue occurs when the list of seed nodes contains multiple occurrences of selfAddress and there are no other addresses in that list. Here is a reproducible example:

val cluster = Cluster(system)
val seedNodes = Seq(cluster.selfAddress, cluster.selfAddress)
cluster.joinSeedNodes(seedNodes)

The root cause of the issue is the following:

  1. We successfully pass the condition here: https://github.com/akka/akka/blob/master/akka-cluster/src/main/scala/akka/cluster/ClusterDaemon.scala#L605-L607 and instantiate the FirstSeedNodeProcess actor to join the cluster.
  2. In the FirstSeedNodeProcess we filter out all seed nodes that match the selfAddress instance: https://github.com/akka/akka/blob/master/akka-cluster/src/main/scala/akka/cluster/ClusterDaemon.scala#L1418 . The remainingSeedNodes collection ends up empty.
  3. As a result the FirstSeedNodeProcess actor never sends the InitJoin message.
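
The three steps above can be modelled without Akka. This is only a minimal sketch of the behaviour as described in this issue, not the actual ClusterDaemon code: `Address` is a hypothetical stand-in for `akka.actor.Address`, and `SeedNodeBugSketch` and `remainingSeedNodes` are illustrative names.

```scala
// Hypothetical stand-in for akka.actor.Address so the sketch runs without Akka.
final case class Address(host: String, port: Int)

object SeedNodeBugSketch {
  // Models step 2: drop every seed node that equals selfAddress.
  def remainingSeedNodes(seedNodes: Seq[Address], selfAddress: Address): Seq[Address] =
    seedNodes.filterNot(_ == selfAddress)

  def main(args: Array[String]): Unit = {
    val self = Address("127.0.0.1", 2551)
    val seedNodes = Seq(self, self)

    // The list has two entries, so it is not equal to Seq(self) ...
    println(seedNodes == Seq(self)) // false
    // ... yet after filtering out self there is nobody left to send InitJoin to.
    println(remainingSeedNodes(seedNodes, self).isEmpty) // true
  }
}
```

With a duplicated selfAddress the list is not recognised as "only myself", but the filtered collection is still empty, which matches the situation in step 3.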

Although this is a minor issue, it can cause unexpected behavior when the seed-node-timeout is high.

A quick fix is to use a Set here: https://github.com/akka/akka/blob/master/akka-cluster/src/main/scala/akka/cluster/ClusterDaemon.scala#L599. If possible, the same Set instance could also be passed into the FirstSeedNodeProcess and JoinSeedNodeProcess constructors.
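
A rough sketch of the deduplication idea, again independent of Akka. Everything here is an assumption made for illustration: `Address` is a hypothetical stand-in for `akka.actor.Address`, and `JoinDecision`, `JoinSelf`, `ContactOthers` and `decide` do not exist in Akka. The sketch uses `distinct` instead of a `Set` so that the order of the seed nodes is preserved, since the first entry in the list is treated specially; the actual fix may well look different.

```scala
// Hypothetical stand-in for akka.actor.Address so the sketch runs without Akka.
final case class Address(host: String, port: Int)

object SeedNodeDedupSketch {
  // Illustrative result type, not part of Akka.
  sealed trait JoinDecision
  case object JoinSelf extends JoinDecision
  final case class ContactOthers(others: Seq[Address]) extends JoinDecision

  def decide(seedNodes: Seq[Address], selfAddress: Address): JoinDecision = {
    // Remove duplicates first, keeping the original order of the list.
    val unique = seedNodes.distinct
    if (unique == Seq(selfAddress)) JoinSelf // only ourselves: join self directly
    else ContactOthers(unique.filterNot(_ == selfAddress))
  }
}
```

With this normalisation, `decide(Seq(self, self), self)` takes the same path as `decide(Seq(self), self)`, so the duplicated list no longer leaves the node waiting on an empty set of seed nodes.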

I'd be happy to submit a PR if the problem and the proposed solution make sense.

patriknw commented 5 years ago

@izeigerman Thanks for reporting. That would be good to fix and a PR would be welcome.