hseeberger / constructr

Coordinated (etcd, ...) cluster construction for dynamic (cloud, containers) environments
Apache License 2.0
212 stars 37 forks source link

Risk of island creation if max-nr-of-seed-nodes < actual number of seeds and host is reused #159

Open octonato opened 7 years ago

octonato commented 7 years ago

In ConstructrMachine.scala line 197

  private def seedNodes(nodes: Set[Address]) =
    nodes.take(maxNrOfSeedNodes).toVector

The following scenario can happen:

nodes = Set(addressA, addressB, addressC, addressD, addressE, addressF)
max-nr-of-seed-nodes = 3

addressA, addressB and addressC have crashed and infrastructure is redeploying new nodes. While redeploying, hosts are reused. So, for instance, new node gets same IP and port as addressC.

If seedNodes method returns by accident (is a Set) exactly the same nodes that are gone (ie: A, B and C). A node trying to join with addressC will be unable to see addressD, addressE and addressF and will happily make a cluster with himself because his host is on the list of returned addresses.

Obviously this only happens when max-nr-of-seed-nodes is set to something different to the default. I guess this property is most useful for large clusters to avoid sending join request to a large number of nodes, however the risk still exists.

Not sure what could be a good solution for this though.

jasongoodwin commented 7 years ago

Because a cluster should never be created when "Joining" in constructr, it seems important that the self address can never be provided in the seed node list passed to Akka. It might make sense for Constructr to log a warning, then filter or fail if it sees the self address in the first spot to ensure that nobody accidentally configures their system in a way that would allow two clusters to form.

hseeberger commented 7 years ago

Good catch! I will rework general semantics soon (single cluster formation), then this should be resolved.