bitwalker / libcluster

Automatic cluster formation/healing for Elixir applications
MIT License
1.97k stars 188 forks

General reconnect failure #148

Open balena opened 4 years ago

balena commented 4 years ago

None of the DNS poll, Kubernetes, Kubernetes DNS, Kubernetes DNS SRV and Rancher strategies reconnect nodes if they disconnect for any reason. Reconnection only happens if the backing service discovery also bounces its list (drops the node and then lists it again), which doesn't always (never?) happen.

This led to a series of PRs: https://github.com/bitwalker/libcluster/pull/143, https://github.com/bitwalker/libcluster/pull/144, https://github.com/bitwalker/libcluster/pull/145, https://github.com/bitwalker/libcluster/pull/146 and https://github.com/bitwalker/libcluster/pull/147.

The logic can be greatly simplified by passing the whole list of discovered nodes to Cluster.Strategy.connect_nodes/4 on every poll, since internally it already lists the connected nodes and connects only the "added" ones. Check here: https://github.com/bitwalker/libcluster/blob/master/lib/strategy/strategy.ex#L43.
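To illustrate the idea, here is a minimal sketch of the "connect only what is missing" diff. This is not libcluster's code: `ReconnectSketch` and `nodes_to_connect/3` are hypothetical names, and the node names are made up. It only shows why handing over the full discovered list on each poll is enough to recover a dropped node, assuming the connected list is re-read every time.

```elixir
defmodule ReconnectSketch do
  @moduledoc false

  # Hypothetical helper (an assumption for illustration, not part of libcluster).
  # Given the full list returned by service discovery, the nodes currently
  # connected, and the local node, return the nodes that still need a
  # connection attempt.
  def nodes_to_connect(discovered, connected, self_node) do
    # Nodes we must never try to connect to: ourselves and existing peers.
    already = MapSet.new([self_node | connected])

    discovered
    |> Enum.uniq()
    |> Enum.reject(&MapSet.member?(already, &1))
  end
end

# Service discovery keeps returning the same three nodes on every poll.
discovered = [:"a@10.0.0.1", :"b@10.0.0.2", :"c@10.0.0.3"]

# c has dropped off: only b is currently connected to the local node a.
connected = [:"b@10.0.0.2"]

# Because the diff is taken against the live connection list rather than
# against the previous poll result, c is picked up again.
missing = ReconnectSketch.nodes_to_connect(discovered, connected, :"a@10.0.0.1")
IO.inspect(missing)
```

In a real strategy the diff would not need to be reimplemented: the point of the proposal is that the polling code can skip its own "added/removed since last poll" bookkeeping and let Cluster.Strategy.connect_nodes/4 do this comparison.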