PR #135 introduced a subtle bug that made it impossible to start a configuration in which a cluster has more than one endpoint. The cause is that a failure to connect to any single endpoint is now fatal. However, only the Raft leader of a cluster listens on its endpoint, so connections to the other endpoints are expected to fail in the normal case.
This PR makes a connection failure fatal only when there is exactly one endpoint, since in that case the sole endpoint belongs to the leader, which should always be listening.
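As a rough illustration of that rule (not the code in this PR), the check could look something like the Python sketch below. The names `connect_all` and `fake_connect`, the `(host, port)` endpoint shape, and the port numbers are all hypothetical and only serve the example.

```python
import logging
import socket

log = logging.getLogger("connect")


def connect_all(endpoints, timeout=1.0, connect=socket.create_connection):
    """Connect to each (host, port) endpoint and return the open connections.

    A failure is fatal only when there is exactly one endpoint. With several
    endpoints, failures are expected for non-leaders and are only logged.
    """
    connections = []
    for endpoint in endpoints:
        try:
            connections.append(connect(endpoint, timeout))
        except OSError as err:
            if len(endpoints) == 1:
                # Sole endpoint: the leader should be listening, so give up.
                raise ConnectionError(f"cannot reach sole endpoint {endpoint}") from err
            log.warning("skipping unreachable endpoint %s: %s", endpoint, err)
    return connections


def fake_connect(endpoint, timeout):
    """Hypothetical stand-in connector: only port 9001 (the 'leader') accepts."""
    if endpoint[1] != 9001:
        raise OSError("connection refused")
    return endpoint
```

Passing `connect=fake_connect` exercises the rule without a network: a three-endpoint cluster yields only the leader's connection, while a single unreachable endpoint raises `ConnectionError`.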
In the future, this connection logic should probably be smarter: it should know whether it is connecting to a Raft cluster and, if so, check that at least one endpoint connected. In practice, however, it may be better to handle this elsewhere so that components no longer have to start in a particular order. It may be acceptable for a connection failure to be a warning rather than a fatal error in most components.
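One possible shape for that future logic, sketched in Python under stated assumptions: `connect_cluster`, its `fatal` flag, and `fake_connect` are hypothetical names, and the sketch treats every endpoint list as a Raft cluster rather than detecting one.

```python
import logging

log = logging.getLogger("connect")


def connect_cluster(endpoints, connect, fatal=True):
    """Try every endpoint and require that at least one of them connects.

    When none connect, raise if `fatal` is true, otherwise log a warning and
    return an empty list so the caller can retry later.
    """
    connected = []
    for endpoint in endpoints:
        try:
            connected.append(connect(endpoint))
        except OSError as err:
            # Expected for non-leader members of a Raft cluster.
            log.debug("endpoint %s unreachable: %s", endpoint, err)
    if not connected:
        message = f"none of {len(endpoints)} endpoint(s) reachable"
        if fatal:
            raise ConnectionError(message)
        log.warning(message)
    return connected


def fake_connect(endpoint):
    """Hypothetical stand-in connector: only the endpoint named 'leader' accepts."""
    if endpoint != "leader":
        raise OSError("connection refused")
    return endpoint
```

With `fatal=False`, a component started before its peers gets an empty list and a warning instead of exiting, which is one way to relax the startup-order constraint.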