Closed bboreham closed 8 years ago
I reckon the isConnectedPeer
should be lifted out, i.e. the flag passed to doAddConnection
should simply be an indication as to whether this is a "new" peer, where "new" includes "restarted".
That indicator should be calculated by registerRemote
: a peer is "new" if a) it is not present in the topology, or b) is present and has a different UID, or c) is present and has no connections.
c) is present and has no connections.
We need this is because we may, temporarily, have multiple connections in progress to the same peer, which will keep it in the topology even though it may not have any (established) connections.
Looking at this again, perhaps we should leave the code in this PR as is. After all, it does address the problem, and is quite straightforward.
I have rebased and made the change suggested at https://github.com/weaveworks/mesh/pull/29#issuecomment-199783229.
A couple of aspects are now potentially hard to follow: the computation of isJustAddedPeer
which relies on the behaviour of fetchWithDefault
, and the absence of locking round len(conn.remote.connections)
, which (I think) relies on remote
not being the local peer and connections
never being modified after creation on a remote peer.
CircleCI tests have failed, which seems to be in the etcd
code and unrelated.
As discussed, the "has no connections" tests is probably too narrow. When two clusters come together, a 2nd in-progress connection between them may well find the other peer in the topology, with lots of connections to peers on its side of the cluster. We'd still need to send it the full gossip, so it can learn about our side of the cluster.
The symmetric unicast route check of the previous version is a conservative approximation of whether another peer is part of our cluster, which is kinda what we want here.
So I suggest we ditch the last commit.
I removed the last commit, so now we are back to checking the unicast topology.
LGTM. @peterbourgon what's up with the CI failures?
@rade It's the vendoring nonsense. Needs a rebase on master to get this.
Oops, it was the Circle build cache. How annoying.
Improves the behaviour reported at #28.