project-receptor / python-receptor

Project Receptor is a flexible multi-service relayer with remote execution and orchestration capabilities linking controllers with executors across a mesh of nodes.
Other
32 stars 21 forks source link

Better route broadcasts #200

Closed ghjm closed 4 years ago

ghjm commented 4 years ago

In the existing implementation, each node sends its own information to all nodes. This creates an inherent tension where constructing the routing table relies on the existence of the very routing table we're trying to construct, leading to various instabilities, race conditions and problems.

This PR changes the behavior so that we send everything we know to direct peers, who then repeat the message to their own peers in a fan-out arrangement. This is more robust because a given node is always in a position to know for sure what its own local connections are.

Because this change requires a different message format ("seen" is gone, etc), the new-style routing messages are labeled ROUTE2. In the case where old-protocol nodes attempt to join a network with new-protocol nodes, the new-protocol nodes will log error messages. No attempt is made to allow interoperability between the two routing protocols.

ghjm commented 4 years ago

After several failed attempts, I think I have something that works well and handles all the edge cases I can come up with.

The new routing protocol is link state routing. Each node advertises its own connections, including those that are active and those it knows about from its own connection manifest, to its neighbors. On receiving these broadcasts, the neighbors then repeat the message to their own neighbors (except the one it came from), and so on until it reaches the whole network. This is O(n^2), just like the old protocol.

In order to avoid endless re-broadcasting of a routing update in the case where the topology includes loops, each routing update is given a unique identifier. If we see an update we've already seen, we ignore it.

In order to avoid incorrect updates when we receive older information after we've already received newer information, each update is given a sequence number. We only accept updates that have a higher sequence number than the highest we've seen so far from the node. In the case where a node re-starts and does not know its own sequence number, it re-synchronizes when it sees an update about itself with a higher sequence.

One area that still needs work is ephemeral nodes. They have no very good way of knowing when all the nodes on the network have converged to a routing table that includes the new node, and therefore it is safe to start pinging / report status / etc. I don't want to spend a lot of time perfecting this if we're going to switch to having ping/status/send work via a local unix socket, which seems to me a better approach.