Implement a first pass at hole punching (3/5)

geoffxy commented 6 years ago

This implements a first pass at hole punching. It really only handles the "happy" case, meaning there are no keep alive messages being sent and no connection time outs. I didn't want this change set to get larger than it already is. 😅

I left in the direct connection functionality. So the plugin still has the option of sending a ConnectTo message to "establish a connection" to another agent without using hole punching.

To join a session using hole punching, the plugin sends either the JoinSession or HostSession message to the agent. If a HostSession message is sent, the agent will reply with SessionInfo that contains the session id.

When connecting to a peer, we first go through a phase where we send Pings and count PingBack messages. This is to punch the hole and to figure out which of the peer's two addresses are routable. When we receive a sufficient number of replies (arbitrarily set at 3 right now), the peer is "promoted". At this point the connection "initiator" will send a Syn message to the other peer, which will reply with a NewOperations message to complete the connection set up.

I was originally thinking that we needed a 3 way handshake to transition from pings to completing the set up, but I don't think that's necessary anymore because we already exchange pings to make sure we can contact the other peer. So only a single round trip (Syn and NewOperations) is really needed to have both peers agree to complete connection set up. To work around lost messages, the Syn is repeatedly sent until NewOperations is received, and the other peer will always reply to Syn messages.

I added PingingPeer to represent a peer that we haven't "promoted" yet. The reason I added it is because with a regular Peer we assume there is one address. But when we're pinging the peer for connection set up, there are 2 addresses (and we cannot be sure that the private addresses of all these peers will be unique). So the way we deal with these peers needs to be a bit different.

I also added a CombinedProtocolHandler to support receiving JSON messages from two or more sources from one gateway. This is used for processing interagent messages and rendezvous messages since they need to both be sent from and received at the same UDP host/port pair.

geoffxy commented 6 years ago

Thanks for the review! I think I got around to addressing all the comments.

For the handlers, the address was being passed around initially which is why I had refactored it that way. But yeah I totally agree that it's best to actually use the abstraction that you introduced with the data classes. So I changed the base handler to deal with io_data exclusively instead of the data and address. Since we use a pattern where the address is passed to every message handler function, I added a _extra_handler_arguments(io_data) template function in the base class that subclasses can override. That way each handler can specify how it wants its custom data inside the data class passed to each handler.

For the io_data is None check, yeah I agree that it's better to just do the check in the handler. I moved it there along with the other refactor. This was an artifact of an edge case we missed when doing the UDP refactor. Since retrieve_io_data() won't return anything if the fragment is not complete, we would try to call get_address()/get_data() on None. This doesn't crash the agent though and the exception is swallowed because it's running on the executor. But in any case I think we should still check to make sure it's not None.

I added some temporary code in the rendezvous server to make it compatible for the time being. But I guess the long term solution is to change ConnectionManager there to use the same IO abstractions in the agent.

geoffxy commented 6 years ago

I also removed the PeerStore since there's a 1 to 1 relationship between the Connection model and Peer. Since we only really use Connection throughout the agent I think it makes more sense just to store the peer within the connection.

typeintandem / tandem

Implement a first pass at hole punching (3/5) #96