domfarolino / browser

Simple browser components

Basic mage implementation #32

Closed domfarolino closed 1 year ago

domfarolino commented 3 years ago

Mage IPC library

Test cases/scenarios, each with a sublist of tests that most minimally exercise the scenario:

Task list

Followup:

Test Health

Obsolete: Super random notes and thoughts: We have this idea of a mage::Endpoint being in a "proxying" state, where its messages are queued internally until a caller can "Take()" them and forward them to the node that it is proxying to. I also had the idea of putting a mage::Endpoint into a "proxying" state where it would take a remote node name, modify its peer address to point to that remote name, and then go through the normal sending route. However, this might be tricky when we pass a mage::Endpoint in a mage message that is bound for the same process. We need to think about this case.

Interesting cases we need to think about:

  1. Creating two local pipes and handing them off to separate processes
  2. Creating two local pipes and handing one of them off to a separate process
  3. Creating two local pipes and handing one of them off to a same-process peer endpoint

(1) above is a generalized version of (2), so let's focus on (2). We have two options to do this:


Design for sending endpoints to other nodes

This section might be a bit obsolete, but much of it probably holds.

What follows is the design for sending endpoints to another node (process). Imagine you have endpoints A and B in node N1, and we send B to another node N2. We have two options:

  1. Delete N1.B, and update N1.A's peer address from N1.B to N2.B (this option is actually slightly more complicated than how I just presented it)
  2. Keep N1.B around, but set it into a proxying state so that it will forward messages to "the real B", N2.B

I think we need to go with (2) for now; here's why. Eventually mage (like mojo) should be able to pass two endpoints each to different processes, and have the originating process delete both original endpoints (option (1) above). That is, the two endpoints that live in separate processes can talk directly to each other without mediation from the process they originated in. This is pretty complex for two reasons:

  1. It requires sending actual file descriptors to other processes via something like CMSG headers
  2. It requires some pretty complicated messaging that mojo (but not mage!) currently supports: for a period of time after the originator sends the two endpoints to the two separate processes, it maintains "proxies" representing these endpoints, which survive (and continue to forward any lingering messages) until the real endpoints are set up in the other processes and the proxies have no more messages to forward. That means the originator has to tell any other process that might still be sending messages to it to stop, and instead send its messages to the "real" destination process that we sent the endpoint to. Once the originating process receives an ACK from the node (if any exists) that was still sending messages to it, the originator can be sure it will receive no more messages for the proxy, and it can delete the proxy.

We'll support this one day for performance/efficiency reasons, but its consequences shouldn't be observable, and it is pretty complicated to implement, so we're avoiding it for the initial version. In the meantime, when the originator process sends two endpoints to two different processes so that they can talk directly to each other, each message actually goes through the originator process's "permanent" proxies.

This requires solution (2) above, where for each endpoint node N1 sends to node N2, node N1 keeps a permanent proxying endpoint around pointing to N2, where it believes the "real" endpoint was last seen. In reality, N2 might immediately send the endpoint to N3. But that's fine—when it receives messages from N1, they'll arrive at N2's proxy, which will point to N3, and the forwarding will happen automatically. Even once we support direct message communication, this sort of daisy-chain proxying will be necessary for a brief period of time until all of the ACKs in the chain settle.

At first I considered a design where each endpoint always had a local peer, and that peer would just be in a proxying state if it represented a concrete endpoint in another process. This was because I got confused and thought that endpoints traveling in pairs was a requirement for design (2). This would imply that endpoints implicitly travel together.

Under this design, since endpoints travel in pairs, the only way to send a message to a remote node would be if a message arrives at a local peer that is in a proxying state. Therefore, SendMessage() looks pretty simple:

```cpp
Node::SendMessage(Endpoint* local_endpoint, Message message) {
  Endpoint* local_peer_endpoint = /* get peer endpoint */;
  if (local_peer_endpoint->state == kUnboundAndProxying) {
    // Peer is actually remote. Forward the message accordingly.
    Channel* channel_for_remote_node =
        GetChannelForRemoteNode(local_peer_endpoint->node_to_proxy_to);
    channel_for_remote_node->SendMessage(message);
  } else {
    // `local_peer_endpoint` is the real peer endpoint.
    local_peer_endpoint->AcceptMessage(message);
  }
}
```

But upon further thought, there is just no reason to require that all endpoints have local peers and therefore travel together whenever you send one. All it does is slightly simplify the SendMessage() codepaths. Instead:

This is perhaps more logical, even though it means SendMessage() gets slightly more complicated, because there would be two ways of sending a message to a remote node:

```cpp
Node::SendMessage(Endpoint* local_endpoint, Message message) {
  bool peer_is_local = (local_endpoint->peer_address.node_name == node_name_);

  if (peer_is_local) {
    Endpoint* local_peer_endpoint = /* get peer endpoint */;
    if (local_peer_endpoint->state == kUnboundAndProxying) {
      // Peer is actually remote. Forward the message accordingly.
      Channel* channel_for_remote_node =
          GetChannelForRemoteNode(local_peer_endpoint->node_to_proxy_to);
      channel_for_remote_node->SendMessage(message);
    } else {
      // `local_peer_endpoint` is the real peer endpoint.
      local_peer_endpoint->AcceptMessage(message);
    }
  } else {  // Peer is remote.
    Channel* channel_for_remote_node =
        GetChannelForRemoteNode(local_endpoint->peer_address.node_name);
    channel_for_remote_node->SendMessage(message);
  }
}
```

In other words, this design has two main rules:

  1. All sent endpoints are put in a permanently proxying state
  2. Endpoints do not travel in pairs (because they don't need to)

One day we'll implement the ability for two child nodes to talk directly to each other without mediation from the originator, but the flow will be more complicated. Remember, immediately after we send an endpoint to another node the flow will look like:

  1. The sender node puts the sent endpooint into a proxying state, in case some node is still sending messages to it
  2. The sender tells the node owning the peer endpoint of the sent endpoint (if any) to stop sending subsequent messages to the sender, and to instead start sending them to the node that the sender sent the endpoint to (the "final destination" from the sender's perspective)
  3. The sender continues to forward any messages it receives to the node it sent the real endpoint to
  4. The sender receives an ACK from the node owning the peer endpoint (if such a node exists), and only then deletes its proxy
  5. If the peer endpoint of the endpoint-we-sent is local (i.e., not owned by another node), then we can just update said endpoint's "peer address" to point to the node that we sent the endpoint to, and that's as good as an "ACK"
musgravejw commented 1 year ago

🎉