DistCompiler / pgo

PGo is a source to source compiler from Modular PlusCal specs into Go programs.
https://distcompiler.github.io/
Apache License 2.0

[Proposal] Sending network messages across processes #77

Closed rmascarenhas closed 6 years ago

rmascarenhas commented 6 years ago

Abstract

This draft starts the discussion on how we can support the direct sending and receiving of messages across PlusCal processes (then compiled to processes that may live in different nodes in a distributed system). The main advantage of supporting direct communication is that the resulting systems won't have to rely on global state (which needs to be exclusively acquired before use) nor on abstractions from the model checking world that have a high maintenance cost in a realistic implementation.

A network in PlusCal

PlusCal (or TLA+) has no notion of a "network". You are free to model a network however you like, and at any level of abstraction: a network may drop packets, become unavailable, duplicate messages, corrupt messages, etc.

In the two-phase commit spec currently being developed, the network is modeled as a record that associates each <from,to> pair with a sequence of messages (that's a high-level description of the model). The relevant definitions are included below for convenience:

  define {
    send(from, to, msg) == [network EXCEPT ![from][to] = Append(@, msg)]
    broadcast(from, msg) == [network EXCEPT ![from] = [to \in 1..NumProcs |-> IF from = to THEN network[from][to] ELSE Append(network[from][to], msg)]]
  }

  macro rcv(dst, buf) {
    with (src \in { s \in 1..NumProcs : Len(network[s][dst]) > 0 }) {
      buf := Head(network[src][dst]);
      network[src][dst] := Tail(@)
    }
  }

As can be seen, the definitions above model a network with no losses or failures as a global data structure that holds all messages. Processes can send, receive, and broadcast messages. Messages can be of any form (or "type"), since there are no restrictions on the TLA+ side of things.
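To make these semantics concrete, here is a minimal Go sketch of the same global data structure. The names Msg, Network, NewNetwork, Send, Broadcast, and Rcv are hypothetical and exist only for illustration; they are not part of PGo. The sketch assumes messages are records with string fields, and it resolves the nondeterministic choice of source in rcv by picking the lowest-numbered source with a pending message.

```go
package main

import "fmt"

// Msg mirrors a TLA+ record with string fields (an assumption of this
// sketch; the spec itself places no restriction on message shape).
type Msg map[string]string

// Network mirrors the spec's network variable: queues[from][to] is a FIFO
// queue of messages. Process IDs are 1..n, so index 0 is unused.
type Network struct {
	n      int
	queues [][][]Msg
}

// NewNetwork creates an empty network for processes 1..n.
func NewNetwork(n int) *Network {
	q := make([][][]Msg, n+1)
	for i := range q {
		q[i] = make([][]Msg, n+1)
	}
	return &Network{n: n, queues: q}
}

// Send appends msg to the from->to queue, like send(from, to, msg).
func (nw *Network) Send(from, to int, msg Msg) {
	nw.queues[from][to] = append(nw.queues[from][to], msg)
}

// Broadcast appends msg to every queue leaving "from", except the one
// addressed to "from" itself, like broadcast(from, msg).
func (nw *Network) Broadcast(from int, msg Msg) {
	for to := 1; to <= nw.n; to++ {
		if to != from {
			nw.Send(from, to, msg)
		}
	}
}

// Rcv removes and returns the head of a non-empty queue addressed to dst.
// The spec chooses the source nondeterministically; this sketch takes the
// lowest-numbered one. ok is false in the case where the spec would block.
func (nw *Network) Rcv(dst int) (msg Msg, ok bool) {
	for src := 1; src <= nw.n; src++ {
		if q := nw.queues[src][dst]; len(q) > 0 {
			nw.queues[src][dst] = q[1:]
			return q[0], true
		}
	}
	return nil, false
}

func main() {
	nw := NewNetwork(3)
	nw.Broadcast(1, Msg{"type": "prepare"})
	m, ok := nw.Rcv(2)
	fmt.Println(m["type"], ok)
}
```

Note that this sketch is single-threaded. Sharing such a structure between concurrent processes would require every operation to be guarded by a lock, which is the exclusive acquisition of global state that this proposal tries to avoid.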

While PGo could naively compile the network modeled above as an actual global data structure, that would severely hurt the performance of the resulting systems:

In order to avoid these problems, PGo could be smart enough to translate message-based communication directly.

Proposal: Network operators

In the specification above, send, rcv, and broadcast are "network operators": they do nothing but maintain the state of the global network data structure. PGo could support a restricted definition of such operators in order to generate code that shares the same semantics.

Example: telling PGo our network operators in the compilation configuration file:

"networking": {
    "operators": {
        "variable": "network",
        "send": "send",
        "receive": "rcv",
        "broadcast": "broadcast"
    }
}

The meaning of the options above should be straightforward.
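As a rough illustration of how a compiler could read these options, the following Go sketch decodes the fragment with the standard encoding/json package. The type and function names (NetworkOperators, Config, ParseConfig) are hypothetical and not taken from PGo's code base, and the fragment is assumed to sit inside a top-level JSON object.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NetworkOperators mirrors the "operators" object of the proposed
// configuration. Each field names a definition in the PlusCal spec.
type NetworkOperators struct {
	Variable  string `json:"variable"`
	Send      string `json:"send"`
	Receive   string `json:"receive"`
	Broadcast string `json:"broadcast"`
}

// Config holds only the "networking" section; other sections of the
// configuration file are ignored by this sketch.
type Config struct {
	Networking struct {
		Operators NetworkOperators `json:"operators"`
	} `json:"networking"`
}

// ParseConfig decodes the JSON configuration into a Config value.
func ParseConfig(data []byte) (Config, error) {
	var cfg Config
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

func main() {
	raw := []byte(`{
		"networking": {
			"operators": {
				"variable": "network",
				"send": "send",
				"receive": "rcv",
				"broadcast": "broadcast"
			}
		}
	}`)
	cfg, err := ParseConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", cfg.Networking.Operators)
}
```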

The configuration above would cause PGo to compile the two-phase commit spec differently in the following aspects. First, the compiler would skip the declaration of the network variable:

  variable network = [from \in 1..NumProcs |-> [to \in 1..NumProcs |-> <<>>]];

The compiler would also skip the definitions of the network operators themselves (send, broadcast, and the rcv macro).

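A call to send, such as:

    network := send(rm, TM, [type |-> state, rm |-> rm]);

would be compiled to a runtime call along the lines of:

    runtime.SendMsg(rm, TM, map[string]string{
        "type": state,
        "rm": rm,
    })

Similarly, an invocation of the rcv macro:

    rcv(TM, tmsg);

would be compiled to:

    // tmsg declared as map[string]string
    // blocks until a message is received
    runtime.RcvMsg(TM, tmsg)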
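For illustration only, the runtime calls above could behave like the following in-process Go sketch, where each process owns a buffered channel acting as its mailbox. Everything here is an assumption: the Runtime type, NewRuntime, the string process IDs, and the mailbox capacity are hypothetical, and a real implementation would send messages over the network between nodes rather than over channels in one process.

```go
package main

import "fmt"

// Msg mirrors the map[string]string records used in the proposal.
type Msg map[string]string

// Runtime is a hypothetical in-process stand-in for the proposed runtime
// package: every process ID owns one buffered channel as its mailbox.
type Runtime struct {
	mailboxes map[string]chan Msg
}

// NewRuntime creates a mailbox for each of the given process IDs.
func NewRuntime(ids ...string) *Runtime {
	r := &Runtime{mailboxes: make(map[string]chan Msg)}
	for _, id := range ids {
		r.mailboxes[id] = make(chan Msg, 16) // capacity chosen arbitrarily
	}
	return r
}

// SendMsg places msg in the mailbox of "to". The "from" argument is unused
// here and kept only to match the shape of the proposed call. The call
// blocks if the destination mailbox is full.
func (r *Runtime) SendMsg(from, to string, msg Msg) error {
	box, ok := r.mailboxes[to]
	if !ok {
		return fmt.Errorf("unknown process %q", to)
	}
	box <- msg
	return nil
}

// RcvMsg blocks until a message addressed to "self" arrives, then copies
// its fields into buf.
func (r *Runtime) RcvMsg(self string, buf Msg) error {
	box, ok := r.mailboxes[self]
	if !ok {
		return fmt.Errorf("unknown process %q", self)
	}
	msg := <-box
	for k, v := range msg {
		buf[k] = v
	}
	return nil
}

func main() {
	rt := NewRuntime("tm", "rm1")

	// A resource manager reports its state to the transaction manager.
	go func() {
		_ = rt.SendMsg("rm1", "tm", Msg{"type": "prepared", "rm": "rm1"})
	}()

	tmsg := Msg{}
	if err := rt.RcvMsg("tm", tmsg); err != nil {
		panic(err)
	}
	fmt.Println(tmsg["type"], tmsg["rm"])
}
```

Unlike the global network variable, no process needs exclusive access to shared state: a sender only touches the destination's mailbox, and a receiver only touches its own.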

Other Specifications

Both specs model messages as TLA+ records, and therefore fit the model proposed here. Such specs could reasonably be changed to accommodate the restrictions proposed.

Limitations

rmascarenhas commented 6 years ago

Closing due to ongoing work on #75.