Modular PlusCal [proposal]

DistCompiler / pgo

PGo is a source to source compiler from Modular PlusCal specs into Go programs.

https://distcompiler.github.io/

Apache License 2.0

172 stars 13 forks source link

Modular PlusCal [proposal] #75

Closed fhackett closed 5 years ago

fhackett commented 6 years ago

This issue documents my proposal to extend the PlusCal language with features that enable composability.

Currently one PlusCal algorithm cannot interact with another in any way. They cannot refer to each other during model checking, and there is no consistent way to link PGo-compiled implementations together via shared variables since the algorithms assume exclusive access and this is liable to break invariants without significant modifications to the generated code.

The minimal set of features required for PlusCal to support modules are:

a parameterised variant of processes. Tentatively called "archetypes", these would be the same as processes but would be able to communicate with unknown outside processes via channels. These would be compiled into Go and would be controlled by communicating with them via the channels.
a way to define channels that would be used during model checking
a way to instantiate "archetypes", which would be used in order to generate a concrete model for TLC to check (TLC needs a specific set of processes and globals to work with)
PlusCal statements to send and receive over channels
an import directive to load other PlusCal algorithms (required since the compiled TLA+ needs to be very careful about managing global variables, native TLA+ imports and a TLA+ API are insufficient)

Example of what a PlusCal developer would do in order to use these things, assuming they are writing a system like RAFT where all the processes have the same code:

change the single process declaration to an "archetype" declaration, with parameters for any decisions that would normally be modeled using nondeterminism (like what to propose, etc... things that are needed by an implementation but not a model)
add instantiations of that "archetype" to the PlusCal algorithm, providing nondeterministically chosen values for all parameters
compile to TLA+, check that the instantiation of the algorithm works using TLC
compile to Go and write a small amount of Go code that starts/sets up an appropriate set of RAFT nodes. Note that the archetype is what was compiled, so the hand-written Go code is responsible for deciding what the real parameter values should be. This separates the redundant non-determinism used for model checking and actually meaningful non-determinism that is part of how the algorithm works, allowing users to cleanly replace the random model checking values with real implementation values while leaving the rest of the algorithm alone.
use the RAFT implementation to do things

Hopefully this give a decent outline of what I'm talking about - let me know if anything is unclear.

fhackett commented 6 years ago

Here is a more fleshed out example of the transformation.

Before

--algorithm counter {
  variables counter = 0;

  process (P \in 0..procs-1)
  variables i = 0;
  {
      w: while (i < iters) {
          incCounter: counter := counter + 1;
                      print counter;
          nextIter:   i := i + 1;
      }
  }
}

After

--algorithm counter {
  archetype P(self, counter, iters)
  variables i = 0;
  {
      w: while (i < iters) {
          incCounter: counter := counter + 1;
                      print counter;
          nextIter:   i := i + 1;
      }
  }
  (** testing **)
  variables counter = 0;
  process (P \in 0..procs-1) = instance P(self, counter, iters);
}

Now, if you wanted to use this algorithm from another PlusCal module:

--algorithm UseCounter {
  import counter;
  (** counter is not very useful, so we just demonstrate another process reading the variable **)
  process (Q \in 0..procs-1) = instance P(self, counter, iters);
  process (Observer = -1)
  variables i = 0;
  {
    w: while( i < iters ) {
      i := i+1;
      print counter;
    }
  }
}

Or if you want to use it from custom Go code (code is approximate, precise model for "global" variables needs work):

import "counter"

var counter int = 0;

func main(){
  // you can do anything here (incl. goroutines), but for simplicity we just call P directly
  counter.P(counter.MakeContext(/*TBD*/), 1, &counter, 15)
}

Here is an approximation of counter.go:

package counter

func P(/*context, TBD */, self int, counter *int, int iters){
  // generated code goes here
}

fhackett commented 6 years ago

Extra thoughts on codegen for globals in this context: what if we used unbuffered channels as locks?

As in, at the beginning of the program you send the initial values of all globals on channels. You then pass those channels to the relevant processes and when a process needs that variable they perform a receive. If that variable is available its value will be received. If it has already been received any other processes trying to receive it will block. When the process is done with the variable, it sends the new (possibly updated) value back along the channel.

The key advantage here is that you can pass channels around easily, whereas passing pointers around requires separately tracking a lock mechanism. This is likely to not compose as well in a modular context.

rmascarenhas commented 6 years ago

A couple thoughts/questions:

1) Have you thought about the precise semantics of your import keyword? Does it bring to scope only archetypes. or macros/procedures/operator definitions/globals are also imported?

2) Is this going to help you in your module system idea or is it unrelated? The "modular" in the name makes me think so, but I can't see how exactly.

3) Perhaps a better example to illustrate your idea would be to use the 2pc spec instead of counter. Even more useful would be to use the "official" specifications (available at http://lamport.azurewebsites.net/tla/two-phase.html), since they are written in more idiomatic PlusCal than the modified version in our examples directory. If you have the time, it would be great to see a sketch of how that spec would change to fit the proposal here.

fhackett commented 6 years ago

@rmascarenhas good questions.

At least archetypes. We would have to be very careful about scoping the rest though, since you'd have to keep in mind how it compiles to Go and TLA+. So: macros could be importable without problems. Procedures would not be so easy since they don't have proper return values, so you would need to assign to a global from the other module (which is broken as we should not be able to import globals). Globals should not be importable as that would break the separation archetypes gives us. Operators should be importable as normal from TLA+, so I'm not sure allowing them in PlusCal import would be useful.
this is my module system idea. It allows re-use of PlusCal compiled as Go (and possible swapping out of processes with alternate implementations).
I'll do that.

fhackett commented 6 years ago

Interesting point from a discussion with @mrordinaire yesterday, clarifying that my idea for Module PlusCal is directly replacing the PlusCal translator rather than compiling from Module PlusCal to regular PlusCal.

The important insight is that regular PlusCal is fundamentally broken when it comes to variable priming (that is, correctly generating ' and UNCHANGED in the generated code). Modular PlusCal changes the way globals are treated in subtle but important ways by altering the compilation strategy.

fhackett commented 6 years ago

P2TCommit.tla before

--algorithm TransactionCommit {
  variable rmState = [rm \in RM |-> "working"] ;
  define {
    canCommit == \A rm \in RM : rmState[rm] \in {"prepared", "committed"}
    notCommitted == \A rm \in RM : rmState[rm] # "committed" 
   }
  macro Prepare(p) {
    when rmState[p] = "working";
    rmState[p] := "prepared" ;    
   }

  macro Decide(p) {
    either { when /\ rmState[p] = "prepared" 
                  /\ canCommit ;
             rmState[p] := "committed"
           }
    or     { when /\ rmState[p] \in {"working", "prepared"}
                  /\ notCommitted ;
             rmState[p] := "aborted"
           }  
   }

  fair process (RManager \in RM) {
    start: while (rmState[self] \in {"working", "prepared"}) { 
      either Prepare(self) or Decide(self) }
   }
}

P2TCommit.tla after

--algorithm TransactionCommit {
  define {
    canCommit(rmState) == \A rm \in DOMAIN rmState : rmState[rm] \in {"prepared", "committed"}
    notCommitted(rmState) == \A rm \in DOMAIN rmState : rmState[rm] # "committed" 
   }
  macro Prepare(selfState) {
    when selfState = "working";
    selfState := "prepared" ;    
   }

  macro Decide(selfState, rmState) {
    either { when /\ selfState = "prepared" 
                  /\ canCommit(rmState) ;
             selfState := "committed"
           }
    or     { when /\ selfState \in {"working", "prepared"}
                  /\ notCommitted(rmState) ;
             selfState := "aborted"
           }  
   }

  archetype RManager(selfState, rmState) {
    start: while (rmState \in {"working", "prepared"}) { 
      either Prepare(selfState, rmState) or Decide(selfState, rmState) }
   }

  (* model checking section *)
  variable rmState = [rm \in RM |-> "working"] ;
  process (RManagerProc \in RM) == instance RManager(rmState[self], rmState);
}

fhackett commented 6 years ago

In consideration of #77 , here are my thoughts on how custom implementations might apply to Modular PlusCal.

The simplest option for a variable to just be variable. It is the least efficient and most general. It is also completely oblivious to packet drop and/or duplication at the moment.
As PlusCal stands, network concepts are implemented in an ad-hoc manner, making the general extraction of working Go code infeasible.
What we're trying to do here is allow users to map from PlusCal concepts to implementation concepts in a general and consistent way.

Example 1: if a network between two processes is modeled by a global variable and another process that "interferes" with that variable, we model reads and writes to that global as calls to an opaque object (taken as a Modular PlusCal parameter) that passes these through to a real network.

Example 2: if a process needs a timer, it can communicate with a second process via a shared global variable that "pretends" to keep time (since PlusCal is not realtime). In Go that variable can be modeled as an opaque object (taken as a Modular PlusCal parameter) that implements real semantics.

In summary, we can implement almost any semantics with a combination of model-check-only processes and flexible compilation of variable reads and writes. For ease of use we should provide a Go library of network/OS primitives that can be plugged into this mechanism. If someone wants something we don't have, the mechanism should be defined as an interface with a few necessary methods (acquire, release, ... not sure what else) so they can implement extras as necessary.

Extra note: this change would also remove the requirement for locking on global variables, and I'm not sure what place lock groups would have here. You could still have custom Go modules that use locking internally, but if a point to point network is correct for some variables you could dispense with locking semantics entirely for those.

Extra note 2: to avoid locking/synchronisation bugs, we could have PGo automatically generate alphabetically-ordered "acquire" and "release" calls the same way the backends operate currently.

@bestchai any thoughts? This is the "purest" adaptation of PlusCal that I can think of, with no additional semantic changes to Modular PlusCal.

fhackett commented 6 years ago

Example: send/recv with retry

Below is a set of archetypes that implement a send/receive pair with ack. This will be the basis for trying to express some custom error functions in Modular PlusCal.

archetype SendRetry(toSend, send, ack){
    variable currentMsg, seqNum = 0;
    w: while(TRUE){
        await Len(toSend) > 0;
        currentMsg := [ num -> seqNum, msg -> Head(toSend) ];
        toSend := Tail(toSend);
        retry: while(ack < seqNum){
            send := currentMsg;
        }
        seqNum := seqNum + 1;
    }
}

archetype Recv(recv, ack, receivedMsgs){
    w: while(TRUE){
        await recv.num = ack + 1;
        ack := ack + 1;
        receivedMsgs := Append(receivedMsgs, recv.msg);
    }
}

A lossless network (just "plug" the two instances into each other):

variables ack, pipe, toSend, receivedMsgs;

instance SendRetry(toSend, pipe, ack);
instance Recv(pipe, ack, receivedMsgs);

A lossy network with no reordering:

"vanilla" MPCal:

variables ack1, ack2, pipe1, pipe2, toSend, receivedMsgs;

instance SendRetry(toSend, pipe1, ack1);
instance Recv(pipe2, ack2, receivedMsgs);
process (Error = 1){
    w: while(TRUE){
        either {
            ack1 := ack2;
        } or {
            skip;
        };
        either {
            pipe2 := pipe1;
        } or {
            skip;
        };
    }
}

critique: introduces extra timesteps for the Error process to pass messages across pipe and ack, makes naming and tracking of state complicated for humans (look how long this simple example got).

with the "mapping" construct (semantics: within the scope of a specific instance, for mapping write <var> { ... }, replace writes to <var> with ..., where the original value to be written is called $value): [EDIT: CHOOSE does not cause branching, only the examples with either are semantically correct]
```
variables ack, pipe, toSend, receivedMsgs;

instance SendRetry(toSend, pipe, ack1)
    mapping write pipe { pipe := CHOOSE c \in { TRUE, FALSE } : IF c THEN $value ELSE pipe; }
instance Recv(pipe, ack2, receivedMsgs)
    mapping write ack { ack := CHOOSE c \in {TRUE, FALSE} : IF c THEN $value ELSE ack; }
```
critique: allows to alter the semantics of writes to variables in an archetype instance, so does not incur extra timesteps or naming difficulties. The pattern for "maybe change the value" is a bit repetitive though.

the "mapping" construct with the boilerplate factored out into a "Maybe" operator (operator defined inline for brevity):

variables ack, pipe, toSend, receivedMsgs;

Maybe(new, old) == CHOOSE c \in {TRUE, FALSE} : IF c THEN new ELSE old

instance SendRetry(toSend, pipe, ack1)
    mapping write pipe { pipe := Maybe($value, pipe); }
instance Recv(pipe, ack2, receivedMsgs)
    mapping write ack { ack := Maybe($value, ack); }

critique: similar to above, but shorter. opportunity for code reuse in the general case. It might be interesting to allow macros here as well.

Now for a similar exercise with a lossy and reordering-capable network (but with no packet duplication) ("vanilla" example omitted as it would only get longer and harder to get right):

variables ack = {}, pipe = {}, toSend, receivedMsgs;

instance SendRetry(toSend, pipe, ack)
    mapping write pipe { either {
                                         pipe := pipe \union { $value };
                                     } or {
                                         skip;
                                     }
                                   }
    mapping read ack { with chosen \in ack {
                                        ack := ack \ { chosen };
                                        read chosen;
                                    };
instance Recv(pipe, ack, receivedMsgs)
    mapping write ack { either {
                                         ack := ack \union { $value };
                                     } or {
                                         skip;
                                     }
                                   }
    mapping read pipe { with chosen \in pipe {
                                        pipe := pipe \ { chosen };
                                        read chosen;
                                    };

critique: while it is fairly clear what it going on here, this is getting longer again due to more duplicated code. Since we are effectively redefining a variable's read/write interface, we need a way to abstract away the "network-as-set" pattern.

Idea: factor out the duplicate code using a "mapping macro". This is a construct that encapsulates a complete remapping of a single variable. Assuming a library of standard constructs was available, this would make reuse of common communication models fairly straightforward.

variables ack = {}, pipe = {}, toSend, receivedMsgs;

mapping macro LossyReorderingNetwork {
    read {
        with chosen \in $value {
            $value := $value \ { chosen };
            read chosen;
        }
    }
    write {
        either {
            write $old \union { $value };
        } or {
            skip;
        }
    }
}

instance SendRetry(toSend, pipe, ack)
    mapping pipe via LossyReorderingNetwork
    mapping ack via LossyReorderingNetwork;
instance Recv(pipe, ack, receivedMsgs)
    mapping pipe via LossyReorderingNetwork
    mapping ack via LossyReorderingNetwork;

critique: what if we need two variables to be consistently remapped together, or a mapping only makes sense with a parameter?

fhackett commented 5 years ago

2PC and fail-stop

Mapping is useful when communication channels are fallible, but does not address when processes are fallible. One of the big ideas behind 2PC is logging and retry on failure, which needs simulated failure and "real" logging.

Below is a 2PC spec in Modular PlusCal that does not implement failure.

----------------------------- MODULE 2pc -----------------------------
EXTENDS Integers,Sequences,FiniteSets,TLC

(* --algorithm 2pc {

archetype Cohort(coordinatorCommunication, localSuccess, localResult) {
    waitForQuery:
        await coordinatorCommunication = "queryToCommit";
        coordinatorCommunication := IF localSuccess THEN "yes" ELSE "no";
    commitOrRollback:
        await coordinatorCommunication \in { "commit", "rollback" };
    setLocalResult:
        localResult := coordinatorCommunication;
    sendAck:
        coordinatorCommunication := "ack";
}

archetype Coordinator(cohortCommunication, success)
variables cohorts = {}, response;
{
    queryToCommit:
        cohorts := DOMAIN cohortCommunication;
        queryLoop: while (cohorts # {}) {
            with (c \in cohorts) {
                cohortCommunication[c] := "queryToCommit";
                cohorts := cohorts \ {c};
            };
        };
    waitForVotes:
        await \A c \in DOMAIN cohortCommunication : cohortCommunication[c] \in { "yes", "no" };
    responseToVotes:
        response := IF \E c \in DOMAIN cohortCommunication : cohortCommunication[c] = "no" THEN "rollback" ELSE "commit";
        cohorts := DOMAIN cohortCommunication;
        responseLoop: while (cohorts # {}) {
            with (c \in cohorts) {
                cohortCommunication[c] := response;
                cohorts := cohorts \ {c};
            };
        };
    waitForAck:
        await \A c \in DOMAIN cohortCommunication : cohortCommunication[c] = "ack";
    commitSuccess:
        success := TRUE;
}

variables comms = [ c \in 1..3 |-> "" ],
          cohortResult = [ c \in 1..3 |-> "" ],
          localSuccess \in [ 1..3 -> { TRUE, FALSE } ],
          coordinatorSuccess = FALSE;

(* testing *)
fair instance \in 1..3 of Cohort(comms[self], localSuccess[self], cohortResult[self]);
fair instance 0 of Coordinator(comms, coordinatorSuccess)

}
*)

In order to show that this algorithm cannot withstand fail-stop, I propose an additional mapping construct: the ability to inject code in between steps of an algorithm.

Here is how one would change the instance section of the above spec in order to add fail-restart cycles (and demonstrate that the algorithm as written is broken under that scenario, since if the cohort fails after receiving "queryToCommit" it will reboot and hang in that state):

(* testing *)
fair instance \in 1..3 of Cohort(comms[self], localSuccess[self], cohortResult[self])
    interleaving { goto waitForQuery; };
fair instance 0 of Coordinator(comms, coordinatorSuccess)
    interleaving { goto queryToCommit; };

Semantics of interleaving: generate TLA+ code such that the given statement(s) may or may not execute in place of the next label in the algorithm, unless the algorithm has reached the state "Done". If you do not want one of these statements to be always capable of executing, adding an await will make the execution conditional on whatever condition is needed.

rmascarenhas commented 5 years ago

I see some interesting ideas here, @fhackett, thanks for writing these detailed examples.

Comment 1
- is the emptyMsg constant some leftover from a previous version of your idea? If so, better remove it so it's less confusing.
- If I understand this right, toSend is a sequence of messages you want each instance of SendRetry to send, is that right? If so, it would make the example more realistic to actually pop from the list, or at least iterate over its elements. I know these are just drafts, but making them as close to "reality" as possible would help understand the ideas (at least I think it would).
- Another question: is toSend supposed to be local to instances of sendRetry? Does it matter?
- I don't think the Maybe operator has the meaning you intended it to. CHOOSE will not cause TLC to branch, so you wouldn't explore both cases (i.e., message is sent or dropped). I'm sure you know this since you used either in your following examples, so this is also in the "making the examples realistic" category.
- I'm not sure I understand the idea of "mapping macros". Is the idea of these macros always define read and write "operators"? I was confused by the fact that you have read chosen within the definition of read itself, but my current understanding is that in read chosen, read is some special keyword that only has meaning within a mapping macro. Is that right?
Comment 2
- Syntax nitpick: I think you probably meant to say fair instance 1..3 of Cohort? The \in currently there looks out of place.
- Not clear to me: the BOOL you added to localSuccesses is just a placeholder for an actual boolean the spec writer would use, or is that some special value in Modular PlusCal? You probably want something with the semantics of "either true or false" there (i.e., have TLC to branch every possible decision for every node in cohort).
- Is it right to say interleaving { S } can be desugared by wrapping every step with either { step } or { S }? If so, we can experiment with this in regular PlusCal to see if works well as a technique to express failure/recovery/restart.
General comments.
- This does seem to transform PlusCal into a different beast. It has the advantage of being a lot more general than #77, but on the other hand the changes to existing specs would be a lot more involved.
- I wonder how the compiler will optimize "mappings" as in the examples you have here, since the mechanisms seem to be pretty general.
- I guess a good exercise would be to try to write a complicated distributed algorithm in Modular PlusCal. Maybe WPaxos (which has an existing PlusCal spec), or Raft (a lot more work). Otherwise it's hard to know whether we are missing something critical. I can help with that.

fhackett commented 5 years ago

I've added responses in-line - thanks for your comments.

On Wed, Jul 25, 2018 at 4:22 PM, Renato Costa notifications@github.com wrote:

I see some interesting ideas here, @fhackett https://github.com/fhackett, thanks for writing these detailed examples.

-

Comment 1 https://github.com/UBC-NSS/pgo/issues/75#issuecomment-405432708

is the emptyMsg constant some leftover from a previous version of your idea? If so, better remove it so it's less confusing.

I can't remember why that was there, and can't find any references to it either. Removed.

-

If I understand this right, toSend is a sequence of messages you want each instance of SendRetry to send, is that right? If so, it would make the example more realistic to actually pop from the list, or at least iterate over its elements. I know these are just drafts, but making them as close to "reality" as possible would help understand the ideas (at least I think it would).

I have added a couple of lines in order to make the algorithm more correct - the corrected version is what was intended.

-

Another question: is toSend supposed to be local to instances of sendRetry? Does it matter?

toSend is how one would control an instance of SendRetry - you would pass in a list from Go and the algorithm would send the contents of the list (in theory you could update the list dynamically so you're not limited to some initial list value. That is the main reason why toSend is not local.

-

I don't think the Maybe operator has the meaning you intended it to. CHOOSE will not cause TLC to branch, so you wouldn't explore both cases (i.e., message is sent or dropped). I'm sure you know this since you used either in your following examples, so this is also in the "making the examples realistic" category.

My mistake. I had forgotten that that was how TLC worked. Added a note to the example.

-

I'm not sure I understand the idea of "mapping macros". Is the idea of these macros always define read and write "operators"? I was confused by the fact that you have read chosen within the definition of read itself, but my current understanding is that in read chosen, read is some special keyword that only has meaning within a mapping macro. Is that right?

Actually it's the other way around. Mapping macros mirror the way you would be able to pass in custom Go implementations of variables in Go, but for model checking. Whenever a variable is read (i.e any reference in a critical section), the corresponding "read" code would run. Same for write, but with assignment. The "read" and "write" statements are a way of specifying the effective result of the mapping macro, like "return" for functions.

-

Comment 2 https://github.com/UBC-NSS/pgo/issues/75#issuecomment-407263241

Syntax nitpick: I think you probably meant to say fair instance 1..3 of Cohort? The \in currently there looks out of place.

Sure, syntax is up for debate. The only issue is how to differentiate between sets of processes and single processes, since syntactically your change would be indistinguishable from a single instance named {1, 2, 3}.

-

Not clear to me: the BOOL you added to localSuccesses is just a placeholder for an actual boolean the spec writer would use, or is that some special value in Modular PlusCal? You probably want something with the semantics of "either true or false" there (i.e., have TLC to branch every possible decision for every node in cohort).

That is a mistake. Corrected to { TRUE, FALSE } (I must have forgotten to correct the code in the post - I found and fixed that bug in my test files before).

-

Is it right to say interleaving { S } can be desugared by wrapping every step with either { step } or { S }? If so, we can experiment with this in regular PlusCal to see if works well as a technique to express failure/recovery/restart.

That sounds like a valid desugaring. The actual process I imagined for direct-to-TLA compilation was to add extra branches nearer to Next, but your version should be equivalent if more verbose ... though one possible hurdle is control flow like while and if, which may be harder to convert. Worth a try at least.

-

General comments.

This does seem to transform PlusCal into a different beast. It has the advantage of being a lot more general than #77 https://github.com/UBC-NSS/pgo/issues/77, but on the other hand the changes to existing specs would be a lot more involved.

I agree. The original intention here was to convert our existing 2PC spec, but the changes required in order to specify a proper API were non-trivial. There is definitely a "you may have to rethink your code" disclaimer, but I think that is a fundamental property of adding an API to code that did not originally have one.

-

I wonder how the compiler will optimize "mappings" as in the examples you have here, since the mechanisms seem to be pretty general.

Mappings are for model-checking so they will not appear in the Go output, since the idea is to compile only the archetypes. If we did compile the mappings (say, in order to generate an example main function that runs the archetypes as described in the spec), they would probably just be expanded like macros and then be optimised (or not) like anything else.

-

I guess a good exercise would be to try to write a complicated distributed algorithm in Modular PlusCal. Maybe WPaxos (which has an existing PlusCal spec), or Raft (a lot more work). Otherwise it's hard to know whether we are missing something critical. I can help with that.

Any help would be appreciated there - I ran into a wall on the Raft front, and was unaware of WPaxos. One spec I'd like to add is 3PC, even if we just write it from scratch since it features timeouts and I want to make sure the mappings support that properly.

-

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/UBC-NSS/pgo/issues/75#issuecomment-407927096, or mute the thread https://github.com/notifications/unsubscribe-auth/AImmObSmwBUBXvAuUkGmacmxJ3h5ZSruks5uKP28gaJpZM4Uy_qH .

fhackett commented 5 years ago

Here is an example spec showing proposed pass by reference semantics for discussion later today.

The motivation is being able to use procedures inside archetypes, where the globals usually operated on by procedures are parameterised and thus not accessible with the original procedure semantics.

----------------------------- MODULE 2pc -----------------------------
EXTENDS Integers,Sequences,FiniteSets,TLC

(* --algorithm 2pc {

archetype Cohort(ref coordinatorCommunication, localSuccess, ref localResult) {
    waitForQuery:
        await coordinatorCommunication = "queryToCommit";
        coordinatorCommunication := IF localSuccess THEN "yes" ELSE "no";
    commitOrRollback:
        await coordinatorCommunication \in { "commit", "rollback" };
    setLocalResult:
        localResult := coordinatorCommunication;
    sendAck:
        coordinatorCommunication := "ack";
}

procedure Broadcast(ref cohortCommunication, msg)
variable cohorts = {}; {
        cohorts := DOMAIN cohortCommunication;
        queryLoop: while (cohorts # {}) {
            with (c \in cohorts) {
                cohortCommunication[c] := msg;
                cohorts := cohorts \ {c};
            };
        };
        return;
}

archetype Coordinator(ref cohortCommunication, ref success)
variables response;
{
    queryToCommit:
        call Broadcast(ref cohortCommunication, "queryToCommit");
    waitForVotes:
        await \A c \in DOMAIN cohortCommunication : cohortCommunication[c] \in { "yes", "no" };
    responseToVotes:
        response := IF \E c \in DOMAIN cohortCommunication : cohortCommunication[c] = "no" THEN "rollback" ELSE "commit";
        call Broadcast(ref cohortCommunication, response);
    waitForAck:
        await \A c \in DOMAIN cohortCommunication : cohortCommunication[c] = "ack";
    commitSuccess:
        success := TRUE;
}

variables comms = [ c \in 1..3 |-> "" ],
          cohortResult = [ c \in 1..3 |-> "" ],
          localSuccess \in [ 1..3 -> { TRUE, FALSE } ],
          coordinatorSuccess = FALSE;

(* testing *)
fair instance \in 1..3 of Cohort(ref comms[self], localSuccess[self], ref cohortResult[self]);
fair instance 0 of Coordinator(ref comms, ref coordinatorSuccess)

}
*)

fhackett commented 5 years ago

`ref` semantics

ref may be used in the parameter list of an instance, an archetype, a procedure definition or a call statement (macros are unaffected, since they simply expand syntactically)
ref is used in a definition to indicate that an argument is call by name rather than call by value
ref is used at a callsite to show that the parameter is being passed by name
ref may be used to qualify any parameter that is also assignable
ref may not be nested, and the global a ref refers to may not change once it has been initialised. Aliasing is possible, but well defined. Using to the ref by referencing it or assigning to it will directly affect the referenced global. In the context of aliasing, if a and b refer to the same global and one were to write a := b + 1;, the code would operate in exactly the same way as x := x+1. Likewise, if we alias a to x and b to x[6], the resulting code would operate in the same manner as if x[6] := x had been written. Aside from simple sharing like I have described here, no other aliasing is possible due to the inability to store a reference in a variable (ref has the same limitation as C++ references compared to pointers).
taking a ref of a ref will simply cause the second ref to refer to the same global as the first.
taking a ref of a local variable will have the same semantics as for globals

TLA+ compilation for `ref`:

Archetypes will be compiled to a series of operators (matching the names of existing labels). If there is a label op, typical process compilation will produce either op == ... or op(self) == ... depending on whether it is a single process or a set of processes. If an archetype with a label op takes ref var as an argument, the compiled label might look like op(var) == .... This works because TLA+ allows statements like var' = 3 even when var is an operator argument, so mutations to var will propagate to the operator callsite.

When compiling an instance of an archetype with label op and the parameter var, passing in ref someGlobal, the specification's Next disjunction will contain op(someGlobal). This means the archetype will affect the global while calling it var.

The same can be essentially said of procedures, but with the following restrictions:

trying to ref a procedure's local variable(s) should fail as they are reassigned as part of the calling convention. Passing along ref arguments is fine.
different invocations of a procedure with different ref arguments should have different pc values, appearing in Next as something like op("specialPC", var) instead of just op(var) like with archetypes.
mapping macros require special consideration (see below)

What happens if you take a `ref` of a variable that is backed by a mapping macro?

The simplest approach would be to recompile the procedure once per mapping, injecting the appropriate code on variable reads and writes. There may be nicer ways to do it, but for now this should keep things (relatively) simple (see below).

Bonus: mapping macro applied to `x.y := z`

Considering the above raised the question of how in general a mapping macro should respond to x.y := z, where x is subject to the mapping macro.

It is easy to consider what happens in the case x := y, since that might obviously translate to the macro contents with $value replaced by y and write xx replaced by x := xx. The tricky part is noticing that x.y in x.y := z refers to .y of the mapped x, not its backing value.

This can be resolved like so: if original is the value returned by the read portion of the mapping macro, $value should be defined as [ original EXCEPT !.y = z], with write xx defined as x := xx. In order to be able to convert between x.y and an EXCEPT expression consistently, we need to be aware that the left-hand side of an assignment can only be a plain variable or an index of some kind. Anything else is either invalid at the TLA+ level or forbidden by the PlusCal translator.

fhackett commented 5 years ago

`x.y := z` with mappings pt. 2

Having (re)read the PlusCal grammar (my bad for claiming there wasn't one, it's the formal semantics that are missing not the grammar), I can confirm that an assignment LHS is limited to precisely some identifier a followed by a dot .b or an index [c, d ...].

For the following series of statements:

a.b := c;
a.d := a.b;
d[5] := a.d;

A reasonable translation might be:

LET aRead == <result of applying read mapping to a goes here>
    aPrime == [aRead EXCEPT !.b = c]
    aPrimePrime == [aPrime EXCEPT !.d = aPrime.b]
    dRead == <result of applying read mapping to d goes here>
    dPrime == [dRead EXCEPT ![5] = aPrimePrime.d]
IN
    /\ a' = <result of applying the write mapping to aPrimePrime goes here>
    /\ d' = <result of applying the write mapping to dPrime goes here>

Obviously there are a lot of optimisations that can be done in special cases like the code not containing assignments to fields or indices, but I can't think of any cases that would cause this translation to be semantically inaccurate.

(caveat: real codegen may be uglier due to nested let expressions - imagine what would happen if you put a print statement in between some of those assignments and we had to add a /\ case in the middle of the let declarations)

rmascarenhas commented 5 years ago

Thanks for thinking this through, @fhackett!

A couple of questions/comments follow:

Not sure I understand both of these statements:

different invocations of a procedure with different ref arguments should have different pc values

The simplest approach would be to recompile the procedure once per mapping

Maybe examples would make it easier for me to understand what you propose.

I find the strategy of mapping macros with TLA+ functions (or tuples/records) a bit confusing and I think we may benefit if we can avoid those murky waters if at all possible.

In your example:

a.b := c;
a.d := a.b;
d[5] := a.d;

the read macro for a is only invoked for a.b. I had the understanding before that read macros would be evaluated whenever a variable is read, but this example makes me think the semantics are more complicated. And if that's the case, we should write down these semantics more carefully.
what if y already has a mapping macro in x.y := z? Are we giving them precedences? Invoking both (in some order)? Compilation error (this could be tough to catch statically with aliasing)?
The write macro is also only invoked once, despite the fact that two write operations were performed.
What would happen if I had something like:

procedure MyProc(ref x) {
  x := 10;
}

archetype Arch(ref map)
variable localMap = [one |-> 1] {
  call MyProc(ref localMap.one)
}

Would it be allowed? Were you planning to have MyProc receive the entire map when translating that to TLA+? You seemed to imply that we would allow "interior references", but I wonder how much it would complicate things.

My goal here is to make things more straightforward not only in terms of PGo's implementation, but also understanding what's going on under the hood when we are writing specs in modular PlusCal. The semantics seem to be pretty nuanced and I'm thinking how we could drop features without sacrificing expressiveness.

One possibility would be to not support mapping macros on tuples and records (i.e., not at the granularity you are proposing here). But things could still work if we restricted their assignment to tupl := [tupl EXCEPT ![idx] = val], which should be fine for most needs.

fhackett commented 5 years ago

More compilation thoughts and discussion

I have tried some more experiments and it seems that any "smart" compilation where we parameterise references using TLA+ tricks will not work. The problem is generating correct UNCHANGED expressions, which requires knowing every variable that has not been assigned to in any given step.

Under a mapping macro that might assign to any global variable during its execution, it is impossible to statically compute what side-effects any part of a Modular PlusCal algorithm might have. Attempts at expressing dynamic computation of this have failed with TLC errors, so I'm leaning toward "impossible as far as I can tell" as the status of "smart" compilation of Modular PlusCal.

So, this leaves us with complete static expansion of ref and mapping macro, duplicating any critical sections that are affected by the semantics of these constructs. At this state we can expect compiled TLA+ to look a lot like the output of the original PlusCal translator, but possibly generating multiple different versions of labels.

@rmascarenhas to address your comments in order:

I find the strategy of mapping macros with TLA+ functions (or tuples/records) a bit confusing and I think we may benefit if we can avoid those murky waters if at all possible.

it seems that was impossible anyway (see above). I agree, that was confusing.

the read macro for a is only invoked for a.b. I had the understanding before that read macros would be evaluated whenever a variable is read, but this example makes me think the semantics are more complicated. And if that's the case, we should write down these semantics more carefully.

I see. ~To clarify, what is happening here is that a is being read once. Multiple references within a critical section act as a single atomic read, so by this logic the read mapping must be executed once. The confusing addition that we are reading a.b and a.d is due to TLA+'s weird priming semantics. Normally you can't assign to a twice, but you can assign to different fields of the same a as many times as you like.~

I tried this in the TLA+ toolbox to double check and I have no idea how I reached that idea. Thank you for making me check this. The correct semantics are that a.b := c is a shorthand for a' = [a EXCEPT .b = c ], plain and simple. Well that makes this a lot easier. First compile to the EXCEPT form, then feed the value to the write macro. Done!

what if y already has a mapping macro in x.y := z? Are we giving them precedences? Invoking both (in some order)? Compilation error (this could be tough to catch statically with aliasing)?

You can't actually map y. Since the only way y could be mapped is by having it be a ref parameter to an archetype, accessing a field y of x just assumes that x was mapped to a record with a field y. Put another way, you can provide a field of an existing object as a ref parameter, but you can't somehow change the fields of something passed via ref to be anything new.

This does bring up the issue of passing the same object in multiple positions as ref parameter, but as far as I can see we can just reject that if we notice references to the same thing in different parameters to the same call (since this is just asking for multiple-assignment compilation errors). Good thing to think about though, thanks for asking that.

The write macro is also only invoked once, despite the fact that two write operations were performed.

See my response to read. Seems like this was never a thing. :man_facepalming:

Write macros will be executed only once, not because of some weird logic but because you can only assign to something once in the same critical section.

What would happen if I had something like: [...]

Given our strategy to statically expand these things, assignments to parameter x with value ref localMap.one would compile as if they were written localMap.one := ..., which would compile as discussed above. Aren't I glad you made me re-evaluate some things... otherwise my answer would have been much less simple.

fhackett commented 5 years ago

Progress update on PlusCal translator:

there is a new flag -tlagen that causes PGo to pretend to be the PlusCal translator
Queens.tla and Euclid.tla can be translated (needs proper verification / much better code formatting, but the output at least looks plausible)
most control flow except for procedures (not implemented yet) seems to work
only single-process compilation is supported at the moment (but what is there should be easily portable to multi-process)
I have an untested skeleton for compiling ref and mapping semantics

Aside: it seems that you can write a.a := b || a.c := d (specifically on the same line), which compiles to [a EXCEPT !.a = b !.c = d]. This does not raise any issues with anything I posted above, but is an edge case that we need to account for.

rmascarenhas commented 5 years ago

Closing this in favor of the Wiki entry on Modular PlusCal, which should include the result of all the discussion that happened here and offline.

DistCompiler / pgo

Modular PlusCal [proposal] #75

Before

After

P2TCommit.tla before

P2TCommit.tla after

Example: send/recv with retry

2PC and fail-stop

-

-

ref semantics

TLA+ compilation for ref:

What happens if you take a ref of a variable that is backed by a mapping macro?

Bonus: mapping macro applied to x.y := z

x.y := z with mappings pt. 2

More compilation thoughts and discussion

`ref` semantics

TLA+ compilation for `ref`:

What happens if you take a `ref` of a variable that is backed by a mapping macro?

Bonus: mapping macro applied to `x.y := z`

`x.y := z` with mappings pt. 2