DistCompiler / pgo

PGo is a source-to-source compiler from Modular PlusCal specs into Go programs.
https://distcompiler.github.io/
Apache License 2.0

Integrating the generated main function with aggregate mapping macros (#103) #104

Closed by fhackett 5 years ago

fhackett commented 5 years ago

Originally we would generate a Go main function that just defaulted all archetype parameters to some generic implementation.

With aggregate (#103) mapping macros we need a slightly more sophisticated system.

Currently our config looks like this (for the counter MPCal example):

{
  "build": {
    "output_dir": "gen",
    "dest_file": "out.go"
  },
  "networking": {
    "enabled": true,

    "state": {
      "strategy": "state-server",
      "peers": ["10.0.0.1:1111", "10.0.0.2:2222", "10.0.0.3:3333"]
    }
  },
  "constants": {
    "iters": "10",
    "procs": "3"
  }
}

This assumes that each process will communicate with others via a single connection, so each peer's instance of some variable can be accessed knowing only IP:port.

For our work-in-progress channel-like abstractions, many of these assumptions no longer hold.

So, let's imagine we have an instantiation of some archetype that looks something like this:

fair process (Consumer \in 1..NUM_CONSUMERS) == instance AConsumer(ref network, ref queueProcessor)
    mapping network[_] via LossyWrites;
fair process (Producer = ProducerId) == instance AProducer(ref network, ref queueStream)
    mapping network[_] via LossyWrites;
    mapping queueStream via CyclicReads;

Now, while in the model checking context we just rely on expanding the mapping macro, in the Go implementation we want to replace it with a concrete implementation. What that implementation should be is not something we can infer, since it resides in the model developer's understanding of what LossyWrites means in reality.

The other problem we have to contend with is in the form of implementation details: how many connections does network represent, and between whom? It's convenient for the model checker to just pretend everyone is connected, but this breaks down when we have to actually set up TCP (or something like that). Just connecting everyone may be technically correct (and is an option, I suppose, as long as you tell PGo what's a channel and what's a global variable), but is probably not very convincing even if sold as a naive solution.

Perhaps the simplest approach to dealing with this is, just as we already require the user to describe the IP:port pairs on the network, to require the user to describe the peer-to-peer connections involved.

Naively speaking, let's add something like this to the configuration:

"channels": {
    "Producer": {
        "network": {
            "ProducerId": "@self",
            "1": "Consumer(1)",
            "2": "Consumer(2)",
            "3": "Consumer(3)"
        },
        "queueStream": "@stdio"
    },
    "Consumer(1)": {
        "network": { "ProducerId": "Producer", "1": "@self" },
        "queueProcessor": "@stdio"
    },
    "Consumer(2)": {
        "network": { "ProducerId": "Producer", "2": "@self" },
        "queueProcessor": "@stdio"
    },
    "Consumer(3)": {
        "network": { "ProducerId": "Producer", "3": "@self" },
        "queueProcessor": "@stdio"
    }
}

This boils down to a mapping from process (that is, an instance of an archetype) to a mapping from parameter name to some description of how to implement that parameter: strings for single variables, (nested if necessary) dictionaries for aggregate maps. For completeness/backwards compatibility, let's say that anything not claimed to be a channel gets the default global variable treatment.

Notation:

So, I'm sure there are flaws (and some embedded assumptions I'm still figuring out) in what I'm proposing but hopefully this gets the discussion going. Apologies this took me so long to post - it's been a long week (as usual these days it seems).

P.S. You would also need to give a mapping from channel to port number, wouldn't you - one port per archetype parameter (multiplied by level of nesting - 1 for nested aggregates).

fhackett commented 5 years ago

After some discussion, we decided that it is (for the MVP at least) easier to implicitly make all processes connected. This re-uses much of what we already have, and with a little laziness in the implementation it can behave equivalently to explicit configuration of different network topologies.

To summarise, here is what the config fragment might look like now:

"archetype-params": {
    "network": "channel",
    "queueStream": "stdio",
    "queueProcessor": "stdio"
}

The assumptions embedded here are:

rmascarenhas commented 5 years ago

We decided not to go this route.