MatrixAI / Architect

Programming Language for Type Safe Composition of Distributed Infrastructure
Apache License 2.0

Protocol Specification - Communication Semantics Behaviour #6

Open olligobber opened 6 years ago

olligobber commented 6 years ago

A demo of Session Types has been created at ProtocolResearch/SessionTypesDemo. It may need refining: currently a single protocol may be expressible in multiple ways, so comparing protocols for compatibility can give incorrect results. With the tools present in this demo, it appears possible to encode basic TCP, UDP and HTTP, assuming that we can encode the structure of the packets being communicated.
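As a rough illustration of the kind of encoding such a demo explores (the names here are hypothetical, not the demo's actual API), a protocol can be represented as a term of send/receive actions, with two endpoints compatible exactly when one is the dual of the other:

```python
# Minimal session-type sketch (hypothetical encoding, not the demo's API).
# A protocol is either End, or an action ("send"/"recv") of some payload
# followed by a continuation.
End = ("end",)

def send(payload, cont):
    return ("send", payload, cont)

def recv(payload, cont):
    return ("recv", payload, cont)

def dual(proto):
    """The peer's view: sends become receives and vice versa."""
    if proto == End:
        return End
    action, payload, cont = proto
    flipped = "recv" if action == "send" else "send"
    return (flipped, payload, dual(cont))

def compatible(p, q):
    """Two endpoints compose iff each is the other's dual."""
    return dual(p) == q

# A trivial ping/pong exchange: client sends "ping", then receives "pong".
client = send("ping", recv("pong", End))
server = recv("ping", send("pong", End))
assert compatible(client, server)
```

Note that because `compatible` here is purely syntactic, two different terms denoting the same protocol would wrongly compare as incompatible, which is exactly the refinement problem the demo has.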

CMCDragonkai commented 6 years ago

Philip Wadler did most of the work on session types and blame & contracts.

Work on type-safe protocol specification is everywhere on the internet. But I found a 2014 thesis about this: http://simonjf.com/writing/bsc-dissertation.pdf (involving Haskell)

CMCDragonkai commented 6 years ago

It would be nice if the protocol specification could cover Automatons that don't directly communicate over the network. Instead they communicate with file paths and file descriptors (or file descriptor paths).

Basically, imagine a process that takes a file path and produces another file at another file path. These kinds of processes tend to be outside the domain of networked systems, but big data workloads are starting to do these kinds of tasks, processing large numbers of files in a directory.

CMCDragonkai commented 6 years ago

Consider the problem of sub-schemas: https://raml.org/developers/raml-200-tutorial#body-parameters

While an HTTP message can be part of an overall interaction, within the HTTP message it may encapsulate sub-schemas such as JSON or CSV or YAML. These are generally represented via media types: https://www.iana.org/assignments/media-types/media-types.xhtml

Some media types have official and unofficial schemas developed for them, such as XML Schema or JSON Schema. More info about this is here: https://github.com/raml-org/raml-spec/blob/master/versions/raml-10/raml-10.md/#using-xml-and-json-schema

The other problem is that an Automaton may be expecting data over a TCP socket, because it's literally using BSD socket syscalls: http://man7.org/linux/man-pages/man2/socket.2.html Think of a C-based Automaton where the executable creates the socket structure, binds it to an address, enables listening, then starts accepting connections, spinning up other threads to handle receiving and sending on these connections. Within these TCP connections it may be speaking HTTP.

However, it is equally possible that the C application instead creates a Unix domain socket, binds it to a socket file on a filesystem, listens, and so on. For example, here's a Go app that does this: https://gist.github.com/teknoraver/5ffacb8757330715bcbcc90e6d46ac74 Regardless of how unlikely this is, it is still a matter of having the correct foundations. These two applications are not compatible with each other, because one is speaking HTTP over Unix domain sockets while the other is speaking HTTP over TCP.

Usually this is resolved by the fact that the address scheme for something running on TCP is very different from the address scheme for something running on Unix domain sockets; because the address schemes differ, the two are made incompatible with each other. However, Automaton addressing is automatically handled by the Architect language and the Orchestrator. The Orchestrator must be able to resolve the dependencies from the abstract content-addressable service to a protocol-specific address.

If we were to support both applications, one could imagine that the address injected for HTTP over Unix domain sockets would be a file path, while the address injected for HTTP over TCP would be an IP + port combination; at the Architect language level the expression does not change except for how the types are expressed.

More info on how sockets work here: https://www.ibm.com/developerworks/aix/library/au-tcpsystemcalls/
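To make the address-injection idea concrete, here is a hedged sketch (all names hypothetical): the Orchestrator resolves an abstract service reference to a transport-specific address, and only the type of the injected address differs between the two deployments:

```python
# Hypothetical sketch of transport-specific address injection.
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class TcpAddress:
    host: str
    port: int

@dataclass(frozen=True)
class UnixAddress:
    path: str  # filesystem path of the socket file

Address = Union[TcpAddress, UnixAddress]

def render(addr: Address) -> str:
    """What the injected address looks like to the Automaton."""
    if isinstance(addr, TcpAddress):
        return f"{addr.host}:{addr.port}"
    return addr.path

# Same service, two transports: the Architect-level expression would not
# change; only the address type resolved by the Orchestrator does.
assert render(TcpAddress("10.0.0.5", 8080)) == "10.0.0.5:8080"
assert render(UnixAddress("/run/app.sock")) == "/run/app.sock"
```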

CMCDragonkai commented 6 years ago

It may be good to have a system that first just lets us cheat by using undefined (basically saying we don't have an implementation for this in the spec, although we have the implementation) and just tagging it, saying "yes, we know this is TCP/IP". In the future we could extend it to actually implement the protocol spec for TCP/IP.

CMCDragonkai commented 6 years ago

So we could simplify this problem by only considering protocols at the level of a bytestream. That is, assuming the underlying transport provides a bytestream, all protocol specification is built on that foundation. Thus HTTP frames, plus whatever content is inside, can be specified and then compiled into a BPF-based linter. As to the question of how to then apply the monitor, I imagine we just have constructors with holes that we expand on later.

For example, assuming TCP as a baseline, we can write something like TCP (HTTP (...)). The last ... can be expanded in later versions. I'm also wondering whether this hierarchical specification works better than a flatter composition with some operator.
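A hedged sketch of the "constructors with holes" idea (hypothetical names, not an actual Architect construct): each layer wraps an inner protocol, and a Hole marks a part of the stack left unspecified until a later version fills it in:

```python
# Hypothetical sketch of hierarchical protocol layering with holes.
from dataclasses import dataclass

@dataclass(frozen=True)
class Hole:
    """An unspecified inner protocol, to be expanded later."""

@dataclass(frozen=True)
class Layer:
    name: str
    inner: "Layer | Hole"

def fill(proto, replacement):
    """Expand the innermost hole with a more refined protocol."""
    if isinstance(proto, Hole):
        return replacement
    return Layer(proto.name, fill(proto.inner, replacement))

# TCP (HTTP (...)): the innermost content is left as a hole for now.
stack = Layer("TCP", Layer("HTTP", Hole()))
refined = fill(stack, Layer("JSON", Hole()))
assert refined == Layer("TCP", Layer("HTTP", Layer("JSON", Hole())))
```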

Also, this means we ignore out-of-band control constructs and their semantics; we only consider their syntax (which is what BPF can handle). Verifying their semantics has to be solved another way, perhaps with a combination of AOT checking (model checking) and some other runtime monitoring.

CMCDragonkai commented 6 years ago

So there are 2 purposes of the Protocol Specification:

  1. To act as a declarative specification of the communication behaviour of services. This is what allows us to figure out whether 2 services can actually compose together. We use session type research because it appears to be the state of the art in terms of specification of communication behaviour.
  2. To act as a declarative specification of the operational network performance behaviour in terms of being able to compile the protocol specification into BPF code that can be used to do runtime instrumentation of the network communication between Automatons.

Session types don't concern themselves with the runtime network performance of the communication. There's no notion of latency or throughput... etc. These attributes are actually later specified via quality of service constraints, but they are not directly specified inside the protocol specification.

To achieve these 2 purposes, we need 2 sublanguages in the Protocol Specification.

To do the first, we need to apply session type theory to network protocols that are currently in use, the most common being TCP + HTTP + JSON. This means a session type actually nests other session types. Note that TCP doesn't really provide choices (we can think of it as a single choice), HTTP provides choices (GET/POST/PUT/DELETE) and routes, and then at the JSON layer we may have further choices. I'm not entirely sure about the relationship between choices and different API calls. But here's an example:

We have an HTTP service that takes JSON and answers in JSON. At the HTTP level, the choices are GET and POST, meaning the HTTP service offers "GET" and "POST". This doesn't actually mean the HTTP service has to send an "offer message"; this is just at the type level. The client now chooses either a GET or POST message to send. When the HTTP service receives a GET message, it returns a response. The response is itself an HTTP message that contains a JSON body, and that JSON body can just provide data. In another situation, the client sends an HTTP POST; this is already a choice taken at the HTTP session level. But within the HTTP POST there's a JSON message, and that JSON message has been constructed in a particular way according to some JSON schema:

{
    "title": "Person",
    "type": "object",
    "properties": {
        "firstName": {
            "type": "string"
        },
        "lastName": {
            "type": "string"
        },
        "age": {
            "description": "Age in years",
            "type": "integer",
            "minimum": 0
        }
    },
    "required": ["firstName", "lastName"]
}

The above example was taken from JSON Schema. The particular construction of the JSON at the client level was also a choice among potential possibilities; it's just that the possibilities are effectively infinite. The client may have packaged up these choices:

{ firstName: "abc", lastName: "abc" }
{ firstName: "abc", lastName: "abc", age: 1 }
{ firstName: "abc", lastName: "abc", age: 2 }
{ firstName: "dfsg", lastName: "dfeg", age: 10 }
... and so on

The above are all potential instantiations of the JSON schema type. Are they not also choices? If we take an even simpler JSON schema, one that specifies an enum:

{
  "type": "string",
  "enum": ["red", "amber", "green"]
}

This schema is much closer to a session type, since there are now only 3 choices:

"red"
"amber"
"green"

Taking this to the Protocol Specification, this means there are nested session types, nested across protocol layers rather than within a single layer. One can choose GET/POST, and then within POST choose red or amber or green.
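The nested choice structure above can be sketched as follows (a hypothetical encoding, just to make the layering visible): an offer node maps labels to continuations, and a client's behaviour is a path of labels through those offers:

```python
# Hypothetical sketch: nested offers across protocol layers.
# An offer is a dict from label to continuation; "end" terminates.
protocol = {
    "GET": "end",
    "POST": {            # inside POST, the JSON enum is another choice
        "red": "end",
        "amber": "end",
        "green": "end",
    },
}

def accepts(proto, path):
    """Check a client's sequence of choices against the offered protocol."""
    for label in path:
        if proto == "end" or label not in proto:
            return False
        proto = proto[label]
    return proto == "end"

assert accepts(protocol, ["GET"])
assert accepts(protocol, ["POST", "amber"])
assert not accepts(protocol, ["POST", "blue"])   # not an offered choice
assert not accepts(protocol, ["DELETE"])
```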

One problem with this is dealing with finite vs infinite choices. Session type theory tends to focus on finite choices, but infinite choices are usually dealt with by abstracting on top of the infinite choice. For example, a +1 service, where the service can take any number and add 1 to it. Still not sure how this will work yet.
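One hedged way to picture that abstraction (hypothetical encoding): instead of enumerating every integer as a branch label, the session type carries a payload type, and conformance is checked by membership in that type rather than by label lookup:

```python
# Hypothetical sketch: finite choices vs a choice abstracted over a type.
# A finite choice is a set of labels; an infinite choice is a predicate.
traffic_light = {"red", "amber", "green"}      # finite: 3 branches

def integer(v):                                # infinite: one typed branch
    return isinstance(v, int)

def conforms(choice, value):
    if isinstance(choice, set):
        return value in choice                 # label lookup
    return choice(value)                       # type-membership test

# The +1 service accepts any integer as a single typed branch, rather
# than an infinite enumeration of labelled branches.
assert conforms(integer, 41)
assert not conforms(integer, "forty-one")
assert conforms(traffic_light, "red")
```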

So to sum up, the challenge for purpose 1 is figuring out protocol layering in a declarative choice/offer sense. But the fundamental language doesn't really change between one protocol layer and another; they are all variants of offering and choosing, just that in some cases the instances of choices can become a combinatorial explosion.

For purpose 2, we need a different kind of language, one that specifies the underlying grammatical constructs that can later be compiled into packet-filtering logic. One way to implement this is through some sort of generic parser type that can be realised by various parser implementations. For example, a simple TCP parser is one that simply specifies the initial TCP header frame, while a JSON parser can be used within the HTTP frame. Some of the parsers can be compiled directly to BPF code, while others should be left to userspace code. An example of BPF code that parses HTTP: https://github.com/iovisor/bcc/tree/master/examples/networking/http_filter It shows C code that is compiled to BPF assembly and instrumented via Python (the actual HTTP contents are returned to Python userspace code). This means runtime performance monitoring will be performed at a sample rate and asynchronously.

To do this, the protocol spec has to have, for every layer's session type, a corresponding parser grammar that can then be compiled to a byte-level grammar. However, we want to take advantage of existing parsers. If we define a generic parser spec, we can write an implementation for certain protocols like TCP/HTTP that compiles to BPF, while other parsers for application-level protocols (JSON) may operate in userspace. The userspace in this case would have to be Haskell code: since the compiler is built in Haskell, we could expose Haskell code directly here, and other Haskell modules can take over. If needed, those modules could bind to C code and run that.

CMCDragonkai commented 6 years ago

Regarding subtyping of Protocol Specification.

The subtyping of session types seems to support polymorphism of Automatons: the idea that we can swap out an Automaton in a composition for a different Automaton, if that Automaton speaks the same protocol or a compatible one.

We need to define exactly what subtyping means, however, because nominal subtyping, structural subtyping and the alternative of row polymorphism each have slightly different behaviour. The paper on gradual session types http://homepages.inf.ed.ac.uk/wadler/papers/gradsess/gradsess.pdf appears to talk about covariance and contravariance (use Ctrl+F), so I would like to know whether the subtyping behaviour is based on existing subtyping theory or some new kind of subtyping methodology.

Subtyping vs Row Polymorphism: https://cs.stackexchange.com/questions/53998/what-are-the-major-differences-between-row-polymorphism-and-subtyping
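A hedged sketch of one standard flavour of session subtyping (width subtyping on offered branches; the direction shown is one convention from the channel-typing view, and whether Architect wants this or its reverse is exactly the open question):

```python
# Hypothetical sketch of width subtyping on offered branches.
# Under one convention (the channel-typing view), an endpoint offering a
# superset of branches can safely stand in where fewer are expected.
def offers_substitutable(new_offers: set, old_offers: set) -> bool:
    """Can `new_offers` replace an endpoint typed with `old_offers`?"""
    return old_offers <= new_offers

old_api = {"GET", "POST"}
new_api = {"GET", "POST", "DELETE"}

assert offers_substitutable(new_api, old_api)      # adding offers is safe
assert not offers_substitutable({"GET"}, old_api)  # dropping offers is not
```

Note that the direction flips for selections (choices made rather than offered), which is the covariance/contravariance distinction the gradual session types paper gets at.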

olligobber commented 6 years ago

The paper on gradual session types treats session types as the type of a channel. In Architect it makes more sense to think of session types as the type of the interface an automaton presents. This results in two opposing views of subtypes, since being able to substitute a channel is the opposite of being able to substitute the peer that is communicating. The code and documentation for the session types demo have been updated in 47795336d9689f1f83d6a759f99ed7c88cb32dd0 to match this.

CMCDragonkai commented 6 years ago

The ABCD group is a research group dedicated to the research of session types. http://groups.inf.ed.ac.uk/abcd/

They published a 2017 report on the current state of session types and the different projects that are currently in play: http://groups.inf.ed.ac.uk/abcd/meeting-december2017/abcd-report-2017.pdf

They have been funded until May 2019.

CMCDragonkai commented 6 years ago

Found a couple of links that relate to purpose 2. We might split this out into a separate issue to track.

The research group primarily interested in this purpose 2 appears to be http://langsec.org/

CMCDragonkai commented 6 years ago

I've split off the conversation about purpose 2 network performance behaviour to: https://github.com/MatrixAI/Architect/issues/16

olligobber commented 6 years ago

The session types demo has been updated to feature loops and concatenation of session types, thus increasing their expressiveness (we can now encode infinite protocols such as TCP and context-free protocols such as HTTP), but decreasing their uniqueness (protocols can now be encoded in multiple different ways, leading to false negatives when detecting compatibility).
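The uniqueness problem can be sketched like this (hypothetical encoding, not the demo's): once a loop constructor exists, the same infinite protocol admits syntactically distinct terms, so naive structural equality reports false negatives:

```python
# Hypothetical sketch: loops make protocol encodings non-unique.
# ("loop", body) repeats body forever; ("seq", a, b) concatenates a then b.
def unfold(proto):
    """Expand a loop one step: loop(b) == seq(b, loop(b))."""
    if proto[0] == "loop":
        return ("seq", proto[1], proto)
    return proto

ping = ("send", "ping")
p = ("loop", ping)                 # one encoding of "send ping forever"
q = ("seq", ping, ("loop", ping))  # another encoding of the same protocol

# Naive structural comparison says the two encodings differ...
assert p != q
# ...even though unfolding one step shows they agree.
assert unfold(p) == q
```

Deciding such equality up to unfolding is the equality problem referenced in issue #20.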

CMCDragonkai commented 6 years ago

The ability to do fixed point recursion "loops" requires solving the equality problem: https://github.com/MatrixAI/Architect/issues/20 and the ordering problem: https://github.com/MatrixAI/Architect/issues/19