ipfs / specs

Technical specifications for the IPFS protocol stack
https://specs.ipfs.tech

[WIP] RFC ###### - Protocol Driven Development #11

Closed daviddias closed 9 years ago

daviddias commented 9 years ago

Abstract

Introduction

Cross compatibility across several implementations and runtimes is historically a hard goal to achieve. Each framework/language offers different test suites and implements a different flavour of testing (BDD, TDD, RDD, etc). We need a better way to test compatibility across different implementations.

Instead of the common API tests, we can achieve cross implementation testing by leveraging interfaces offered through the network and defined by the Protocol. We call this Protocol Driven Development.

In order for a code artefact to be PDD compatible, it must:

  • Expose a connection (duplex stream) interface, which may be synchronous (online, interactive) or asynchronous.

The objectives for Protocol Driven Development are:

  • Well defined process to test Protocol Spec implementations
  • Standard definition of implementation requirements to comply with a certain protocol
  • Automate cross implementation tests
  • Have a general purpose proxy for packet/message capture

In order to achieve compliance, we have to follow four main steps:

1 - Define the desired Protocol Spec that is going to be implemented
2 - Design the compliance tests that prove that a certain implementation conforms with the spec
3 - Once an implementation is done, capture the messages exchanged on the wire using that implementation, so that the behaviour of both participants can be replicated without the agent
4 - Create the Protocol Compliance Tests (consisting of injecting the packets/messages generated in the last step into the other implementations and comparing outputs)

Protocol Spec

It should define the goals, motivation, messages exchanged between participants, and some use cases. It should not cover language or framework specifics.

Protocol Compliance Tests Spec

Defines the necessary “user stories” against which a Protocol implementation must be tested to assure it complies with the Protocol Spec. E.g.:

# Protocol that should always ACK messages of type A and not messages of type B
> A
< ACK
> B
> B
> B

A test passes if the messages transmitted by an implementation follow the expected format and order; it fails if the format or order is not respected, or if any extra message is transmitted that was not defined.
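A minimal sketch of this pass/fail rule (the helper name and message strings are illustrative, not from any spec):

```javascript
// Sketch: a test passes only when the captured transcript matches the expected
// script exactly -- same messages, same order, and no extra messages.
// "> " marks messages sent by the implementation, "< " replies it receives.
function checkTranscript(expected, actual) {
  if (actual.length !== expected.length) return false; // extra/missing message => fail
  return expected.every((line, i) => line === actual[i]);
}

// Expected exchange: type-A messages are ACKed, type-B messages are not.
const script = ['> A', '< ACK', '> B', '> B', '> B'];

const conforming = ['> A', '< ACK', '> B', '> B', '> B'];
const extraAck   = ['> A', '< ACK', '> B', '< ACK', '> B', '> B']; // ACKs a B
```

A conforming transcript matches the script; the one that ACKs a B message carries an undefined extra message and fails.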

Tests should be deterministic, so that different implementations produce the same results:

┌─────────┐     ┌─────────┐    ┌───────────────┐
│input.txt│──┬─▶│go-impl  │───▶│ output.go.txt │
└─────────┘  │  └─────────┘    └───────────────┘
             │  ┌─────────┐    ┌───────────────┐
             └─▶│node-impl├───▶│output.node.txt│
                └─────────┘    └───────────────┘

So that a diff between the two outputs should yield no differences:

$ diff output.go.txt output.node.txt
$

Interchange Packet/Message Capture

Since most of these protocols leverage some type of encoded format for messages, we have to replicate the transformations applied to those messages before they are sent. The other option is to capture the messages sent by one of the implementations, which should suffice for the majority of scenarios.

Protocol Compliance Tests Suite

These tests offer the last step needed to test different implementations independently. By sending the packets/messages, evaluating the responses, and comparing them across the different implementations, we can infer that they are in fact compatible.

Example use case - go-multistream and node-multistream tests

(experimenting, will report soon :) )

wking commented 9 years ago

On Tue, Jun 09, 2015 at 09:55:51AM -0700, David Dias wrote:

Protocol that should always ACK messages of type A and not messages of type B

> A
< ACK
> B
> B
> B

A test would pass if the messages transmitted by an implementation follow the expected format and order and it would fail if format and order are not respected…

I'd expect many of our protocols to be asynchronous, so “order” should just mean “order for this particular command thread”. For example,

> 1 A
> 2 B
> 3 B
< 1 ACK
> 4 B

should be a valid response. We'll want to ensure the protocol can handle this sort of thing to avoid needlessly serializing activity (this is the same round-trip latency problem outlined in 1, but here it's for “I'm ready to send another request even though there are requests in flight”).

Or are we only running a single command thread per connection in this test suite?
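This per-thread reading of “order” could be checked with something like the following sketch, assuming messages are tagged "<thread-id> <payload>" as in the example above (the parsing format is an assumption):

```javascript
// Sketch: in an asynchronous protocol, "order" only needs to hold within each
// command thread. Messages are assumed to be tagged "<thread-id> <payload>".
function groupByThread(messages) {
  const threads = new Map();
  for (const msg of messages) {
    const space = msg.indexOf(' ');
    const id = msg.slice(0, space);
    const payload = msg.slice(space + 1);
    if (!threads.has(id)) threads.set(id, []);
    threads.get(id).push(payload);
  }
  return threads;
}

// The interleaved exchange above: the ACK for thread 1 arrives after later sends,
// but within thread 1 the order A -> ACK is preserved, so the response is valid.
const wire = ['1 A', '2 B', '3 B', '1 ACK', '4 B'];
const threads = groupByThread(wire);
```

An order assertion would then run per thread rather than over the whole wire.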

jbenet commented 9 years ago

@jbenet's comments

  • Expose a network interface

Can we be more specific? How about:

  • Expose a connection (duplex stream) interface, which may be synchronous (online, interactive) or asynchronous.

"network" can mean connection, but can also mean more.

May not be the best way to place the "online" vs "offline" distinction, but should mention it somewhere.

(Btw, HTTP is an example of an asynchronous (or semi-asynchronous) protocol, because the REQ/RES pair can happen even if the network breaks. In practice, HTTP rides on TCP, and TCP is mostly an online protocol (because of the windowing), but if requests are small enough, HTTP/TCP behaves asynchronously, which is really good for all sorts of perf concerns.)

  • Well defined process to test Protocol Spec implementations
  • Standard definition of implementation requirements to comply with a certain protocol
  • Automate cross implementation tests
  • Have a general purpose proxy for packet/message capture

Beautiful :+1: !

order are not respected

Hm, this is tricky, because we may have protocols that have optional (SHOULD / MAY) things.

Maybe it makes sense to enable writing an implementation "distiller" (not the best name for this...) that produces the exact output we want to check

┌─────────┐     ┌─────────┐    ┌───────────────────┐    ┌───────────────┐
│input.txt│──┬─▶│go-impl  │───▶│ go-impl distiller │───▶│ output.go.txt │
└─────────┘  │  └─────────┘    └───────────────────┘    └───────────────┘
             │  ┌─────────┐    ┌─────────────────────┐  ┌───────────────┐
             └─▶│node-impl├───▶│ node-impl distiller │─▶│output.node.txt│
                └─────────┘    └─────────────────────┘  └───────────────┘

So the distiller could drop optional pieces, or know how to glean out the exact information.

This is --of course-- tricky too, because differences in how optional things are implemented may break other implementations. But this gives a bit of freedom that may be hard to get otherwise?
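A minimal sketch of such a distiller, assuming it just normalizes whitespace and drops messages the spec marks optional (the Heartbeat name is illustrative):

```javascript
// Sketch of a per-implementation "distiller": normalize a raw transcript by
// trimming whitespace and dropping optional (SHOULD/MAY) messages, so that
// diffing distilled outputs ignores legitimate implementation differences.
function distill(lines, optionalMessages = new Set(['Heartbeat'])) {
  return lines
    .map(line => line.trim())
    .filter(line => line.length > 0)
    .filter(line => !optionalMessages.has(line.replace(/^[<>]\s*/, '')));
}

// Two transcripts that differ only in an optional heartbeat and whitespace.
const goOutput   = ['> A', '< ACK', '> B'];
const nodeOutput = ['> A', '< Heartbeat', '< ACK', '> B  '];
```

After distilling, both transcripts compare equal, so the optional message cannot cause a spurious compliance failure.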

So that a diff between two results should yield 0 results

$ diff output.go.txt output.node.txt
$

Yes, beautiful!! :+1:

jbenet commented 9 years ago

@diasdavid liking this very much!

daviddias commented 9 years ago

Awesome, thank you for the rad input!

I'd expect many of our protocols to be asynchronous, so “order” should just mean “order for this particular command thread”. For example,

> 1 A
> 2 B
> 3 B
< 1 ACK
> 4 B

Good point! I believe that we can come up with a "Protocol Test Spec DSL" that covers these cases, such as:

  • Indentation to communicate a dependency (an ACK of A can only come after A is sent, e.g.)
  • [ ] for messages that might or might not appear (e.g. heartbeats should be passed on the wire from time to time; we know we should get some, but not how many or specifically when)
  • { } for messages that will arrive, we just can't be sure whether before or after the following messages described

So taking from that example we would have:

# Protocol that should always ACK messages of type A and not messages of type B
> A
  {< ACK}
> B-1
[< Heartbeat]
> B-2
> B-3

This means that A must be ACK'ed but the ACK doesn't have to be the very next message, Heartbeats are optional, and the B message stream is ordered and that order must be respected.
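One possible interpretation of this DSL as a matcher, sketched under the assumption that { } means "must arrive eventually" and [ ] means "may appear" (function name and semantics are my reading of the proposal, not a settled spec):

```javascript
// Hypothetical interpreter for the sketched Protocol Test Spec DSL:
//   plain lines must arrive in the given order,
//   {msg}  must arrive eventually, at any position from that point on,
//   [msg]  may or may not appear.
function matchSpec(spec, actual) {
  const required = [];        // messages that must arrive in order
  const floating = new Set(); // {…} messages: required, position-free
  const optional = new Set(); // […] messages: allowed, not required
  for (const line of spec) {
    const t = line.trim();
    if (t.startsWith('{') && t.endsWith('}')) floating.add(t.slice(1, -1).trim());
    else if (t.startsWith('[') && t.endsWith(']')) optional.add(t.slice(1, -1).trim());
    else required.push(t);
  }
  let i = 0;
  for (const msg of actual) {
    if (floating.has(msg)) { floating.delete(msg); continue; }
    if (optional.has(msg)) continue;
    if (i < required.length && required[i] === msg) { i += 1; continue; }
    return false; // unexpected or out-of-order message
  }
  return i === required.length && floating.size === 0;
}

// The spec from the example above.
const spec = ['> A', '{< ACK}', '> B-1', '[< Heartbeat]', '> B-2', '> B-3'];
```

Under this reading, a transcript with a delayed ACK passes, a transcript without a heartbeat passes, and a transcript that never ACKs A fails.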

Can we be more specific? How about:

Expose a connection (duplex stream) interface, which may be synchronous (online, interactive) or asynchronous.

Yes we can, just updated :)

Hm, this is tricky, because we may have protocols that have optional (SHOULD / MAY) things.

Indeed, I think we can accommodate this with the Protocol Test Spec DSL described above, so that once we have a "pristine implementation" (aka the first implementation to appear that we trust conforms with the spec), we: 1 - extract the outputs for the tests in place; 2 - take those outputs and use them as the comparison unit for other implementations; 3 - use "protocol asserts" that, instead of comparing both outputs in a rigid way, take what's defined in the Protocol Test Spec DSL and verify that the "flow" was respected.

Apologies if the above is confusing; I've just finished doing the Protocol Compliance tests for MultiStream and I am now writing it up as a "case study/example" to make how things work more evident.

wking commented 9 years ago

On Wed, Jun 10, 2015 at 09:43:39AM -0700, David Dias wrote:

I believe that we can come up with a "Protocol Test Spec DSL" that covers this case…

It would be awesome if we could find an already-written version of this DSL that already had some miles under its belt. I'm sure there are many things we won't think of in the first few passes. For example, @whyrusleeping was having latency-related issues with multiple block sends 1, and latency will be very important for things like optimistic transmission + cancel 2.

I'm not saying that designing a good protocol DSL isn't a worthy goal, I'm just saying that I expect it to be a hard problem, and worth a good amount of research in the hopes that someone else has already handled the heavy lifting.

  • Indentation to communicate a dependency (an ACK of A can only come after A is sent, e.g.)
  • [ ] for messages that might or might not appear (e.g. heartbeats should be passed on the wire from time to time; we know we should get some, but not how many or specifically when)
  • { } for messages that will arrive, we just can't be sure whether before or after the following messages described

So taking from that example we would have:

# Protocol that should always ACK messages of type A and not messages of type B
> A
  {< ACK}
> B-1
[< Heartbeat]
> B-2
> B-3

This means that A must be ACK'ed but the ACK doesn't have to be the very next message, Heartbeats are optional, and the B message stream is ordered and that order must be respected.

Can we only get heartbeats after B-1? I think it's better to make a more abstract ABNF-style definition:

protocol        = handshake parallel-stream [close]
handshake       = …
parallel-stream = request response [complete]
…

but you'd need to tweak things somewhat because a response and a cancel-request could be in-flight at the same time (in different directions).

1: we extract the outputs for the tests in place, 2: take those outputs and use them as the comparison unit for other implementations

Since I favor a more abstract protocol definition, I don't think starting from a seed transaction is going to help all that much. I'd probably start writing up the protocol definition while looking at any existing specs for the protocol and/or the implementing code for a reference implementation.

daviddias commented 9 years ago

@wking agree that if there is something out there that can be used, that would be great. Going to ping some folks around and do some research on that. However, we can start with the use cases we have and what feels right, and iterate over it (like its own Protocol?)

Here goes a brain dump from what I've been thinking, working and doing for MultiStream

PDD for MultiStream (example/use case)

Protocol

Defined here:

Protocol Compliance Tests Spec

Given the protocol spec, an implementation of the multistream-select protocol has to comply with the following scenarios:

1 - push-stream

In a push-stream example (one-way stream), we have two agents:

  • 'broadcast' - where messages are emitted
  • 'silent' - listener to the messages emitted by the broadcast counterparty

Compliance test 1 (human readable format):

# With a connection established between silent - broadcast
< /multistream/1.0.0       # multistream header
< /bird/3.2.1              # Protocol header
< hey, how is it going?    # First protocol message

2 - duplex-stream

In a duplex-stream example (interactive conversation), we have two agents:

  • 'select' - waiting for connections, hosts several protocols from which a client can pick
  • 'interactive' - connects to a select agent and queries that agent for a specific protocol

Compliance test 2 (human readable format):

# With a connection established between interactive - select
< /multistream/1.0.0
> /multistream/1.0.0
> ls
< ["/dogs/0.1.0","/cats/1.2.11"]
> /mouse/1.1.0
< na
> /dogs/0.1.0
< /dogs/0.1.0
> hey
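The select side's replies in this exchange can be sketched as a simple handler table (the function name and registration mechanism are assumptions for illustration, not taken from any actual multistream implementation):

```javascript
// Sketch of the "select" side: it lists registered protocols on "ls", echoes
// a protocol header back when that protocol is registered (selection
// succeeds), and answers "na" otherwise.
function makeSelectAgent(registered) {
  return function reply(message) {
    if (message === 'ls') return JSON.stringify(registered);
    if (registered.includes(message)) return message; // selected: echo header
    return 'na';                                      // unknown protocol
  };
}

// Registered handlers matching the transcript above.
const reply = makeSelectAgent(['/dogs/0.1.0', '/cats/1.2.11']);
```

Replaying the transcript: `ls` yields the protocol list, `/mouse/1.1.0` gets "na", and `/dogs/0.1.0` is echoed back to confirm selection.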

Wire out

Since this protocol is not fully plaintext, we have to capture the messages/packets that are transmitted by one of the agents to make sure we get the transformations right (and therefore do development driven by what is going on on the wire, which is defined by the Protocol (PDD ftw!))

With a first implementation written, we can capture the messages exchanged on the wire, so that later we can require other implementations to conform. For the multistream scenario, tests are organized as follows:

tests
├── comp                      # Where compliance tests live
│   ├── compliance-test.js
├── impl                      # Where specific implementation tests live, where we can get code coverage and all that good stuff
│   ├── interactive-test.js
│   └── one-way-test.js
└── spec                      # Spec tests are the tests where what is passed on the wire is captured, so it can be used in the compliance tests for all the implementations
    ├── capture.js
    ├── interactive-test.js
    ├── one-way-test.js
    └── pristine              # The pristine folder where those captures live
        ├── broadcast.in
        ├── broadcast.out     # A broadcast.out is the same as a silent.in, since there are only two agents in this exchange,
        ├── interactive.in    # the reason both files exist is to avoid mind bending when it is time to use the "in/out", it could get confusing
        ├── interactive.out
        ├── select.in
        ├── select.out
        ├── silent.in
        └── silent.out

Protocol Compliance Test Suite

The protocol compliance test suite for multistream-select can be found in tests/comp/compliance-test.js. Each agent is tested alone with the input we prepared for it in the previous step; once that agent has replied to all the messages, we compare (diff) the generated output with its "pristine" counterpart, expecting to get 0 differences.
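The diff step of that suite can be sketched as a plain line-by-line comparison, where an empty result means compliance (the helper name and sample strings are illustrative):

```javascript
// Sketch: compare an implementation's output with its pristine counterpart,
// line by line. An empty result means the implementation is compliant.
function complianceDiff(pristine, produced) {
  const a = pristine.split('\n');
  const b = produced.split('\n');
  const diffs = [];
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    if (a[i] !== b[i]) diffs.push({ line: i + 1, expected: a[i], got: b[i] });
  }
  return diffs;
}

// Sample "pristine" capture, echoing the push-stream example above.
const pristineOut = '/multistream/1.0.0\n/bird/3.2.1\nhey, how is it going?';
```

A compliant run diffs clean against the pristine capture; any deviation is reported with the line number and both values, much like `diff` itself.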

I think now would be a good time to write this up in a .md file so more people can proactively contribute with their suggestions?

wking commented 9 years ago

On Wed, Jun 10, 2015 at 03:38:57PM -0700, David Dias wrote:

However, we [can] start with the uses cases we have and what feels right and iterate over it…

Yeah, that makes sense.

1 - push-stream

In a push-stream example (one-way stream), we have two agents:

  • 'broadcast' - where messages are emitted
  • 'silent' - listener to the messages emitted by the broadcast counterparty

Compliance test 1 (human readable format):

# With a connection established between silent - broadcast
< /multistream/1.0.0       # multistream header
< /bird/3.2.1              # Protocol header
< hey, how is it going?    # First protocol message

Is this just for making sure the silent listener doesn't barf into the socket or crash when you send it these messages? Or is the broadcaster supposed to somehow know to write that protocol header and message?

Are push streams sent via broadcast-only protocols (e.g. UDP to a broadcast address and standard port) and picked up by silent listeners on that port? I'm also unclear about how the roles get assigned after e.g. a TCP connection is established. Is the connection-initiator automatically the broadcaster (e.g. in the UDP example, saying “hi everyone, here's a message”) or the silent agent (e.g. with TCP, saying “hi. I'd like to listen to anything you want to say”)? How do we encode these connection-order issues in the protocol-test DSL?

2 - duplex-stream

In a duplex-stream example (interactive conversation), we have two agents:

  • 'select' - waiting for connections, hosts several protocols from which a client can pick
  • 'interactive' - connects to a select agent and queries that agent for a specific protocol

Compliance test 2 (human readable format):

# With a connection established between interactive - select
< /multistream/1.0.0
> /multistream/1.0.0
> ls
< ["/dogs/0.1.0","/cats/1.2.11"]
> /mouse/1.1.0
< na
> /dogs/0.1.0
< /dogs/0.1.0
> hey

Hmm, how do we differentiate push-streams from duplex-streams? Or is the fact that a stream is intended to be duplex/multiplex communicated out-of-band?

How do you leave a protocol? And how does this multiplex? Or do we fork off a new stream with each protocol change?

With a connection established between interactive - select

< 0 /multistream/1.0.0

> 0 /multistream/1.0.0
> 0 ls
< 0 ["/dogs/0.1.0","/cats/1.2.11"]
> 1 /mouse/1.1.0
> 0 close
< 0 closed
> 2 /dogs/0.1.0
< 1 na
< 2 /dogs/0.1.0
> 2 hey
> 2 close
< 2 closed

… tests are organized by the following…

Looks good to me, and you could always use symlinks (or IPFS!) to avoid duplicating data with matching in/out pairs.

You probably also need a way to prime the test implementation. E.g. if you want to test bitswap peers, how do you configure the tested peer to think it has (or doesn't have) a given DHT key? So you probably also need the tester to say “I'd like to launch $IMPLEMENTATION with $SEED_DATA listening on (or connecting to) $PORT”, and then have per-tested-implementation hooks to make that happen (and inform the tester when the implementation is up and listening for the “listening on” case?).

I think now would be a good time to write this up in a .md file so more people can proactively contribute with their suggestions?

Seems reasonable to me.

daviddias commented 9 years ago

@wking your questions make a very valid point: since we had some side-channel conversations about the protocol, without describing its implementation in the Protocol Spec or pointing to a reference implementation, questions like "how do we know the role of each side" appear. A lot of that is answered here: https://github.com/ipfs/node-ipfs/issues/13#issuecomment-109802818 . One of my goals with PDD is to answer those questions as soon as possible when we design a protocol, since we will have to ask ourselves the questions that whoever implements it will ask.

Just to clarify, multistream is for multiplexing streams; it is not focused on the connection itself, so in order to reuse the connection for several protocols, we could do:

< /multistream-select/0.3.0
  > /multistream-select/0.3.0
  > /spdy/3.0.0
    < /spdy/3.0.0
    > "spdy-header-opening-a-stream-0"
      > /multistream-select/0.3.0
        < /multistream-select/0.3.0
        > "select another protocol, like /ipfs-bitswap/3.3.3"
    > "spdy-header-opening-a-stream-1"
      > /multistream-select/0.3.0
        < /multistream-select/0.3.0
        > "select another protocol, like /ipfs-dht/0.3.4"

You probably also need a way to prime the test implementation. E.g. if you want to test bitswap peers, how do you configure the tested peer to think it has (or doesn't have) a given DHT key? So you probably also need the tester to say “I'd like to launch $IMPLEMENTATION with $SEED_DATA listening on (or connecting to) $PORT”, and then have per-tested-impementation hooks to make that happen (and inform the tester when the implementation is up and listening for the “listening on” case?).

Indeed we do, I just hadn't thought of how to represent it in text. In the MultiStream scenario, in order to test it properly, we not only had to start the protocol but also register handlers so that features get tested. You can see in the example that a /dogs/0.1.0 handler was registered, so the request to connect to it was successful, while /mouse/1.1.0 wasn't, and therefore we tested the "na" scenario.

I've moved the writing of PDD to https://github.com/ipfs/specs/pull/12, so we can avoid the copy and paste for commenting and reviewing :)

jbenet commented 9 years ago

@diasdavid +1 to the indentation