w3c / trace-context

Trace Context
https://w3c.github.io/trace-context/

Compatibility and Interoperability Test Suite #87

Closed yurishkuro closed 5 years ago

yurishkuro commented 6 years ago

Objective

Have a standard test suite that can take two or more "Nodes" and execute a series of transactions that traverse each system, with the purpose of verifying that they exchange tracing headers in such a way that they are able to interoperate.

Benefits

By defining the tests and expectations described below, we essentially define functional requirements for what we expect to happen in different interop scenarios. This is something currently missing from the spec, and it is, in my opinion, what causes many roundabout discussions of implementations without clear requirements.

Details

Node

Represents a microservice instrumented with a certain tracing library / implementation. Comes packaged as a Docker container that internally runs the tracing backend (or a proxy) and a small app that:

  a. has a /transaction endpoint that executes the test case transaction
  b. has an /introspection endpoint used by the test suite driver to verify that the respective tracing backend has captured the trace
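
To make this concrete, here is a minimal sketch of such a Node's HTTP surface in Go. The two endpoint paths come from the proposal above; everything else (the Span shape, the in-memory store, port 8080) is an assumption for illustration:

package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// Span is a hypothetical minimal record: just enough information about a
// captured span for the test driver's verifications.
type Span struct {
	TraceID      string `json:"traceId"`
	SpanID       string `json:"spanId"`
	ParentSpanID string `json:"parentSpanId,omitempty"`
}

// capturedTraces stands in for the node's tracing backend (or a proxy to
// it): spans recorded during transactions, keyed by trace ID.
var capturedTraces = map[string][]Span{}

func main() {
	// Executes the test case transaction.
	http.HandleFunc("/transaction", func(w http.ResponseWriter, r *http.Request) {
		// The recursive callNext handling is sketched later in this thread.
	})

	// Lets the driver verify what the backend captured, e.g.
	// GET /introspection?traceID=xxx returns the spans of that trace.
	http.HandleFunc("/introspection", func(w http.ResponseWriter, r *http.Request) {
		traceID := r.URL.Query().Get("traceID")
		json.NewEncoder(w).Encode(capturedTraces[traceID])
	})

	log.Fatal(http.ListenAndServe(":8080", nil))
}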

Transactions

A transaction is described as a recursive set of instructions to call the next Node in the chain or to stop. For example, it might look like this:

{
  callNext: {
    host: "zipkin", // name of the Node container running the Zipkin app, reachable via this host name
    callNext: {
        host: "jaeger",
        callNext: null
    }
  }
}

Running this transaction would execute the chain zipkin -> jaeger. When a Node receives such a request, it looks for the nested callNext fragment and calls the next Node with that nested (smaller) payload. The last node receives an empty request, so it simply returns.

There can also be a convention that each Node's response contains the trace/span IDs it observed/generated, again as a recursive structure, e.g.

{
  traceId: "...",
  spanId: "...",
  next: {
    traceId: "...",
    spanId: "...",
    next: null
  }
}

This would allow the test driver to interrogate the introspection endpoint using those IDs.
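
Putting the two conventions together, a minimal Go sketch of a Node's /transaction handler could look like the following. The ID accessors and the port are assumptions, and the header propagation by the instrumented HTTP client is exactly the behavior under test:

package node

import (
	"bytes"
	"encoding/json"
	"net/http"
)

// Request mirrors the recursive instruction payload above.
type Request struct {
	Host     string   `json:"host,omitempty"`
	CallNext *Request `json:"callNext"`
}

// Response mirrors the recursive response convention above.
type Response struct {
	TraceID string    `json:"traceId"`
	SpanID  string    `json:"spanId"`
	Next    *Response `json:"next"`
}

// HandleTransaction reports this node's IDs, then forwards the nested
// (smaller) payload to the next Node, if any.
func HandleTransaction(w http.ResponseWriter, r *http.Request) {
	var req Request
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	resp := Response{
		TraceID: currentTraceID(r), // hypothetical hooks into the node's
		SpanID:  currentSpanID(r),  // tracing library; vendor-specific
	}

	if req.CallNext != nil {
		body, _ := json.Marshal(req.CallNext)
		// The instrumented HTTP client is expected to inject traceparent /
		// tracestate here. Port 8080 is an assumed convention for all Nodes.
		downstream, err := http.Post("http://"+req.CallNext.Host+":8080/transaction",
			"application/json", bytes.NewReader(body))
		if err == nil {
			defer downstream.Body.Close()
			var next Response
			json.NewDecoder(downstream.Body).Decode(&next)
			resp.Next = &next
		}
	}

	json.NewEncoder(w).Encode(resp)
}

// Hypothetical accessors; how a node obtains the current IDs depends on
// its tracing library.
func currentTraceID(r *http.Request) string { return "" }
func currentSpanID(r *http.Request) string  { return "" }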

Verifications

The test suite driver calls the /introspection endpoint of each Node to retrieve the captured trace(s) in some canonical form (just enough info for the test). If the /transaction responses contain trace/span IDs, it can also validate those directly.
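
For instance, in the simplest case where every node trusts the inbound trace ID, a driver-side check could be a sketch like this (reusing the recursive response shape from above):

package driver

import "fmt"

// Response is the recursive shape returned by /transaction
// (as sketched above for the Node side).
type Response struct {
	TraceID string    `json:"traceId"`
	SpanID  string    `json:"spanId"`
	Next    *Response `json:"next"`
}

// ValidateSameTrace walks the recursive response and checks that every
// node reported the same trace ID as the root. This only holds when all
// nodes trust the inbound trace ID (see participation modes below).
func ValidateSameTrace(root *Response) error {
	for n := root.Next; n != nil; n = n.Next {
		if n.TraceID != root.TraceID {
			return fmt.Errorf("broken trace: node reported %s, root had %s",
				n.TraceID, root.TraceID)
		}
	}
	return nil
}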

Test Suite

The test suite is defined as a list of scenarios, e.g.

Each scenario is instantiated multiple times (test cases) by labelling different vendors with roles from the scenario, e.g.

Each test case runs and validates a single transaction, and checks different modes of participation in the trace.

Parameterization

The test suite framework can also be used to test multiple implementations of the tracing library from a given vendor, e.g. in different languages. This can be implemented as either different Node containers (e.g. zipkin_java, zipkin_go) or a single container controlled by env variables.
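
For illustration, a hypothetical docker-compose fragment showing both options (all image names and variables here are made up):

services:
  # Option 1: a separate Node container per language.
  zipkin_java:
    image: example/trace-context-node-zipkin-java
  zipkin_go:
    image: example/trace-context-node-zipkin-go

  # Option 2: a single container controlled by env variables.
  jaeger:
    image: example/trace-context-node-jaeger
    environment:
      - NODE_LANGUAGE=go
      - PARTICIPATION_MODE=trust  # see participation modes below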

Participation Modes

The nodes can also support different trace participation modes, at minimum:

  1. respect and reuse incoming trace ID
  2. record incoming trace ID as correlation field, but don't trust it, start a new trace

If the test driver knows ahead of time which participation mode a given Node supports (these can again be parameters to the Node), it can validate the expected behavior.
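A sketch of how the driver could branch on a node's declared mode; the mode names and the correlation field are assumptions, not part of the spec:

package driver

import "fmt"

// Hypothetical participation modes a Node could declare to the driver.
const (
	ModeTrust    = "trust"    // respect and reuse the incoming trace ID
	ModeDistrust = "distrust" // record it as a correlation field, start a new trace
)

// ValidateParticipation checks a node's reported IDs against the mode it
// declared. correlationID is the assumed field where a distrustful node
// records the inbound trace ID.
func ValidateParticipation(mode, inboundTraceID, reportedTraceID, correlationID string) error {
	switch mode {
	case ModeTrust:
		if reportedTraceID != inboundTraceID {
			return fmt.Errorf("expected node to reuse trace ID %s, got %s",
				inboundTraceID, reportedTraceID)
		}
	case ModeDistrust:
		if reportedTraceID == inboundTraceID {
			return fmt.Errorf("expected node to start a new trace, but it reused %s",
				inboundTraceID)
		}
		if correlationID != inboundTraceID {
			return fmt.Errorf("expected inbound trace ID %s in correlation field, got %s",
				inboundTraceID, correlationID)
		}
	}
	return nil
}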

Prerequisites

Each vendor must be able to provide a Docker image (or several) to act as a Node in the test suite. Ideally the containers should be fully self-contained, i.e. not require external connectivity. It's possible to implement them as proxies to hosted tracing backends if necessary, but that will make the tests less reliable whenever those hosted backends are unavailable.

Isn't this crazy / impossible?

Jaeger internally uses an approach very similar to this one for many of its integration tests, in particular those that test compatibility of client libraries in different languages. Uber released a framework, https://github.com/crossdock/crossdock, that helps orchestrate these tests and their permutations using docker-compose.

SergeyKanzhelev commented 6 years ago

Do you have a suggestion for the format of /introspection?

yurishkuro commented 6 years ago

It depends on what validations we want to implement. For example, if all nodes participate by trusting the inbound trace ID, then we may not need to call /introspection at all, because the final response from the /transaction endpoint of the root node will contain the trace IDs observed by every node, and we can simply ensure they are all the same ID.

If we want more advanced validation, then I'd expect the driver to send requests like /introspection?traceID=xxx or /introspection?correlationTraceID=xxx (for distrustful nodes). The output is just a short summary of the trace with whatever is required for validation.

We'd need to brainstorm this part, I don't have a complete answer.

For example, one other type of validation we might want is that causal relationships to the parent span are captured correctly. The response to the first introspection request above would then have to include not only the trace/span ID but also the parent ID, so that the driver could validate the unbroken trace. However, this implies that all nodes would be able to answer such a question, i.e. all have the ability to record the parent span ID. I am not sure that's universally possible, but on the other hand it could be just an optional capability that the node exposes to the validator.
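
To make that concrete, a hypothetical shape for such an /introspection response (field names are made up; parentSpanId would be the optional capability):

{
  spans: [
    {
      traceId: "...",
      spanId: "...",
      parentSpanId: "...", // optional capability
      sampled: true
    }
  ]
}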

AloisReitbauer commented 6 years ago

I think we definitely need this and should work on a first version for the next workshop. I also agree that a lot of the theoretical discussions will go away once there is code that actually implements the spec.

I would first focus on testing key use cases handled by a single provider, i.e. forwarding a trace context correctly. If we can get this done, we will have made a significant step forward.

I'm not a big fan of the Docker image approach because it is hard to provide a full tracing system in a single Docker image. I would rather have vendors provide the endpoints and keep the implementation details to them. Obviously, a vendor could just forward requests from the container to their backend.

@yurishkuro would you be open to provide a first system to get our feet wet, so we can discuss next steps at the next workshop?

yurishkuro commented 6 years ago

The Docker image does not need to contain the complete tracing system. We are primarily testing interop of instrumentations, so the image can include the test app, the instrumentation, and a simple in-memory backend for storing traces. Because the tests are organized as multi-hop requests through Nodes, trying to orchestrate this with just external API endpoints is much harder. With Docker images everything is local, and docker-compose takes care of the network wiring.

Unfortunately, I don't have a lot of time to allocate to this implementation. Most of the code already exists in the form of Jaeger's cross-language integration tests, which can be repurposed for cross-vendor tests.

https://github.com/jaegertracing/jaeger-client-go/tree/master/crossdock

And here's an example of actually defining two types of tests ("behaviors" in crossdock parlance): https://github.com/jaegertracing/jaeger-client-java/blob/master/jaeger-crossdock/docker-compose.yml

yurishkuro commented 6 years ago

I am progressing with the test suite.

Q: the LICENSE file says:

Contributions to Test Suites are made under the W3C 3-clause BSD License

Is this going to be a problem for vendors who want to submit something to run in the official tests? Why can't we use the Apache 2 license? Created a separate issue, #94.

yurishkuro commented 6 years ago

This is the first cut of the compliance tests:

https://github.com/yurishkuro/distributed-tracing/tree/compliance-tests/tests

yurishkuro commented 6 years ago

@SergeyKanzhelev @AloisReitbauer any thoughts on the prototype?

SergeyKanzhelev commented 6 years ago

@yurishkuro looking at it today. Thank you for putting together something that works.

SergeyKanzhelev commented 6 years ago

Can you please add a "getting started" section with the commands to run?

yurishkuro commented 6 years ago

added to tests/readme

SergeyKanzhelev commented 6 years ago

A few more packages are needed. Not sure if there is a setting for the Go compiler that can auto-download those:

git clone https://github.com/crossdock/crossdock-go $GOPATH/src/github.com/crossdock/crossdock-go
git clone https://github.com/davecgh/go-spew.git $GOPATH/src/github.com/davecgh/go-spew
git clone https://github.com/golang/net.git $GOPATH/src/golang.org/x/net

yurishkuro commented 6 years ago

Ah, sure, I haven't set up a dependency manager yet. You can install these packages via go get. Let me try to add dep.

yurishkuro commented 6 years ago

Actually, I think you only need to go get github.com/crossdock/crossdock-go; it's the only dependency so far, and running dep didn't add much value. Updated the readme.

yurishkuro commented 6 years ago

Second iteration of the test suite: https://github.com/yurishkuro/distributed-tracing/tree/compliance-tests/tests

Main changes:

At this point it's possible to start adding more specific tests for behaviors, but the reference implementation is currently very naive and not compliant, e.g. it doesn't really check tracestate and instead fully relies on traceparent, so it needs to be improved (there are TODOs in the code).
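
For reference, this is the kind of traceparent validation a compliant node needs before trusting the header. A minimal regex-based sketch of the version-00 format (version "-" trace-id "-" parent-id "-" trace-flags, all lowercase hex), not a full parser:

package node

import "regexp"

// traceparentRe matches the version 00 wire format from the spec:
// 2 hex version, 32 hex trace-id, 16 hex parent-id, 2 hex flags.
var traceparentRe = regexp.MustCompile(
	`^[0-9a-f]{2}-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$`)

// ValidTraceparent checks the shape and rejects the all-zero trace-id
// and parent-id, which the spec forbids.
func ValidTraceparent(h string) bool {
	if !traceparentRe.MatchString(h) {
		return false
	}
	traceID, parentID := h[3:35], h[36:52]
	return traceID != "00000000000000000000000000000000" &&
		parentID != "0000000000000000"
}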

@adriancole it would be interesting to test with your Java implementation if you could build an image. We would need to re-implement the actor/ module in Java so that it can similarly be used as the main for the container, leaving only the api.Tracer pluggable.

Executing Matrix...

S [malformed_trace_context] refnode→ (actor=refnode driver=refnode) ⇒ not implemented
S [malformed_trace_context] refnode→ (actor=refnode1 driver=refnode) ⇒ not implemented
S [missing_trace_context] refnode→ (actor=refnode driver=refnode) ⇒ not implemented
S [missing_trace_context] refnode→ (actor=refnode1 driver=refnode) ⇒ not implemented
✓ [trace_context_diff_vendor] refnode→ (actor=refnode driver=refnode) (9/9 passed, 0/9 skipped)
   ├ ✓ ⇒ same trace ID
   ├ ✓ ⇒ spanID is not empty
   ├ ✓ ⇒ ParentSpanID equal root spanID
   ├ ✓ ⇒ span is sampled
   ├ ✓ ⇒ same downstream traceID
   ├ ✓ ⇒ downstream span is sampled
   ├ ✓ ⇒ modified tracestate
   ├ ✓ ⇒ non-empty vendor key
   └ ✓ ⇒ vendor key 'ref' in the first position
✓ [trace_context_diff_vendor] refnode→ (actor=refnode1 driver=refnode) (10/10 passed, 0/10 skipped)
   ├ ✓ ⇒ different trace ID
   ├ ✓ ⇒ trace ID is in correlationID
   ├ ✓ ⇒ spanID is not empty
   ├ ✓ ⇒ ParentSpanID equal root spanID
   ├ ✓ ⇒ span is sampled
   ├ ✓ ⇒ downstream traceID equal 1st actor's traceID
   ├ ✓ ⇒ downstream span is sampled
   ├ ✓ ⇒ modified tracestate
   ├ ✓ ⇒ non-empty vendor key
   └ ✓ ⇒ vendor key 'ref' in the first position
S [trace_context_same_vendor] refnode→ (actor=refnode driver=refnode) ⇒ not implemented
S [trace_context_same_vendor] refnode→ (actor=refnode1 driver=refnode) ⇒ not implemented

19/19 passed (6/25 skipped)

Tests passed!

codefromthecrypt commented 6 years ago

Sorry, meant to reply. In a crunch, but interested in this... worst case, I can look at it during the workshop.

SergeyKanzhelev commented 6 years ago

During the workshop we discussed that the approach with Docker containers may not work. The concerns are:

  1. In order to test cloud infrastructure, a cloud resource has to be created and managed. A Docker container will know about the endpoint, but will not be able to create and manage its own without proper secrets. We do not want to force implementors to create and maintain secrets for all clouds.
  2. If we need to extract spans for "some company present at the workshop", you'd need to know a secret that ideally should not be shared.

So the test harness should be something you can run locally (a container or an easily runnable app) to test private implementations and produce a report. And it should work against an endpoint, not necessarily manage the target Docker container.

One open discussion is whether we need live reports from vendors. "Live" compliance test results may be generated as CI on this repo or by vendors uploading results to some central place.

Ending this comment... moving on to the discussion of test cases, to validate that the HTTP endpoint approach will work and that we do not need multiple "chained" containers.

SergeyKanzhelev commented 6 years ago

Test suites are authored in the notes document for the workshop: https://docs.google.com/document/d/1Zh871qWTew8Rzhz6jhFW0nxeC1ax1kAovQW8RFF5bCA/edit#

One note: we will most probably implement them in Python, as it's the most platform- and vendor-independent language.

SergeyKanzhelev commented 5 years ago

I'm closing this as we have a test suite implemented. Let's open separate items for improving the existing test suite if need be.