This document is a draft proposal for Yet Another (Test) Report Format (YARF). It is intended to represent test results from a variety of test automation tools as well as manual tests.
Test reporting tools that process and display test results typically have several requirements of their own.
YARF is represented as a "stream" of `TestNode` objects that describe tests. While the stream of objects is a linear sequence of nodes, their `id` and `parentId` properties can be used to construct a hierarchy (tree) of test nodes.
The stream can be encoded as NDJSON, a JSON array, or an array of objects.
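As a non-normative illustration, decoding an NDJSON stream and rebuilding the tree from `id`/`parentId` might look like the sketch below. The reduced `TestNode` shape and the `TreeNode` helper type are assumptions for this example only; the full `TestNode` definition follows later in this document.

```typescript
// Sketch: decode an NDJSON stream of TestNode objects and rebuild the tree
// via id/parentId. Only the fields needed here are modelled.
type TestNode = { id: string; parentId?: string; name?: string }

type TreeNode = { node: TestNode; children: TreeNode[] }

function buildTree(ndjson: string): TreeNode | undefined {
  const byId = new Map<string, TreeNode>()
  let root: TreeNode | undefined
  for (const line of ndjson.split('\n')) {
    if (!line.trim()) continue
    const node: TestNode = JSON.parse(line)
    const treeNode: TreeNode = { node, children: [] }
    byId.set(node.id, treeNode)
    if (node.parentId === undefined) {
      root = treeNode
    } else {
      // Assumes parents always precede their children in the stream.
      byId.get(node.parentId)?.children.push(treeNode)
    }
  }
  return root
}

const stream = [
  '{"id":"a","name":"root"}',
  '{"id":"b","parentId":"a","name":"features"}',
  '{"id":"c","parentId":"b","name":"three.feature"}',
].join('\n')

const root = buildTree(stream)
console.log(root?.node.name) // root
console.log(root?.children[0].children[0].node.name) // three.feature
```

Whether parents must precede children in the stream is an ordering guarantee the spec would need to state explicitly.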
Below is the format of the `TestNode` objects (in TypeScript). This can be translated to a more portable format, such as JSON Schema, later.
/**
 * A test node is either a leaf in a tree, or a node containing other nodes.
 */
type TestNode = {
  // The ID of the test node. This must be unique within a test execution (message stream).
  // The ID *may* be different for a new execution of the same test.
  id: string,
  // The ID of the parent node. This is `undefined` for the root node.
  parentId?: string,
  // A string representing the type of the node. This is specific to the test automation tool.
  type: string,
  // A reference back to the source code where this node came from. The format of
  // this field is a URI. Examples:
  //   source-reference:file://example.file?line=12&column=13
  //   source-reference:java:com/example/Example#testMethod()
  // A separate sub-spec might be needed for these URIs - maybe there is already a spec
  // or de facto format we can use (like the URLs IDEs print in stack traces).
  sourceRef: string,
  // A stable ID that does not change between test executions. This field will be empty until we've built a tool
  // that can compute stable IDs for tests even when a test's name and line number change from one revision to another.
  entityId?: string,
  // The human-readable name of the test node. For file system nodes this is the file name. For classes
  // it is the class name, for methods it is the method name, etc.
  name: string,
  // The timestamp of when this node started executing. Only present for leaf nodes.
  // Formatted as an ISO 8601 string (ideally with millisecond precision).
  timestamp?: string,
  // How long it took to execute. Only present for leaf nodes.
  duration?: Duration,
  // The status of the test execution. Must be present for leaf nodes.
  // The field may also be specified by container nodes.
  // If the field is absent for a container node, consumers should compute it
  // as the aggregated result of the child nodes.
  // TODO: Document the algorithm for computing aggregate results.
  result?: Status,
  // Attachments collected during test execution.
  attachments: Attachment[],
  // Tags attached to the test node.
  tags: string[]
}
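The aggregation algorithm is still a TODO. As a non-normative sketch, one plausible approach is precedence-based aggregation, where any failed child fails the container. The `Status` union below is an assumption; the spec does not yet define it.

```typescript
// Non-normative sketch of result aggregation for container nodes.
// Status is assumed to be a string union; the real spec may define more
// values (e.g. 'pending', 'undefined') and a different precedence order.
type Status = 'passed' | 'skipped' | 'failed'

// Higher number wins when aggregating: one failed child fails the container.
const precedence: Record<Status, number> = { passed: 0, skipped: 1, failed: 2 }

function aggregateResult(childResults: Status[]): Status | undefined {
  let aggregate: Status | undefined
  for (const result of childResults) {
    if (aggregate === undefined || precedence[result] > precedence[aggregate]) {
      aggregate = result
    }
  }
  return aggregate
}

console.log(aggregateResult(['passed', 'failed', 'passed'])) // failed
console.log(aggregateResult(['passed', 'skipped'])) // skipped
```

The precedence ordering (whether `skipped` should outrank `passed`, for instance) is exactly the kind of detail the TODO would need to pin down.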
/**
 * An attachment represents an arbitrary piece of data attached to a test node. It can be used for images (screenshots), logs, etc.
 */
type Attachment = {
  // The body of the attachment.
  body: string,
  // The content encoding of the body.
  contentEncoding: AttachmentContentEncoding,
  // The IANA media type of the body or the data behind the URL.
  mediaType: string
}
enum AttachmentContentEncoding {
  // The content should be interpreted as-is (as text).
  IDENTITY = 'IDENTITY',
  // The content is binary, encoded with Base64.
  // https://datatracker.ietf.org/doc/html/rfc4648
  BASE64 = 'BASE64',
}

type Timestamp = {
  seconds: number,
  nanos: number
}

type Duration = {
  seconds: number,
  nanos: number
}
Each testing tool will have its own tool-specific format for representing tests and test results. The examples below illustrate how these representations could be mapped to `TestNode` objects.
. -> { id: 'a', parentId: undefined, type: 'directory' }
├── features -> { id: 'b', parentId: 'a', type: 'directory', name: 'features' }
│   └── bar -> { id: 'c', parentId: 'b', type: 'directory', name: 'bar' }
│       └── three.feature -> { id: 'd', parentId: 'c', type: 'file', name: 'three.feature' }
├── lib -> { id: 'e', parentId: 'a', type: 'directory', name: 'lib' }
│   └── spec -> { id: 'f', parentId: 'e', type: 'directory', name: 'spec' }
│       └── my_spec.rb -> { id: 'g', parentId: 'f', type: 'file', name: 'my_spec.rb' }
└── src -> { id: 'h', parentId: 'a', type: 'directory', name: 'src' }
    └── test -> { id: 'i', parentId: 'h', type: 'directory', name: 'test' }
        └── java -> { id: 'j', parentId: 'i', type: 'directory', name: 'java' }
            └── com -> { id: 'k', parentId: 'j', type: 'directory', name: 'com' }
                └── bigcorp -> { id: 'l', parentId: 'k', type: 'directory', name: 'bigcorp' }
                    └── MyTest.java -> { id: 'm', parentId: 'l', type: 'file', name: 'MyTest.java' }
# features/bar/three.feature
Feature: Three # -> { id: 'n', parentId: 'd', type: 'gherkin-feature', name: 'Feature: Three' }
  Scenario: Blah # -> { id: 'o', parentId: 'n', type: 'gherkin-scenario', name: 'Scenario: Blah' }
    Given blah # -> { id: 'p', parentId: 'o', type: 'gherkin-step', name: 'Given blah' }
  Scenario Outline: Blah # -> { id: 'q', parentId: 'n', type: 'gherkin-scenario-outline' }
    Given <snip>
    When <snap>
    Examples:
      | snip | snap |
      | red  | green | # -> { id: 'r', parentId: 'q', type: 'gherkin-step', result: 'passed' }
      # -> { id: 's', parentId: 'q', type: 'gherkin-step', result: 'failed' }
  Rule: Blah # -> { id: 't', parentId: 'n', type: 'gherkin-rule' }
    Scenario: Blah # -> { id: 'u', parentId: 't', type: 'gherkin-scenario' }
// src/test/java/com/bigcorp/MyTest.java
class MyTest { // -> { id: 'v', parentId: 'm', name: 'com.bigcorp.MyTest' }
  @Test
  public void something() { // -> { id: 'w', parentId: 'v', result: 'passed', name: 'something()' }
  }
}

# lib/spec/my_spec.rb
describe 'something' do # -> { id: 'x', parentId: 'g', type: 'rspec-context' }
  it 'something else' do # -> { id: 'y', parentId: 'x', type: 'rspec-example' }
  end
end
Consumers of a `TestNode` stream should process the objects according to their needs. Below is an example of an imaginary application, JoyReports, that is only interested in rendering test results from Cucumber Scenarios:
const joyReportsScenarioById = new Map<string, JoyReportsScenario>()
for (const testNode of testNodes) {
  if (testNode.type === 'gherkin-scenario' || testNode.type === 'gherkin-scenario-outline') {
    joyReportsScenarioById.set(testNode.id, new JoyReportsScenario(testNode))
  }
  if (testNode.type === 'gherkin-step') {
    const joyReportsScenario = testNode.parentId ? joyReportsScenarioById.get(testNode.parentId) : undefined
    joyReportsScenario?.addJoyReportsStep(new JoyReportsStep(testNode))
  }
}
To support the YARF specification, each testing tool will need a corresponding mapping tool that translates the native test result format into YARF.
Cucumber's native test result format is Cucumber Messages. The Cucumber Open project would build a separate library (probably in TypeScript) that can read a stream of Cucumber Messages and output a stream of `TestNode` objects. This library could then be used in existing tools:
There is more than one way to map a Gherkin Document to a YARF stream, especially for `Background` and `Scenario Outline`. We could experiment with different mapping strategies. The nice thing is that the output format is the same; only the shape of the tree will differ. This means consumers (such as React components) would not need to be adapted to different mappers.
While the YARF format can be used to build a UI to render tests, it doesn't contain all the information needed for a high-fidelity display of the source of the test. For example, the YARF format does not include Cucumber's `DataTable` or `DocString`. To render a test with high fidelity, the source representation of the test must be used.
To use Cucumber as a further example, Gherkin provides a `GherkinDocument` data structure that represents the full abstract syntax tree (AST) for a `.feature` file.
A `<GherkinDocument />` React component could be used to render the whole document, and a `<Scenario />` component could be used to render just a `Scenario` deeper in that AST. When we render a `<Step />`, we want to display the `result`, which isn't present in the AST, but is present in one of the YARF `TestNode` objects.
This is where the `TestNode#sourceRef` property comes in handy. A `Map<string, TestNode>` can be built by iterating over the `TestNode` objects, and the `<Step />` component could use its `Step#id` to look up the corresponding `TestNode` in that Map.
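A sketch of building that lookup, keyed by `sourceRef` (the exact URI format is still an open question, per the sub-spec note above; `indexBySourceRef` is a hypothetical helper name):

```typescript
type TestNode = { id: string; sourceRef?: string; result?: string }

// Index TestNodes by their sourceRef so a rendering component can find the
// execution result for the piece of source it is displaying.
function indexBySourceRef(testNodes: TestNode[]): Map<string, TestNode> {
  const bySourceRef = new Map<string, TestNode>()
  for (const node of testNodes) {
    if (node.sourceRef !== undefined) {
      bySourceRef.set(node.sourceRef, node)
    }
  }
  return bySourceRef
}

const nodes: TestNode[] = [
  { id: 'p', sourceRef: 'source-reference:file://example.feature?line=3', result: 'passed' },
]
const bySourceRef = indexBySourceRef(nodes)
console.log(bySourceRef.get('source-reference:file://example.feature?line=3')?.result) // passed
```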
Some testing tools (like JUnit) are able to produce the same `TestNode#id` for several executions. Some testing tools (like Cucumber) do not have this ability.
Since the `TestNode#id` field may be different for each test execution, it should only be used to build a hierarchy of tests, and not for tracking the test across multiple executions.
Some systems may want to report on the "flakiness" of a test over time, across multiple revisions of that test. For example, a test named `my_little_test` on line 48 in revision r10 may have moved to line 56 in r11, and in r12 it may have been renamed to `my_medium_test`.
Still, it could be considered to be the same "conceptual" test. This is where the `TestNode#entityId` comes in. If this ID is stable from test execution to test execution, across revisions, then flakiness and other trend metrics could be computed.
Computing a stable `entityId` is possible, using techniques such as lhdiff. This is something we may want to add support for in a separate tool at some stage. The field can be left empty for now, but if such a tool becomes available, it can be used to populate this field. Alternatively, mappers could assign a value to this field using other, less reliable algorithms.
This document is based on the following discussions and documents:
We should reach out to open source communities, as well as people in SmartBear, who might be interested in shaping the spec.