cc @nodejs/test_runner @cjihrig
This could be improved by parsing the standard output of the child processes that run each test file. There is a TODO in the code regarding implementing a TAP parser. IMO that is the best approach, but it also requires the most work. If someone really wanted to, they could implement more lightweight parsing that, for example, only parses the ending summary lines of each child process.
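(For illustration only, a minimal sketch of the lightweight option: scan each child process's stdout for the trailing `# <key> <value>` summary lines. The exact summary format is an assumption based on the runner's current output, and `parseSummary` is a hypothetical helper.)

```js
// Sketch: extract counts from summary lines such as "# tests 1" / "# pass 1".
function parseSummary(stdout) {
  const summary = {};
  for (const line of stdout.split('\n')) {
    const match = /^# (tests|pass|fail|skipped|todo) (\d+)$/.exec(line);
    if (match) summary[match[1]] = Number(match[2]);
  }
  return summary; // e.g. { tests: 1, pass: 1, fail: 0 }
}
```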
> This could be improved by parsing the standard output of the child processes that run each test file. There is a TODO in the code regarding implementing a TAP parser.
@cjihrig happy to work on this. Could you point me to the TODO comment?
Thank you @aduh95. I'm gonna work on this 👍
I took a little look at this issue, and I was wondering whether this could be approached by adding support for additional reporters other than the current `TapStream`. This would allow adding a plain JSON reporter whose output is already very easy to serialize/deserialize. Another reporter could possibly use `worker_threads`' `postMessage` to share data between the parent test and its children. This approach could also enable passing custom reporters that are implemented in userland as an option to the root test runner (assuming we define a common interface for a reporter). WDYT?
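(For illustration, a minimal sketch of what such a common reporter interface could look like; the class and method names here are assumptions, not a settled design.)

```js
// Hypothetical reporter interface: the runner would call these hooks
// instead of writing TAP directly.
export class JSONReporter {
  start(plan) {
    // Called once with the test plan, e.g. { count: 3 }.
  }

  result(test) {
    // Called once per test; a plain object is trivial to serialize/deserialize.
    console.log(JSON.stringify(test));
  }

  end(summary) {
    console.log(JSON.stringify({ summary }));
  }
}
```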
I agree with @MoLow!
> this approach can also enable passing custom reporters that are implemented in userland as an option to the root test runner (assuming we define a common interface for a reporter)
I am in favor of this as well.
@cjihrig I'd love to hear your feedback regarding the reporter's approach.
@manekinekko would you like to implement some kind of `parentPort.postMessage` reporter?
> I took a little look at this issue, and I was wondering if this can be approached by adding support for additional reporters other than the current TapStream

> @cjihrig I'd love to hear your feedback regarding the reporter's approach.
In my head I always pictured the test runner outputting TAP via a stream and other reporters being implemented as transform streams (this part kind of depends on having the TAP parser in place).
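(For illustration only: a reporter implemented as a transform stream might look roughly like the minimal sketch below. The object-mode event shape consumed here is an assumption, not an existing API.)

```js
import { Transform } from 'node:stream';

// Sketch of a "dot" reporter. Assumes an upstream TAP-parser stage emits
// parsed test events like { type: 'test', ok: true, name: '...' }.
class DotReporter extends Transform {
  constructor() {
    super({ writableObjectMode: true });
  }

  _transform(event, encoding, callback) {
    // One character per completed test: "." for pass, "F" for failure.
    if (event.type === 'test') {
      this.push(event.ok ? '.' : 'F');
    }
    callback();
  }
}

// Hypothetical wiring: tapEventStream.pipe(new DotReporter()).pipe(process.stdout);
```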
> would you like to implement some kind of `parentPort.postMessage` reporter?
@MoLow I am happy to give it a shot. Could you provide a high-level design so I have a bigger picture of the different pieces?
> In my head I always pictured the test runner outputting TAP via a stream and other reporters being implemented as transform streams (this part kind of depends on having the TAP parser in place).
This would totally be doable once the TAP parser is done. The current AST has all the info to run it through a codegen and spit out virtually any format.
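(As a rough sketch of that idea, with an entirely hypothetical AST shape, the codegen step could be as small as:)

```js
// Hypothetical AST shape; the actual parser's AST will differ in detail.
function astToJSON(ast) {
  return JSON.stringify({
    plan: ast.plan,
    tests: ast.testPoints.map((t) => ({
      number: t.number,
      ok: t.ok,
      name: t.name,
    })),
  });
}
```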
> In my head I always pictured the test runner outputting TAP via a stream and other reporters being implemented as transform streams (this part kind of depends on having the TAP parser in place).
I totally understand why implementing reporters as a transform stream makes sense; my two arguments don't necessarily contradict that:

1. I assume that it can be very hard to write a TAP parser that beats `JSON.parse`, both in terms of performance and of usability. IMHO it makes more sense that the base stream outputs JSON objects and then to use a transform stream to output TAP (in case one wants TAP output and not something else).
2. `--test` might leverage the ability to share memory between workers, which might be a little more efficient than serializing -> parsing -> serializing again.

> `--test` might leverage the ability to share memory between workers

I am not sure about this one, as multiple child processes can leverage multiple CPUs, and this won't allow that.

Test runners in the ecosystem don't output JSON by default - ever, afaik. A number of them output TAP by default, though. TAP is the best choice.

> I assume that it can be very hard to write a TAP parser that beats `JSON.parse`
I also think that we need to think in terms of streaming output, which `JSON.parse()` would not handle. Streaming JSON parsing is possible, but I still think TAP is the right choice for the default output format.
ok, I've got your point :)
I've been experimenting with the new `node:test` API and I had the same thoughts as @MoLow re: JSON. Going with TAP as the native format seems to introduce unnecessary friction for consuming tools, and it sounds like it does internally within `node:test` as well.
In order to make sense of the results of a file's test run, you need a TAP parser. In order to implement a proper TAP parser, you need a YAML parser. And parsing YAML correctly is very non-trivial.
In implementing a TAP parser myself, I've shied away from full YAML parsing and decided to implement a parser for just the subset of YAML that `jsToYaml` can actually produce at the moment. But I can't help but feel like something is off here. YAML is a complex format that's rather foreign in the Node/JS context. It seems Node core would need a full YAML parser/encoder sub-library in core to make this future-proof, and that seems like quite a lift.
> I also think that we need to think in terms of streaming output, which `JSON.parse()` would not handle. Streaming JSON parsing is possible, but I still think TAP is the right choice for the default output format.
Sure, streaming JSON, but is that really the problem space here? I would imagine ndjson could be used, with newlines delimiting results.
For a stream emitting something like:
```
{ "testPlan": { "start": 1, "through": 2 } }
{ "testNumber": 1, "test": "ok", "durationMs": 50.12312 }
{ "testNumber": 2, "test": "not ok", "durationMs": 50.12312, "diagnostic": "TODO # reason", "failureType": "testCodeFailure" }
```
could be parsed pretty trivially with built-in functions (no 10k+ LOC YAML parsing lib dependency) as a stream:
```ts
// `run()` is the hypothetical API producing a stream of ndjson test results.
const testNdJSONStream = run();

export async function* readableStreamLines(
  readableStream: NodeJS.ReadableStream
): AsyncGenerator<string> {
  let pending = '';
  for await (const tapBytes of readableStream) {
    if (!Buffer.isBuffer(tapBytes)) {
      throw new Error(`Didn't receive bytes from TAPStream as expected`);
    }
    // Buffer text across chunk boundaries and yield one result per line.
    pending += tapBytes.toString('utf-8');
    const lines = pending.split('\n');
    pending = lines.pop() ?? '';
    yield* lines.filter((line) => line !== '');
  }
  if (pending !== '') yield pending;
}

for await (const resultLine of readableStreamLines(testNdJSONStream)) {
  const testResult = JSON.parse(resultLine);
}
```
Another idea I had when considering all of this: Node could potentially respect the TAP 13/14 spec while side-stepping the YAML parsing problem. The TAP 13 spec basically just specifies that the contents between `---` and `...` are valid YAML, and not much else:
> Currently (2007/03/17) the format of the data structure represented by a YAML block has not been standardized.
What if `node:test` serialized the contents of this block to a subset of flow-style YAML instead of the subset of block-style YAML that is presently being used? This subset of flow style could also be valid JSON, meaning that `node:test` would be outputting valid TAP 13 per the spec while still being easily parseable from JS without a YAML lib.
Something like:
```
TAP version 13
# Subtest: /foo/bar/combinators.test.js
not ok 1 - Static HTML JSX Rendering
  ---
  { "duration_ms": 4.15925, "failureType": "subtestsFailed", "error": "1 subtest failed", "code": "ERR_TEST_FAILURE" }
  ...
1..1
```
This is valid TAP 13, the embedded document is valid YAML, and it is far friendlier to parse from Node.js.
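(To illustrate how friendly that is to consume, a minimal sketch: everything between the `---` and `...` markers can be handed straight to `JSON.parse`. The `parseDiagnostic` helper is hypothetical.)

```js
const tapOutput = [
  'not ok 1 - Static HTML JSX Rendering',
  '  ---',
  '  { "duration_ms": 4.15925, "failureType": "subtestsFailed" }',
  '  ...',
].join('\n');

// Extract the flow-style YAML (also valid JSON) diagnostic block.
function parseDiagnostic(lines) {
  const start = lines.indexOf('  ---');
  const end = lines.indexOf('  ...');
  if (start === -1 || end === -1) return null;
  return JSON.parse(lines.slice(start + 1, end).join('\n'));
}

console.log(parseDiagnostic(tapOutput.split('\n')));
// => { duration_ms: 4.15925, failureType: 'subtestsFailed' }
```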
completed via https://github.com/nodejs/node/pull/43525
Version
v18.3.0
Platform
Darwin willmunn-2 20.6.0 Darwin Kernel Version 20.6.0: Tue Feb 22 21:10:41 PST 2022; root:xnu-7195.141.26~1/RELEASE_X86_64 x86_64
Subsystem
test_runner
What steps will reproduce the bug?
As a long-term tape user, I wanted to try the new test runner; this was pretty exciting, as the APIs are very similar. It seems the test summary at the end of the TAP output is reporting test counts as the number of test files. I had a go at a simple fizz buzz implementation:
```js
import test from 'node:test';
import assert from 'assert';
```
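(The snippet above is truncated in this report; a minimal test file in the same spirit, with a hypothetical `fizzBuzz` implementation, might look like:)

```js
import test from 'node:test';
import assert from 'assert';

// Hypothetical implementation under test; the original file was not captured.
function fizzBuzz(n) {
  if (n % 15 === 0) return 'fizzbuzz';
  if (n % 3 === 0) return 'fizz';
  if (n % 5 === 0) return 'buzz';
  return String(n);
}

test('returns fizz for multiples of 3', () => {
  assert.strictEqual(fizzBuzz(3), 'fizz');
});

test('returns buzz for multiples of 5', () => {
  assert.strictEqual(fizzBuzz(5), 'buzz');
});
```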
Running `node --test` returns output suggesting that only 1 test has been run. After adding another test file, I noticed that the output is actually counting the number of files, not the number of `test()` blocks or the number of assertions. If I add a `test.skip` to one of the tests, it will still report that there are 0 skipped tests.

How often does it reproduce? Is there a required condition?
No response
What is the expected behavior?
I personally think the best solution would be for the TAP output to report the number of `test()` blocks rather than the number of test files. So for my example:

What do you see instead?
Additional information
Note that my suggestion is actually different from what the tape module does, which reports counts based on the number of assertions. I personally feel that the number of `test()` blocks makes the most sense.