RFC: Spinal Tap (alternative proposal)

Krinkle commented 3 years ago

Hi everyone,

Don't worry if you haven't kept up with recent developments here or don't remember how js-reporters works. I'm proposing a new direction, and everything you need to know is right here. What follows is a progress update on how everyone is doing, and what we can do next.

What is this?

It's a proposal for adopting TAP, a text format for reporting progress and results of software tests. Unlike the current iteration of js-reporters, TAP is a mature and already widely-adopted protocol.

Avoid this:	Do this:

Remind me, why?

For test runners and browser orchestrators :

Significantly reduce maintenance cost and version churn for supporting frameworks (qunit, mocha, jasmine, tape, etc.).
Not just lower, but remove the the barrier for supporting new frameworks (or major upgrades to existing ones). Zero code required, given you'd not need anything besides reading whatever TAP text from the browser or CLI console output. Frameworks either provide this by default, or let a plugin do it (a plugin to their framework, agnostic of your runner or orchestrator).
Exposure to a large and growing ecosystem of TAP-based formatters and artefact generators. See npmjs.org Search and tape Reporters for just a few of these. E.g. pretty formatters from or for Mocha, tape, ESLint, WebdriverIO, could then be plugged in as a Karma reporter – and vice versa! Support for current and future continuous integration and other cloud service would likely come for free (e.g. JUnit formatter for Jenkins, TeamCity formatter, etc.).
Improve longevity and impact of your community by making plugins much more widely usable.

For test frameworks:

Allow other reporters and exporters to be plugged into your framework, directly with your CLI and/or new HTML themes and designs for in the browser.
Automatic support in existing and future test runners and orchestrators like Karma and BrowserStack (without them, you, or someone else needing to build and maintain support specifically for your framework).
Improve longevity and impact of your community by making plugins much more widely usable.

Are we there yet?

Update on test frameworks (/cc @js-reporters/intern, @js-reporters/jasmine, @js-reporters/mocha, @js-reporters/qunit, @js-reporters/tape, @js-reporters/vows).

By now, all the frameworks I checked provide TAP streams, or have seemingly good plugins for it. So, on the framework side, we're mostly there already.

QUnit: Its CLI uses TAP by default, and supports --reporter for other reporters that accept an EventEmitter with QUnit-specific and CRI (working draft) events. The same reporter API is available in the browser, although its TAP reporter is currently missing in the browser build, so perhaps it could provide it there as well going forward.
Mocha: Its CLI bundles a TAP reporter, and also supports --reporter for other reporters that accept an EventEmitter with Mocha-specific events. In the browser, the TAP reporter is available as well, printed to console.log.
Jasmine: Its CLI takes --reporter for reporters that accepts its EventEmitter. In the browser, jasmine.addReporter() provides that same API. A plug-in is available for TAP output, supported and tested both in Node.js and the browser, printed to console.log.
tape: Its CLI and browser use TAP by default (naturally), printed to console.log. It offers the createStream() method for custom reporters, both from a web page and from Node process. Its ecosystem generally embraces the Linux philosophy of pipes where each reporter is its own process, with its own TAP parser, reading from stdin. These reporters are already highly portable, and e.g. usable atop output from QUnit and Mocha.
node-tap: Its Node API uses TAP by default (naturally), printed to process.stdout. I found some discussion and past change logs indicating that it worked via browserify, but I'm unsure of its current status.

And update on test runners and orchestrators (/cc @js-reporters/integration, @js-reporters/node-tap, @js-reporters/intern):

Karma runner: Its internal protocol requires a framework adapter (of which there are many). Someone did build a generic karma-tap adapter, however. In theory that should allow one to e.g. use Karma without a QUnit or Mocha plugin, and let karma-tap handle it generically.
Intern: Its rich plugin interface offers multiple test and assertion interfaces and reporters. While one could use this to implement adapters to other testing frameworks, I understand it mainly recommends its own interface. I was not able to find a TAP interface as built-in interface or as plugin. There is some past interest (ref https://github.com/theintern/intern/issues/39, https://github.com/theintern/intern/issues/689).
node-tap: While also known as a framework, its CLI is a generic test runner for Node.js. So long as each test file is executable that e.g. require()'s a framework and outputs TAP, this CLI offers parallel testing, code coverage, filtering, etc. It aggregates TAP streams and supports --reporter .
browserstack-runner: This uses the js-reporters package to adapt framework APIs (QUnit, Mocha, Jasmine) to the CRI EventEmitter (working draft). It currently does not support TAP.

What about the old CRI draft?

See: Common Reporter Interface (Working Draft).

Having worked now on almost all layers of the stack (unit and integration framework, orchestrators, browser launchers, CLI runners, cloud services, reporting plugins, etc), I think our previous programmatic and EventEmitter-based standardisation is problematic in a number of ways:

Hard to inject in browsers. I think there's great value in having isomorphic JavaScript libraries actually test their code on both Node.js and in multiple real browsers. This is now easier than ever with Firefox and Chromium having built-in headless modes and DevTools/RemoteDebug protocols [1], [2], and beyond that through WebDriver (for BrowserStack and SauceLabs). However, sharing your test suite between Node, an automated browser, and manual debugging can be though when orchestrators need to inject their client into the web page. There are generally one of three approaches I see:

The orchestrator tries to inject it for you by proxying and rewriting your HTML (e.g. browserstack-runner injects code before </body> and assume tests will not start before window.onload).
The orchestrator requires that you don't load or start the framework yourself, and instead use an alternate version of the framework that includes the client and controls test start (e.g. Karma adapters tend to work this way, where they maintain a patched version of the framework, and require that your HTML either doesn't load the framework or that you maintain a separate copy of the list of sources and tests for it and let Karma build the HTML for you; example).
The orchestrator requires you to insert it in your HTML in the right place (e.g. TestSwarm still works this way, example)

Limited to JavaScript. JavaScript is big, but TAP is no doubt bigger in scope. I don't particularly foresee mass adoption of by JS developers to pipe results into reporters written using other runtimes, but then again we do have esbuild and rslint seeing great attention. While JS developers may be less likely to want to "figure out" installing the right version of Python or PHP and interact with dependencies in those ecosystems etc., there is a lot of room for Go and Rust binaries and APT/Homebrew package managers distributing tools that "just work". Once you open this door, there is also room for passive adoption in other environments, such as a native IDE on your platform that can run tests for you in an integrated shell, and aggregate or do other useful things with TAP output if it detects it.

Extended metadata. There are three capabilities that appear to be widely used in testing frameworks, and were part of the CRI working draft, that are currently not natively recognised in TAP 13.

Child-parent relationships for tests (ref https://github.com/TestAnything/testanything.github.io/pull/36). Our own TAP reporter in js-reporters simply flattened these into the test name with a delimiter, e.g. "Parent > sub test". Other frameworks like node-tap use indentation and comments to communicate their structure in backwards-compatible fashions. We don't need to agree here how to handle this as this is backwards-compatible and imho non-essential.
Test runtime duration. The CRI spec and adapters implement this. Our TAP reporter currently ignores it. There's numerous ways that other reporters place this in a comment after the test name, which seems fine. (ref https://github.com/TestAnything/Specification/issues/16).
Test start signal. The CRI spec and adapters implement this. Our TAP reporter currently ignores it. I've not seen a common way to do this in TAP. I suppose a # START <NR. test name> comment could certainly work. This is mainly used today in real-time HTML and IDE consumers to show which test(s) are currently executing, which can help improve the debugging experience as well as to get a feel of where much time is spent or where an otherwise complex failure may have occurred.

FAQ

Isn't console.log fragile?

I've long held that TAP was insufficient for our needs, that we need a standardised way for reporters and orchestrators to discover and access the event stream in-process (especially in browsers). It felt wrong to claim console.log completely for ourselves, or to have to monkey-patch it when you need to consume it client-side in a browser. Over the past year, I've realised (finally) that I think this lets perfect be the enemy of good. "Just" consuming the console output from a browser massively simplifies the problem we are trying to solve, and at least I'm convinced it's worth the small sacrifice.

Conceptually, I think it's fair that an application does not send data to the console when it is under test and not asked or instructed to respond to someone in stdout or console. The one use case I've come across where source code does regularly use the console is for deprecation warnings and other error messages. These may or may not be useful or intentionally enabled in CI, but I think it's not our place to disrupt or disallow that. But, we also don't need to, since those go to a different channel than console.log. If a legacy code base finds unwanted messages are sent to the main log, I suppose they could as a quick initial mitigation resort to something likeconsole.log = function (){} in their test bootstrap (after importing the test framework and/or TAP reporter), or even console.log = console.info to preserve but it in a different channel.

This means WebDriver and DevTools/RemoteDebug-based runners don't have to inject anything, and can simply use native and pre-existing means to consume the browser's console log.

Prior art: Testling by @substack (ref https://github.com/substack/testling/issues/93), karma-tap by @bySabi, tap-console-parser by @Jam3, and more.

Next steps

Are we missing anything? Do you think this is feasible? Let us know!

If you use projects that don't native support TAP (like Karma, WebDriver, Grunt to launch Headless Firefox or Chromium via Puppeteer, etc.), consider giving it a try to use TAP to format the intermediary communication and let us know how this works out for you. Or if you do this already, how has the experience been? Have you run into issues or would like additional support from upstreams?

ljharb commented 3 years ago

How does jest fit into this?

Either way, i think that supporting TAP as the primary output format has always been the only viable choice, so i think this sounds like a good direction.

boneskull commented 3 years ago

ship it

boneskull commented 3 years ago

A thing I don't understand is the bit about console.log. does TAP specify it must be written to STDOUT? if someone is concerned about this, why not use a different stream?

an end-user workaround for "console.log-style" debugging (that'd otherwise malign TAP output) is just to use console.error instead, yes?

ljharb commented 3 years ago

That's what i always do anyways, since stdout output is often being piped to other tools to consume.

qunitjs / js-reporters