qunitjs / js-reporters

Common Reporter Interface (CRI) for JavaScript testing frameworks.
MIT License

RFC: Spinal Tap (alternative proposal) #133

Open Krinkle opened 3 years ago

Krinkle commented 3 years ago

Hi everyone,

Don't worry if you haven't kept up with recent developments here or don't remember how js-reporters works. I'm proposing a new direction, and everything you need to know is right here. What follows is a progress update on how everyone is doing, and what we can do next.

What is this?

It's a proposal for adopting TAP, a text format for reporting progress and results of software tests. Unlike the current iteration of js-reporters, TAP is a mature and already widely-adopted protocol.

Avoid this: Do this:

Remind me, why?

For test runners and browser orchestrators:

For test frameworks:

Are we there yet?

Update on test frameworks (/cc @js-reporters/intern, @js-reporters/jasmine, @js-reporters/mocha, @js-reporters/qunit, @js-reporters/tape, @js-reporters/vows).

By now, all the frameworks I checked provide TAP streams, or have seemingly good plugins for it. So, on the framework side, we're mostly there already.

An update on test runners and orchestrators (/cc @js-reporters/integration, @js-reporters/node-tap, @js-reporters/intern):

What about the old CRI draft?

See: Common Reporter Interface (Working Draft).

Having worked now on almost all layers of the stack (unit and integration framework, orchestrators, browser launchers, CLI runners, cloud services, reporting plugins, etc), I think our previous programmatic and EventEmitter-based standardisation is problematic in a number of ways:

Hard to inject in browsers. I think there's great value in having isomorphic JavaScript libraries actually test their code on both Node.js and in multiple real browsers. This is now easier than ever with Firefox and Chromium having built-in headless modes and DevTools/RemoteDebug protocols [1], [2], and beyond that through WebDriver (for BrowserStack and SauceLabs). However, sharing your test suite between Node, an automated browser, and manual debugging can be tough when orchestrators need to inject their client into the web page. I generally see one of three approaches:

  1. The orchestrator tries to inject it for you by proxying and rewriting your HTML (e.g. browserstack-runner injects code before </body> and assumes tests will not start before window.onload).
  2. The orchestrator requires that you don't load or start the framework yourself, and instead use an alternate version of the framework that includes the client and controls test start (e.g. Karma adapters tend to work this way, where they maintain a patched version of the framework, and require that your HTML either doesn't load the framework or that you maintain a separate copy of the list of sources and tests for it and let Karma build the HTML for you; example).
  3. The orchestrator requires you to insert it in your HTML in the right place (e.g. TestSwarm still works this way, example)

Limited to JavaScript. JavaScript is big, but TAP is no doubt bigger in scope. I don't particularly foresee mass adoption by JS developers of piping results into reporters written for other runtimes, but then again we do have esbuild and rslint seeing great attention. While JS developers may be less likely to want to "figure out" installing the right version of Python or PHP and interacting with dependencies in those ecosystems etc., there is a lot of room for Go and Rust binaries and APT/Homebrew package managers distributing tools that "just work". Once you open this door, there is also room for passive adoption in other environments, such as a native IDE on your platform that can run tests for you in an integrated shell, and aggregate or do other useful things with TAP output if it detects it.

Extended metadata. There are three capabilities that appear to be widely used in testing frameworks, and were part of the CRI working draft, that are currently not natively recognised in TAP 13.

  1. Child-parent relationships for tests (ref https://github.com/TestAnything/testanything.github.io/pull/36). Our own TAP reporter in js-reporters simply flattened these into the test name with a delimiter, e.g. "Parent > sub test". Other frameworks like node-tap use indentation and comments to communicate their structure in backwards-compatible fashions. We don't need to agree here how to handle this as this is backwards-compatible and imho non-essential.
  2. Test runtime duration. The CRI spec and adapters implement this. Our TAP reporter currently ignores it. There are numerous ways in which other reporters place this in a comment after the test name, which seems fine. (ref https://github.com/TestAnything/Specification/issues/16).
  3. Test start signal. The CRI spec and adapters implement this. Our TAP reporter currently ignores it. I've not seen a common way to do this in TAP. I suppose a # START <NR. test name> comment could certainly work. This is mainly used today in real-time HTML and IDE consumers to show which test(s) are currently executing, which can help improve the debugging experience as well as to get a feel of where much time is spent or where an otherwise complex failure may have occurred.
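
To make the three points above concrete, here is a minimal sketch of how a reporter might carry them in plain TAP 13. Everything in it is an assumption for illustration: the helper names, the shape of the `test` objects (`fullName`, `passed`, `runtime`), the `# time=` suffix (loosely modelled on what some reporters emit after the test name), and the exact reading of the `# START` comment. It is not what js-reporters currently outputs.

```javascript
// Hypothetical helpers; none of these names come from js-reporters.

// Point 3: announce a test before it runs, as a TAP comment line.
function formatStart(index, fullName) {
  return `# START ${index} ${fullName.join(' > ')}`;
}

// Points 1 and 2: flatten the parent chain into the name with a delimiter,
// and append the duration as a trailing comment when it is known.
// Limitation: names that themselves contain '#' are not escaped here.
function formatResult(index, test) {
  const status = test.passed ? 'ok' : 'not ok';
  const name = test.fullName.join(' > ');
  const duration = typeof test.runtime === 'number' ? ` # time=${test.runtime}ms` : '';
  return `${status} ${index} ${name}${duration}`;
}

// Format a whole (already finished) run, with the plan line at the end.
function formatRun(tests) {
  const lines = ['TAP version 13'];
  tests.forEach((test, i) => {
    lines.push(formatStart(i + 1, test.fullName));
    lines.push(formatResult(i + 1, test));
  });
  lines.push(`1..${tests.length}`);
  return lines.join('\n');
}
```

A real-time reporter would write each line as the corresponding event happens rather than formatting the run after the fact; the batch form is used here only to keep the sketch short.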

FAQ

Isn't console.log fragile?

I've long held that TAP was insufficient for our needs, that we need a standardised way for reporters and orchestrators to discover and access the event stream in-process (especially in browsers). It felt wrong to claim console.log completely for ourselves, or to have to monkey-patch it when you need to consume it client-side in a browser. Over the past year, I've realised (finally) that I think this lets perfect be the enemy of good. "Just" consuming the console output from a browser massively simplifies the problem we are trying to solve, and at least I'm convinced it's worth the small sacrifice.

Conceptually, I think it's fair to expect that an application does not send data to the console when it is under test and not asked or instructed to respond to someone in stdout or console. The one use case I've come across where source code does regularly use the console is for deprecation warnings and other error messages. These may or may not be useful or intentionally enabled in CI, but I think it's not our place to disrupt or disallow that. But, we also don't need to, since those go to a different channel than console.log. If a legacy code base finds unwanted messages are sent to the main log, I suppose they could, as a quick initial mitigation, resort to something like console.log = function (){} in their test bootstrap (after importing the test framework and/or TAP reporter), or even console.log = console.info to preserve it, but in a different channel.
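
As a rough sketch of that mitigation, the helper below reroutes application calls to console.log while keeping the original function available for TAP output. The name `rerouteConsoleLog`, its return shape, and the idea that the reporter writes through a saved reference are all assumptions for illustration; a real framework or reporter may capture console.log differently.

```javascript
// Hypothetical bootstrap helper. Call it after the test framework and/or
// TAP reporter have been imported. `consoleObj` defaults to the global console.
function rerouteConsoleLog(consoleObj = console) {
  // Keep a reference to the real log function for TAP output.
  const originalLog = consoleObj.log;

  // From here on, application calls to console.log land on the info channel.
  consoleObj.log = (...args) => consoleObj.info(...args);

  return {
    // Write one TAP line through the saved original function.
    writeTap: (line) => originalLog.call(consoleObj, line),
    // Undo the rerouting, e.g. once the run has finished.
    restore: () => { consoleObj.log = originalLog; }
  };
}
```

Note that this only separates the two kinds of output where the environment treats log and info as distinct channels, as browser consoles do. In Node.js both write to stdout, so console.error would be the channel to reroute to there.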

This means WebDriver and DevTools/RemoteDebug-based runners don't have to inject anything, and can simply use native and pre-existing means to consume the browser's console log.
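
On the consuming side, a runner then only needs the console text. The sketch below assumes the runner has already collected console.log lines as strings (how it does so is specific to WebDriver or the DevTools protocol and is left out), and tallies the top-level TAP test points among them. The function name `summarizeTap` and the summary shape are hypothetical, and this is far from a complete TAP parser: it ignores YAML blocks, indented subtests, and bail-outs.

```javascript
// Hypothetical consumer: `lines` is an array of console.log strings that a
// runner collected from the browser by whatever native means it has.
function summarizeTap(lines) {
  const summary = { total: 0, failed: [], planned: null, ok: false };

  for (const line of lines) {
    // Plan line, e.g. "1..3".
    const plan = /^1\.\.(\d+)/.exec(line);
    if (plan) {
      summary.planned = Number(plan[1]);
      continue;
    }

    // Test point, e.g. "ok 1 name" or "not ok 2 name".
    const point = /^(not )?ok\b(.*)$/.exec(line);
    if (!point) continue; // comments, application output, anything non-TAP

    summary.total += 1;
    // A "not ok" marked with a TODO directive does not count as a failure.
    const isTodo = /#\s*todo\b/i.test(point[2]);
    if (point[1] && !isTodo) summary.failed.push(point[2].trim());
  }

  // This sketch requires a plan that matches the number of test points seen.
  summary.ok = summary.failed.length === 0 && summary.planned === summary.total;
  return summary;
}
```

For example, passing the lines of a run with one failing test yields a summary whose `failed` array holds that test's number and name, which a CLI runner could turn into an exit code.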

Prior art: Testling by @substack (ref https://github.com/substack/testling/issues/93), karma-tap by @bySabi, tap-console-parser by @Jam3, and more.

Next steps

Are we missing anything? Do you think this is feasible? Let us know!

If you use projects that don't natively support TAP (like Karma, WebDriver, or Grunt to launch Headless Firefox or Chromium via Puppeteer, etc.), consider trying TAP as the format for the intermediary communication, and let us know how this works out for you. Or if you do this already, how has the experience been? Have you run into issues, or would you like additional support from upstreams?

ljharb commented 3 years ago

How does jest fit into this?

Either way, i think that supporting TAP as the primary output format has always been the only viable choice, so i think this sounds like a good direction.

boneskull commented 3 years ago

ship it

boneskull commented 3 years ago

A thing I don't understand is the bit about console.log. does TAP specify it must be written to STDOUT? if someone is concerned about this, why not use a different stream?

an end-user workaround for "console.log-style" debugging (that'd otherwise malign TAP output) is just to use console.error instead, yes?

ljharb commented 3 years ago

That's what i always do anyways, since stdout output is often being piped to other tools to consume.