Open karlhorky opened 9 months ago
Why not use a test library instead then?
I think all the things you could definitely get them to agree on they already agree on.
It's the plugins and extensions that they might disagree on.
And you can use something like vitest.
While it's cool to have the runtime also have a test runner, it's not exactly in the scope of a runtime, is it? 🤔 Should package management also be prescribed?
If someone wanted to try something like this as a package, it is more than welcome to be part of unjs.io (cross linked to https://github.com/unjs/community/discussions/2) ❤️
Just remembered I started an untest package with this idea 2 years ago and forgot to even push my progress, since Vitest was born and it was amazing! I think the main limitation is that we need to split the standard assertion library from runtime logic to make this happen.
Since the ecosystem hasn’t ever come close to cohering on a standard pattern in libraries, i don’t think it makes sense to standardize one. Platforms choosing to offer a test framework despite this lack of coherence shouldn’t justify continued artificial forcing of patterns.
@ljharb Are you aware of any previous efforts that particular runtimes (Node, Bun and Deno) explicitly disagreed with a unified API for testing?
It is not an artificial need that today we cannot test a library against all WinterCG runtimes. (and it is growing, llrt made one for themselves, based on node-assert and jest-like runtime).
Also looking at the readme:
How are the APIs selected
By looking at the APIs that are already implemented and supported in common across Node.js, Deno, and Cloudflare Workers. If there were at least two implementations among those -- either already supported or in progress -- then it was added to the list.
I think this at least met the 2/3 criteria of this repo (or i might be totally wrong)
@pi0 the issue isn’t disagreement, it’s artificially forcing into a standard something that never went through rigorous intentional design nor got extensive userland usage prior to implementation.
I always imagined API Standards are specifically being designed to be strict and purposed to be consistent and also imagined that WinterCG is a place to discuss about designing such (common) APIs to be consistent.
Would it still be something (in vision of @wintercg group) to be continued perhaps as a new proposal (not common-api but like proposal-testing-api)?
I don't think the test framework should be standardized, but I do think the lower-level machinery around it probably should be. Like a core assert(...) interface which could be wrapped to make the prettier variations like expect(...) with its chainability. I also think we should probably standardize a sandbox at the single test unit level and allow frameworks to compose and rearrange them as they see fit. Should probably also have a standardized functional output for that single unit of work with an interface that receives some event data about that run but not standardization for text output.
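A minimal sketch of the layering described above, assuming a throwing core primitive with a chainable wrapper built on top. Every name here (createAssertions, AssertionError, the matcher names) is hypothetical and chosen for illustration; this is not an API that any runtime or WinterCG has specified.

```javascript
// Hypothetical factory; none of these names come from an existing runtime API.
function createAssertions() {
  class AssertionError extends Error {
    constructor(message, details = {}) {
      super(message);
      this.name = "AssertionError";
      this.actual = details.actual;
      this.expected = details.expected;
    }
  }

  // Core primitive: throw a structured error when the condition is falsy.
  function assert(condition, message = "assertion failed", details = {}) {
    if (!condition) throw new AssertionError(message, details);
  }

  // Chainable wrapper: every matcher funnels into assert().
  function expect(actual) {
    const build = (negated) => {
      const check = (pass, verb, expected) =>
        assert(
          negated ? !pass : pass,
          `expected ${String(actual)} ${negated ? "not " : ""}${verb} ${String(expected)}`,
          { actual, expected },
        );
      return {
        // Reading .not returns a new matcher set with the sense flipped.
        get not() {
          return build(!negated);
        },
        toBe(expected) {
          check(Object.is(actual, expected), "to be", expected);
        },
        toBeGreaterThan(expected) {
          check(actual > expected, "to be greater than", expected);
        },
      };
    };
    return build(false);
  }

  return { assert, expect, AssertionError };
}
```

With `const { expect } = createAssertions()`, a call such as `expect(1).not.toBe(2)` passes, while `expect(1).toBe(2)` throws an AssertionError carrying `actual` and `expected`. Because the wrapper only calls the core primitive, other assertion styles could be layered on the same base without changing it.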
Test frameworks have varying structural opinions, but at the core of all of them is the need for a well-isolated sandbox which can intercept and analyze any failure from within. Standardizing that sandbox could be reasonable.
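As a rough illustration of such a single-unit sandbox, the sketch below runs one test function and reports the outcome as data rather than text. The helper name runUnit, its timeoutMs option, and the shape of the result object are all assumptions made for this example. It only intercepts synchronous throws, rejections of the returned promise, and timeouts; catching errors from detached callbacks would likely need runtime support that plain JavaScript cannot provide.

```javascript
// Hypothetical helper: runs one test unit and returns a result object.
async function runUnit(name, fn, { timeoutMs = 1000 } = {}) {
  const started = Date.now();
  let timer;
  // Rejects if the unit does not settle within timeoutMs.
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
  });
  try {
    // Calling fn inside an async function turns a synchronous throw into a rejection.
    await Promise.race([(async () => fn())(), timeout]);
    return { name, status: "pass", error: null, durationMs: Date.now() - started };
  } catch (error) {
    return { name, status: "fail", error, durationMs: Date.now() - started };
  } finally {
    // Always clear the timer so a finished unit does not keep the process alive.
    clearTimeout(timer);
  }
}
```

A framework could call `await runUnit("adds", () => { /* assertions */ })` for each test and decide for itself how to group, order, or print the returned results.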
Whatever we standardize here (if anything), please make sure it's easy to test (failing) async functions and promises (including rejection and timeout).
@Qard that's not how they all work, but it's definitely worth reviewing what someone can come up with - I'd say that lacking ecosystem coherence, any approach that isn't compatible with the top N test frameworks (arbitrarily, 10?) isn't a good idea to move forward with.
I do see the benefit of standardizing the assert/expect behaviors/apis, even if other things like describe/it/test would be more up for grabs. As long as it leaves freedom for userland extension.
@ljharb Yep, agreed. We might not be able to make a tool that everyone will want to use, but as long as it satisfies most users it should be fine. Might even be several tools--probably would be, I would think assertions should be able to exist on their own from an error-catching execution sandbox.
A similar case is UrlPattern--not every routing framework will want to use that style, but for many it's enough. Putting aside if the design of that particular API is any good or not, I'm just using it as an example of an API solving a specific use case and not trying to be everything for everyone.
From my point of view, what I would like is something along the lines of:
import { describe, it, expect } from "std:test";
Where each runtime could define std:test however it wants, or even let the developer define what std:test points to. As @ljharb points out, there's already plenty of alignment on how these things work, it's really just the matter of needing to switch between node:test, bun:test, and deno:test, or other options, that make the dream of writing your JS tests once and running in any runtime a problem.
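To make the switching problem concrete, here is a hedged sketch of the detection a userland shim might perform today. The function name locateTestApi and its return shape are hypothetical. It relies on Bun and Deno exposing the globals `Bun` and `Deno`, and it uses `Deno.test` (a global, as in the original issue text) for Deno. Bun, and possibly Deno, may also expose a Node-compatible `process` object, so the Node check comes last.

```javascript
// Hypothetical helper: report where the built-in test API lives for a given global object.
function locateTestApi(g = globalThis) {
  // Check Bun and Deno first: they may also provide a Node-compatible `process`.
  if (typeof g.Bun !== "undefined") {
    return { runtime: "bun", specifier: "bun:test" };
  }
  if (typeof g.Deno !== "undefined") {
    return { runtime: "deno", global: "Deno.test" };
  }
  if (g.process?.versions?.node) {
    return { runtime: "node", specifier: "node:test" };
  }
  return { runtime: "unknown" };
}
```

A shim could then load the reported specifier with a dynamic `import()`. That only removes the naming difference, though: the modules still export different functions with different semantics.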
Even describe/it aren't universal among testing paradigms - that's just BDD, which isn't the only testing paradigm.
Would it help to always export describe and it, but in engines that don't support them, they're false rather than a function?
Problem: Bun and Deno have built-in test runners you have to import from to run tests. I use Mocha. This doesn't work.
As one of the new maintainers of Mocha I added an issue that tracks support for Bun / Deno – as so far no such feature request existed in the Mocha project: https://github.com/mochajs/mocha/issues/5108
I think @isaacs's reply to @wesleytodd here is great at describing some of the different approaches that make standardizing hard: https://blog.izs.me/2023/09/software-testing-assertion-styles/
The one thing that has been fairly successfully standardized is the TAP output: https://testanything.org/ (And similarly SARIF for static analysis output: https://sarifweb.azurewebsites.net/)
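For readers unfamiliar with TAP, the sketch below turns a list of results into a basic TAP version 14 stream: a version line, a plan line, and one test point per result. The function name toTap and the `{ name, ok }` input shape are assumptions for illustration. The sketch leaves out directives, YAML diagnostics, subtests, and escaping of `#` in descriptions.

```javascript
// Hypothetical helper: serialize simple results as a minimal TAP 14 document.
function toTap(results) {
  // Version line first, then the plan "1..N" announcing how many test points follow.
  const lines = ["TAP version 14", `1..${results.length}`];
  results.forEach((result, index) => {
    const status = result.ok ? "ok" : "not ok";
    // Test points are numbered from 1.
    lines.push(`${status} ${index + 1} - ${result.name}`);
  });
  return lines.join("\n") + "\n";
}
```

Because the output is plain text with a fixed grammar, any runner that can produce it can feed the same reporters, regardless of which assertion or structuring style produced the results.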
The problem that "today we cannot test a library against all WinterCG runtimes" seems like an honorable one – but this proposal is for a specific solution to that problem, I would suggest re-framing.
There exists some prior art in testing things across plenty of runtimes, such as https://github.com/bterlson/eshost (with https://github.com/devsnek/esvu / https://github.com/GoogleChromeLabs/jsvu) but that injects its own set of functions into the runtime rather than relying on them being built in.
Another dimension to this:
This would be the first dev-environment-oriented WinterCG proposal? Would all runtimes be expected to include this in their production runtimes as well? Separate test frameworks are rarely included in production builds, but I guess Node, Deno and Bun all are including theirs everywhere. How would possibly including a test framework in production affect the serverless-oriented runtimes that are trying to be as slim as possible?
Unfortunately it’s not just about how the tests are written, but the different execution model makes it very hard to impossible to have the same tests run the same on multiple runtimes. I think this problem is better tackled in the ecosystem rather than wintercg.
(The need for a cross runtime testing solution is there).
Very well stated @mcollina, i agree - the problem is VERY real, but that doesn't mean a good native solution is possible.
It's very hard for humans to accept that a thing they want isn't something they can have, but I hope we're able to accept that if indeed that's the case here.
I think that a tool can be written to run tests on all runtimes, but it needs to take into account the different execution models, how they handle TS compilation, if bundling is needed, etc.
I wasn't thinking quite to that level. I was more thinking just an execution sandbox with some domains-like error intercepting functionality to capture any errors thrown, rejections left unhandled, etc. Most of what test frameworks do from a user-facing perspective is just structural organization things, which is fairly trivial compared to the much more complicated internal need of effective sandboxing. If the internal complications are mostly handled by some existing standard then it would be fairly easy for people to make their own testing frameworks to follow whatever structure they want on top of that.
Revisiting @nzakas's comment
it's really just the matter of needing to switch between node:test, bun:test, and deno:test, or other options,
Is the problem just about the import identifier? Is that something we can solve with import maps?
too bad it’s not just “test” like every other node core module, or it would be easy :-)
I think it would be worth exploring whether a runtime-agnostic test runner could be made, either tapping into the native ones as the runner core entirely, or as a library that is compatible with their APIs.
This would help identify if/where the different runners meaningfully conflict and which parts are actually shared.
@voxpelli This is the first dev-environment-oriented WinterCG proposal. Would all runtimes be expected to include this in their production runtimes as well? Separate test frameworks are rarely included in production builds, but I guess Node, Deno, and Bun all include theirs everywhere. How would possibly including a test framework in production affect the serverless-oriented runtimes that are trying to be as slim as possible?
Yes, I guess this is the first time we are thinking of something related to the dev side as well here.
At least, I'm considering introducing a production tests API for nitro. Today, there is also no easy way to test deployment targets across runtimes locally. I could see even a minimal spec introduced in WinterCG being usable in both dev and runtime cases. (thanks for the write-up and all you are doing btw!)
@mcollina Unfortunately it’s not just about how the tests are written, but the different execution model makes it very hard to impossible to have the same tests run the same on multiple runtimes.
Could we think of WinterCG's scope as defining how tests should be written to make them "runnable" in common runtimes? A minimal, optional RFC covering a subset, which implementations can follow as a common-ground recommendation to reduce further divergence.
I am not so sure that it is just a question of switching the import specifier. The semantics do actually matter, especially when it concerns asynchronous throws, test parallelism, and the correspondence of output to input.
It would be possible, of course, to fully specify all of these semantics such that browsers and server runtimes could expose a test import with the mocha interface, but I believe it would be a mistake to just assume that because mocha exists and runtimes provide things with these names already, that such specification is unnecessary.
If we look at language runtimes that have broad consensus about test primitives, then what I called "tap style" (similar to the junit approach in .NET and Java, albeit with xml instead of TAP) has a lot of advantages, since it carves the problem up into more or less discrete smaller problems:
t.plan(), t.assert(), etc., or mocha BDD style describe/it methods) can be specified in terms of the protocol elements it generates.
From a more meta point of view, I don't think this is (yet) a good fit for WinterCG. There has been ample exploration of the space in userland over the years, and it's clear that there are multiple ways to skin this cat, and that people do not at all agree on the best way to do it. There is no web/browser/language standard on how to do it. So, unlike for example the shape of and behavior of Request/Response objects, where WinterCG can take existing web standards and JS language concepts and adapt them in a straightforward way to server use cases, this would be taking existing serverside JS userland concepts (which in some cases are not semantically equivalent to similar features in browser JavaScript), and blessing them as a "specification".
While the risks are no doubt surmountable, this raises the following concerns:
Since Node.js, Bun and Deno offer different APIs (node:test, bun:test, Deno.test) but have many similarities, would it make sense to have a set of specced cross-platform testing APIs?
Motivating Example
runtime: prefix, in the style of Node.js or Bun:
Or, Deno-style global:
This was originally inspired by @nzakas's tweet here:
Source: https://twitter.com/slicknet/status/1762264774166085937
@CanadaHonk rightly mentions that, while there are similarities, there are probably challenges creating a spec:
Source: https://twitter.com/CanadaHonk/status/1762417370893516929
Alternatives Considered
Separate Package
Instead of something built into the standard library of multiple runtimes, use a separate package (either like Vitest, including testing features of its own, or a wrapper package).
Downsides:
devDependencies