web-platform-tests / rfcs

web-platform-tests RFCs
75 stars 63 forks source link

RFC: setup({single_test: true}) opt-in #28

Closed foolip closed 4 years ago

foolip commented 4 years ago

Rendered

foolip commented 4 years ago

Having written this up, I think the value of single-page tests is really in simplifying async tests. For sync tests the difference between assert_stuff(); done(); and test(() => { assert_stuff(); }) is minimal.

However, promise_test still sometimes requires step_func wrapping, even when there's just a single test, and using single_test() is still pretty ugly:

single_test();
(async () => {
  doSomething();
  await somePromise;
})().then(done);

Question: is there another low-boilerplate approach that would also help simplify promise tests?

jgraham commented 4 years ago

I don't think this addresses the previous concern: adding extra boilerplate to a mode explicitly intended to reduce boilerplate to a minimum undermines the point of the feature.

It's sort of unclear to me what the real problem is here. If a test is throwing at startup it doesn't really matter much if it's considered a single page test or not; there's an error that needs to be fixed in the browser or in the test.

zcorpan commented 4 years ago

A few thoughts:

Separately, the name of the opt-in seems prone to get wrong (the RFC has 3 different spellings). I'm open for suggestions, but to reduce confusion we should be consistent between the API, the documentation name of feature, and maybe also the implementation in testharness.js.

foolip commented 4 years ago

Thanks @sideshowbarker and @zcorpan. I can't sleep planes, so I prepared this instead, and so I was sort of a zombie towards the end :)

foolip commented 4 years ago

@jgraham I'm raising this again now because accidental single-page tests come up quite often when comparing cross-browser results on wpt.fyi, disproportionately affecting Safari results where the accidental tests far outnumber the real ones. I'd like these to be harness errors for clarity and ease of triage/filtering.

At root the problem is: to avoid boilerplate this mode has to be implicit, but an early uncaught exception is then ambiguous: is it a single-page test that's made simpler by assuming APIs are available, or a regular testharness.js test that never got to defining its first test?

I was hopeful that https://github.com/web-platform-tests/wpt/pull/11024 or similar would be enough, but working through the alternatives that leaves some warts, making single-page tests a bit worse IMHO.

In making tradeoffs here, it's worth noting that there are only ~130 of these tests, but per a script there are over ~9200 tests overall that have a single subtest. By and large people are defining tests explicitly, so reducing surprises in that mode would be very good.

foolip commented 4 years ago

After discussing at TPAC today the new proposal is to use a setup() property. The names I can think of are setup({file_is_test: true}) (misnomer for .any.js) or perhaps setup({single_test: true}).

I suggest that the behavior of setup({file_is_test: true}) should be:

@jgraham does that sound right?

foolip commented 4 years ago

@jugglinmike I'd much appreciate your scrutiny on this. I feel like you'd have opinions about how this works and what API to control it makes the most sense.

foolip commented 4 years ago

I believe that we intend for uncaught exceptions/unhandled rejections to trigger a harness error when this mode is enabled

No, in that regard I'd like it to behave as it currently does, and cause a failing subtest. It should be as if you've created an async_test with everything correctly wrapped in t.step_func. When test steps throw, that's a failure, not an error. I think it's important to be able to deliberate fail this way, otherwise one has to add redundant asserts to get the intended failure mode.

foolip commented 4 years ago

If anyone has a preference on what the setting property name should be I could send a PR showing exactly what I intend and which tests are affected.

zcorpan commented 4 years ago

Using setup() seems like a good idea.

I think single_test is a good name for the thing. It's a single test mode, as opposed to the default mode which allows multiple subtests.

Single page test is not entirely accurate, as one can use multiple pages (with popups or otherwise) in this mode.

File is test, as you said, seems confusing for .any.js files.

foolip commented 4 years ago

Alright, setup({single_test: true}) sounds good!

foolip commented 4 years ago

Alright, @web-platform-tests/wpt-core-team, I think this is ready for review again. I've updated the name of the opt-in, and added the step-by-step plan for rolling this out in the beginning of the details section.

foolip commented 4 years ago

To be clear, the end state is that using (a failing) assert_* function without the setup() call would be a harness error?

Actually, the result would be a harness error with or without the setup() wrapping, because I intend to leave both of these unchanged:

Exception in setup: https://github.com/web-platform-tests/wpt/blob/cd4b6daa3ca6aa2afa7cbeb55328482499b4f55e/resources/testharness.js#L2193-L2201

Uncaught error elsewhere: https://github.com/web-platform-tests/wpt/blob/cd4b6daa3ca6aa2afa7cbeb55328482499b4f55e/resources/testharness.js#L3383-L3387

(How about passing assert_* calls?)

I intend to remove this as one of the final steps: https://github.com/web-platform-tests/wpt/blob/cd4b6daa3ca6aa2afa7cbeb55328482499b4f55e/resources/testharness.js#L3079-L3081

Thus a passing assert would be fine. But an argument could be made for throwing an error when calling assert_*() or done() in a situation where it must be a mistake. I don't have a strong opinion and am open to suggestions.

jgraham commented 4 years ago

But an argument could be made for throwing an error when calling assert_*() or done() in a situation where it must be a mistake

Yes, we should definitely do that. If the design is that you have to opt in, not opting in should be a clear error in all cases.

foolip commented 4 years ago

Alright, I've added "Make it an error to call done() or make any assertion before any test has been defined" as the second-to-last step. When we get to that point there might be some tinkering with the exact criteria for erroring out, but erroring in the same conditions that now trigger a single-page test is the minimum I think.

foolip commented 4 years ago

I will consider this accepted on September 30 unless there's additional feedback, but @jgraham I would appreciate your explicit go/no-go still.

jgraham commented 4 years ago

I'm happy to proceed with this as specified. I also think we should plan to revisit this in, say, a year to decide if we are happy with the setup or if we want to try another approach, or just remove the feature entirely.

foolip commented 4 years ago

Yeah I agree with revisiting how it went at next TPAC. Outright removal seems like an OK outcome, although I think not super likely because some of the async tests using this really are simpler because of it.

foolip commented 4 years ago

Alright, it's Sep 30. I'll go ahead and merge this, and most likely @zcorpan and/or I will implement it.