Counter for expected assertions when using `step(..)` / `verifySteps(..)` is surprising (and undocumented!)

getify commented 6 years ago

Tell us about your runtime:

QUnit version: 2.4.0
What environment are you running QUnit in? (e.g., browser, Node): Node 8.6.0, Chrome 61.
How are you running QUnit? (e.g., script, testem, Grunt): custom npm script

What are you trying to do?

I'm trying to use the step(..) / verifySteps(..) API, but I had a failure related to the number of expected assertions.

The documentation for this feature doesn't mention the impact on expected assertion count at all. So at a minimum, the docs need to be updated. But I also think the current behavior is counter-intuitive.

Code that reproduces the problem:

QUnit.test( "verify steps", function test(assert){
    assert.step( "do stuff 1" );
    assert.step( "do stuff 2" );

    assert.expect( 1 );  // or 2
    assert.verifySteps( ["do stuff 1","do stuff 2"] );
} );

// test fails:
//   Expected 1 assertions, but 3 were run

What did you expect to happen?

I assumed the number of assertions to expect would either correlate to the number of step(..) calls (2), OR to the number of verifySteps(..) calls (1).

What actually happened?

The failure error message says 3, not 1 or 2. So clearly the counter is incrementing with both the step(..) calls and the verifySteps(..) calls.

This feels very strange and surprising to me. Even if it had been documented that way, I think it leads to more confusion with test authoring. It should consider only one or the other, not both.

The argument for using only the step(..) calls in the counter:

step(..) is kinda like an ok( true, .. ) call, so each time step(..) happens, make sure it's counted. If you know there are 5 steps to some algorithm, it makes intuitive sense to increase your expected count by 5.

Moreover, it doesn't make sense to include verifySteps(..) in this count in the same way that the call to assert.expect(..) doesn't itself get included in the count.
The argument for using only the verifySteps(..) calls in the counter:

step(..) is conceptually like just pushing an entry into an array. We haven't verified anything yet. There's no true or false passing that's happening at that point. The assertion doesn't happen until you call the verifySteps(..) call, which is conceptually like calling deepEqual(..) on the array that step(..) is pushing into.

Usually you make lots of step(..) calls, but only one verifySteps(..) call. So the counter should only increment once with that call, regardless of how many step(..)s there are.

Moreover, you're already implicitly counting the number of step(..) calls that happened, because the only way entires get into the internal array you're comparing to is by calling step(..), so the verifySteps(..) is already checking that the number -- not just the order! -- of step(..) calls is correct. No need for that to be included in the count.

I think either of these lines of argument is compelling. Personally, I think (2) is how my brain works. The style of how I lay out my tests, my assertions are all collected together at the end, so I expect to be able to see the same number of assertions listed, line-by-line, as what I pass to expect(..). If step(..) is included in that count, I have to look all over other parts of the test code to verify that my number matches. This is clunky.

trentmwillis commented 6 years ago

Thanks for the detailed explanation! When we originally implemented this feature, I didn't think much about the impact on expected assertion counts, so we just went with the default which is to increment the internal counter for any assertion calls.

I agree that the reasoning in (2) makes sense, however, since this would now be a breaking change we will have to wait for QUnit 3.0 to be released before making the change. In the interim, we will add a note the documentation that calls out the behavior.

@qunitjs/qunit-team if any of you all have thoughts on how we should count these assertions, please chime in 😄

Krinkle commented 6 years ago

I've been thinking about what API change could help here. The wider problem, I think, is that when an Assert method internally uses (multiple) other assertions, the count can be confusing. I too agree that approach (2) seems preferable here. Perhaps an (invisible) abstraction over the Assert class would help? A proxy to differentiate between direct calls on the assert parameter, and internal calls. This subclass would increment the counter for certain methods and then forward the call. Once inside, subsequent calls bypass the counter.

However, for the particular issue with step() and expect(), my first feeling is that we should discourage their use. When it comes to unit testing, I find myself taking an unusually conservative stance. It isn't a place I enjoy abstractions or utilities. It's a place for simple and strict code. Nothing about the outcome of assertion should be hidden. Somewhat off-topic, but I wrote about methods like assert.ok, assert.regexp and asssert.notEqual and their problems. As for assert.step(), I love the strictness it symbolises. For example, consider the following:

QUnit.test('bad example', function (assert) {
  var obj = new Example();
  obj.here(function (val) {
    assert.ok(val);
  });
  obj.there(function (val) {
    assert.ok(val);
  });
});

We don't even check in which order things happen., how often they happen, what value is received. We don't even know if one of them is skipped entirely because the assertion is not under our control anymore, but under the subject's control.

expect()

Firstly, I'd say assertions never belong in a nested callback^[1]. Moving assertions out avoids assertions being skipped and naturally produces an assertion failure if the assignment didn't happen.

In addition, not having nested assertions also makes assert.expect() obsolete. I've never liked declaring an expectCount, but until recently I did it and enforced it via config.requireExpects, because it felt necessary. QUnit didn't have very good state tracking back when I first used it. The assertion methods were global and context-less. Failure could easily happen undetected. The expectCount was a quick hack to ensure some confidence in the test results.

If you keep assertions to the top-level, the only way an assertion can be skipped is by an early abort, which QUnit has always caught.

QUnit.test('example', function (assert) {
  var x = false, y = false, obj = new Example();
  obj.here(function (val) {
    x = val;
  });
  obj.there(function (val) {
    y = val;
  });
  assert.strictEqual('x', x);
  assert.strictEqual('y', y);
});

This still doesn't assert the order or frequency of events. One could use assert.expect() to solve the frequency issue.

step()

A more elegant solution than expect() would be to use an array and perform a deep comparison. That also solves the order issue.

QUnit.test('example', function (assert) {
  var log = [];
  var obj = new Example();
  obj.here(function (val) {
    log.push({ here: val });
  });
  obj.there(function (val) {
    log.push({ there: val });
  });
  assert.deepEqual(
    [ { here: 'x' }, { there: 'y' } ],
    log
  );
});

I absolutely love this pattern and can't recommend it enough. It's simple and yet very strict. And is in fact what assert.step() and assert.verifySteps() do internally. However, I don't see much added value in abstracting this via a built-in method. And coincidentally, avoiding them avoids the problems detailed in this issue.

I do acknowledge the wider problem of Assert methods making other assertions, and that definitely calls for a solution. But, while we might find other use cases for such abstraction, it seems right now the only use case is assert.expect().

_{[1]: Except in a promise callback, which can be verified by QUnit if returned from the test function. One can also use async-await instead in modern code which avoids nested assertions even for async code.}

getify commented 6 years ago

FWIW, I have never used expect(..) along with nested (conditional or async) assertions to verify stuff happened. Every now and then an assertion has to be buried, but that's rare. Similarly, I wouldn't use async(number) with multiple done() calls. I always prefer to do my assertions at the top level if I can manage it, and those should hopefully be enough to verify if stuff happened or not.

Now that I've found the step(..) / verifySteps(..) approach, it seems much more appropriate for that kind of verification. I'll say though that this is because I rarely have the kind of racy testing where I'm verifying only that all occur but the order could be indeterminate. If I had that kind of testing, clearly nested assertions (async(number) + done()s) would be preferable to step(..).

But as to the bigger point of if expect(..) should be discouraged, I only use it as a verification (and yes it is a maintenance hazard) against making sure I didn't either forget to add an intended test, or accidentally added/removed one that I didn't account for in the expect. When editing a test, sometimes I update the expect first, then add the assertion(s), sometimes the reverse. But in all cases, it's like a checksum on my intention vs action.

getify commented 6 years ago

BTW, in case it's useful context, here's my general pattern for my asynchronous tests:

QUnit.test( "something", async function test(assert){
    var done = assert.async();

    var somethingExpected = whatever;

    try {
        var somethingActual = whateverAsyncPromise();
        somethingActual = await somethingActual;
    }
    catch (err) {
        assert.expect(1);
        assert.pushResult({ result: false, message: (err.stack ? err.stack : err.toString()) });
        done();
        return;
    }

    assert.expect(1);
    assert.strictEqual(somethingActual,somethingExpected,"hello!");

    done();
} );

BTW, I dunno if I was doing something wrong, but I added the try..catch handling bit to my pattern because it didn't appear that QUnit was smart enough to capture the promise from the test's async function and recognized if it was rejected because of an exception... those got swallowed.

If that's supposed to work, it wasn't for me. If that's not supposed to work, I think that could be a useful feature to consider. :)

Krinkle commented 6 years ago

@getify Yep, I would also expect that to work. The following should do it:

QUnit.test("something", async function (assert) {
  var something = await whateverAsyncPromise();

  assert.strictEqual(something, "whatever", "hello!");
});

I haven't confirmed it myself. But we already support explicitly returning a Thenable from the test callback. Using an async function should behave the same way. It's even documented at https://api.qunitjs.com/QUnit/test.

QUnit will wait for the returned promise (async function) to settle. It will also assert that it gets resolved. It will detect a rejection (async exception) and reports its details. I would assume no async done() or try/catch/pushResult is needed.

getify commented 6 years ago

@Krinkle I think I may have found a bug related to that, will file separately in case. :)

Filed: #1229

getify commented 6 years ago

Pleased to report that by removing the assert.async() parts, my test structure is now simpler:

QUnit.test( "something", async function test(assert){
    var somethingExpected = whatever;

    var somethingActual = whateverAsyncPromise();
    somethingActual = await somethingActual;

    assert.expect(1);
    assert.strictEqual(somethingActual,somethingExpected,"hello!");
} );

Thanks for the suggestion, @Krinkle!

Krinkle commented 2 years ago

I agree of the two directions, option 2 seems preferred. The selling point for me personally is that it means you could use assert.expect() to ensure verifySteps() is in fact called. Which requires it to be included in the count. And ergonomically, including both is inconvenient, especially when expect() is also in-use.

I've deprioritised this so far because the need for expect() more generally is imho relatively low nowadays with a set of stable patterns to follow and with built-in ability to detect leaked asssertions and such; and because it's a hard compat break that isn't possible to perform with our usual strategy of "add/deprecate in N.X and remove in N+1.0".

getify commented 2 years ago

I've authored many hundreds of test cases (with step(..) and verifySteps(..)) over the past 4+ years since reporting this. It was annoying/confusing to get used to the counting, but I eventually adjusted to it. For maintenance purposes, I always include a code comment that explains where the count comes from, like this:

assert.expect( 6 ); // note: 1 assertions + 5 `step(..)` calls

So, with the code comment, it's clear any time I read the tests where the count is coming from. It's an additional maintenance burden and I occasionally forget to update the code comment, but... it works.

I still think this counting strategy could be changed some day, but I've managed fine without the change, so far.

qunitjs / qunit