What is the Problem Being Solved?

For components like Zoe, most (60% by weight) of the testing is done with unit tests that load libraries, execute functions, and assert that the results match what we expect. For these, we use assertions like t.equal(got, expected, description) from tape or an equivalent. These tests run under SES, so they can use harden, although they run within the Start Compartment, so the code isn't as confined as it will be when running for real on the chain. Both the code under test and the tests/assertions themselves can use E() (eventual-send) too, but all the targets are local (there are no Presence objects, only actual "Near" objects and Promises for them).
The other 30% of our test bytes are written to use swingset, to exercise the difference between local eventual-sends and remote ("Far") ones. These tests have an "external driver" (test-zoe.js) which configures a swingset with a handful of vats (one of which is the same vat-zoe that exists on the chain, one is the "internal driver", and a couple more that represent users or clients), starts everything running, and waits for it to complete (there is no IO in this swingset, so everything must be set in motion by the initial bootstrap message).
When the machine winds down, the external driver function then compares the kernel's testLog against a "golden master" of expected messages. A typical example looks like this:

https://github.com/Agoric/agoric-sdk/blob/ef4be52bd442a4f6d768ae8b0edb63ad1d916cc9/packages/zoe/test/swingsetTests/zoe/test-zoe.js#L34-L65
This testLog is a special authority that all vats currently receive as vatPowers.testLog: a function that accepts a single string argument. The string is appended to an array inside the kernel, and that array can be read through a debugging interface on the controller named c.dump().log. The testLog dates back to the beginning of swingset, when there was no other form of IO and I needed something to assert that vats were running at all.
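For concreteness, a minimal sketch of the pattern, assuming a trivial vat (only vatPowers.testLog and c.dump().log are the real pieces described above; the method names and the expected strings are made up):

// inside a vat: testLog is the only "output" available
export function buildRootObject(vatPowers) {
  const { testLog } = vatPowers;
  return harden({
    bootstrap() {
      testLog('bootstrap called');
    },
  });
}

// in the external driver, after the run winds down, compare against the
// golden master:
//   const expected = ['bootstrap called' /* ...rest of the golden list... */];
//   t.deepEqual(c.dump().log, expected);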
The downsides of this primitive approach are:
the "internal driver" (test code inside the vat) is limited to printing messages about it's success: we don't have much of an assertion library to work with in there
exceptions within the vat typically yield opaque errors that are annoying to track down
the golden-master comparison only sees the completed result: there's no way to pause the process and examine intermediate state. E.g. we'd like to assert that the auction doesn't complete until after the last bid is placed, by checking the log both before and after that bid, but this approach has no concept of time or sequencing
When we started, we couldn't do much better than that. But these days, we have a few more tools to work with:
we have c.queueToExport, which lets us send new messages (not just the initial bootstrap message) into vats from the outside (i.e. the test program in test-zoe.js). This is still primitive, and it is awkward to send messages to anything but the root object of a vat, or to include object/promise references in the arguments, but it can be used to trigger a pre-planned set of operations like "submit second bid"
@FUDCo 's "Results" work (#1206) give the outside code (the one that calls c.run()) a way to poll for results of messages injected with c.queueToExport
Description of the Design
I'm not yet sure what direction to take, but there are some pieces I've got in mind:
we could add a test assertion library into the "internal driver" vat, and have it emit TAP-style ok 1 / not ok 2 messages
the external driver (test-zoe.js) could drive multiple phases of the test by using queueToExport to initiate each one
the external driver might take the TAP messages from the vat and forward them outwards to whatever test runner (maybe tap) executed it
we could use the Result of each queueToExport to look for exceptions and flunk the test if any happened
Maybe we do queueToExport to start a phase, and expect the result promise to resolve to an array of TAP ok/not-ok results. The external driver expects the result to resolve by the time the c.run() finishes, then walks through the result array and just does t.ok or t.notOk on each one, copying any description. If our inner-driver assertion calls are diligent about adding descriptions to each t.equal call, then a failure will at least give us a string to grep for.
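As a sketch of that shape, the internal driver's phase method might collect records like { ok, description } (the record format, the target object `other`, and its method are assumptions, not settled design; E comes from @agoric/eventual-send):

// a phase method on the internal driver's root object
async phase1() {
  const results = [];
  const got = await E(other).doSomething(); // `other` is some vat under test
  results.push({ ok: got === 42, description: 'doSomething() returned 42' });
  return harden(results);
}

and the external driver, once c.run() settles, walks the array:

// `records` is the resolved value of the phase1 result promise, obtained
// through whatever accessor the #1206 Results work ends up providing
for (const { ok, description } of records) {
  t.ok(ok, description);
}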
Some assertion libraries do extra work to get interesting stack traces out of assertion failures. These tricks may or may not work within a vat (and if they work now, while we've disabled Error taming, they might stop working once we tighten that up to prevent ambient communication channels). Likewise, if an exception is thrown within the code-under-test (which, given our pervasive use of precondition/assert/insist checks, is the most likely way for a test failure to express itself), it'd be awfully nice to get good stack trace data out of it, which might be thwarted by SES.
We might consider adding a special testing mode to swingset, which could add a t-like test-assertion object to vatPowers (but only when being run under unit tests). The internal driver would then look something like:
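Something like the following rough sketch, where vatPowers.t is the hypothetical test-assertion power proposed above and the wiring and method names are only illustrative:

import { E } from '@agoric/eventual-send';

export function buildRootObject(vatPowers) {
  const { t } = vatPowers; // hypothetical test-assertion power
  let other;
  return harden({
    bootstrap(vats) {
      other = vats.other; // wiring is illustrative
    },
    async phase1() {
      // the `await` matters: a rejection here rejects phase1's result promise
      const got = await E(other).doSomething(); // hypothetical target method
      t.equal(got, 42, 'phase1: doSomething() returned 42');
    },
  });
}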
The await would be important to make sure that any rejected messages turn into a rejection of the phase1 result promise, rather than getting dropped on the floor. We could do it with a .then chain, but we'd need to be diligent about never sending a message without checking the result for rejections (maybe we'd use a pattern where all promises are gathered into a big array and we return Promise.all() at the end).
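That gather-everything variant might look like this (another method on the same sketched root object):

async phase2() {
  const allP = [];
  // every eventual-send's result must be collected, or its rejection is lost
  allP.push(E(other).stepOne());
  allP.push(E(other).stepTwo());
  return Promise.all(allP);
}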
The external driver would then do:
test('description', async t => {
  // hypothetical extraVatPowers option: hand the test's `t` to the vats
  const c = await buildVatController(/* config, argv, etc. */, { extraVatPowers: { t } });
  await c.run();
  t.notEqual(c.bootstrapResult.status(), 'rejected');
  // point the message at the internal driver's root object, with empty args
  const r1 = c.queueToExport(/* internal driver */, 'phase1', /* empty args */);
  await c.run();
  t.notEqual(r1.status(), 'rejected');
  t.end();
});