Normative: Make waitAsync return a wrapped value so it can fail fast

syg commented 4 years ago

This is an alternative to #29.

As I was playing around with the API during implementation and as prompted by Rick's async errors question from #28, I'm reopening the discussion from SYNC-RESOLVE, as the current design has a performance footgun.

The idea is instead of making everything async, return a wrapped result object that only has a Promise when the action could not be completed synchronously.

The Point of `Atomics.wait` and Futex

Atomics.wait is basically a futex emulation layer in JS. Futexes are a Linux kernel concept designed for writing userland synchronization data structures, mostly mutexes. The core idea is to provide a low-level building block so mutexes can tell the kernel to do the slow thing of blocking the thread and waiting for the mutex to be unlocked only when it's necessary. IOW, the point is to be fast when there's no contention on the lock.

A usual example, taken from this futex article, looks something like this, converted to JS:

// Hard-coded location for the mutex state.
// sab[ATOM_INDEX] == 0 means unlocked
// sab[ATOM_INDEX] == 1 means locked, no waiters contending
// sab[ATOM_INDEX] == 2 means locked, with waiters in contention
const ATOM_INDEX = 0;

class Mutex {
  constructor(sab) {
    this.sab_ = sab;
  }

  lock() {
    let c = Atomics.compareExchange(this.sab_, ATOM_INDEX, 0, 1);
    // If the lock was previously unlocked, there's nothing else for us to do.
    // Otherwise, we'll probably have to wait.
    if (c !== 0) {
      do {
        // If the mutex is locked, we signal that we're waiting by setting the
        // atom to 2. A shortcut checks is it's 2 already and avoids the atomic
        // operation in this case.
        if (c === 2 || Atomics.compareExchange(this.sab_, ATOM_INDEX, 1, 2) != 0) {
          // Here we have to actually sleep, because the mutex is actually
          // locked. Note that it's not necessary to loop around Atomics.wait;
          // a spurious wakeup or "not-equal" will do no harm since we only
          // exit the do...while loop when sab_[ATOM_INDEX] is indeed 0.
          Atomics.wait(this.sab_, ATOM_INDEX, 2);
        }
        // We're here when either:
        // (a) the mutex was in fact unlocked (by an intervening thread).
        // (b) we slept waiting for the atom and were awoken.
        //
        // So we try to lock the atom again. We set teh state to 2 because we
        // can't be certain there's no other thread at this exact point. So we
        // prefer to err on the safe side.
      } while ((c = Atomics.compareExchange(this.sab_, ATOM_INDEX, 0, 2)) != 0);
    }
  }

  unlock() {
    if (Atomics.sub(this.sab_, ATOM_INDEX, 1) != 1) {
      Atomics.store(this.sab_, ATOM_INDEX, 0);
      Atomics.wake(this.sab_, ATOM_INDEX, 1);
    }
  }
}

Crucial to the performance of this mutex is that the Atomics.wait only does something expensive like waiting when there is actually contention, i.e. when sab[ATOM_INDEX] == 2. The API checks for the condition and decides to wait or not wait in one atomic action. When sab[ATOM_INDEX] != 2, it returns "not-equal" immediately for another go around the do-while loop.That fail-fast behavior lets the mutex deal with fast unlocking: if another thread unlocked the mutex in the meantime, we didn't do an expensive wait for nothing and degraded the performance of our multithreaded code.

Performance Footgun with Always Returning a Promise

Now imagine that Atomics.wait replaced with Atomics.waitAsync that always returns a Promise:

[...]

  async lock() {
    let c = Atomics.compareExchange(this.sab_, ATOM_INDEX, 0, 1);
    if (c !== 0) {
      do {
        if (c === 2 || Atomics.compareExchange(this.sab_, ATOM_INDEX, 1, 2) != 0) {
          // How long to wait if sab[ATOM_INDEX] != 2?
          await Atomics.waitAsync(this.sab_, ATOM_INDEX, 2);
        }
      } while ((c = Atomics.compareExchange(this.sab_, ATOM_INDEX, 0, 2)) != 0);
    }
  }

[...]

If Atomics.waitAsync always returns a Promise, await Atomics.waitAsync(...) always takes at least 1 microtask tick. This could be slow enough that it destroys the performance model of the mutex.

We can work around it by doing another Atomics.load to decrease the chances of the value from changing, but it's not a perfect workaround. There will always be a TOCTOU problem:

[...]

  async lock() {
    let c = Atomics.compareExchange(this.sab_, ATOM_INDEX, 0, 1);
    if (c !== 0) {
      do {
        if (Atomics.load(this.sab_, ATOM_INDEX) === 2 ||
            Atomics.compareExchange(this.sab_, ATOM_INDEX, 1, 2) != 0) {
          // 99% of the time, Atomics.waitAsync will see sab[ATOM_INDEX] == 2,
          // but rarely it might not and the user experiences an extreme slowdown.
          await Atomics.waitAsync(this.sab_, ATOM_INDEX, 2);
        }
      } while ((c = Atomics.compareExchange(this.sab_, ATOM_INDEX, 0, 2)) != 0);
    }
  }

[...]

This PR

The alternative is to sometimes return a Promise so the mutex code above can continue to "fail fast" when it doesn't need to actually wait. Since conditional waiting is the whole point of the futex wait API, this conditionally returning a Promise is consistent with that mental model: just in the case when it needs to wait, you get a Promise that's fulfilled when the wait is finished.

Last time this came up in SYNC-RESOLVE, the decision was that returning sometimes a Promise and sometimes a string is bad API design and is inconsistent with both JS APIs like Promise.all and all WebIDL APIs, which either don't return a Promise or always return a Promise.

This PR instead, on @bakkot's suggestion, returns a wrapped result object like iterator results:

// Assume sab[0] === 42
console.log(Atomics.waitAsync(sab, 0, 41));
// Output: { async: false, value: "not-equal" }
console.log(Atomics.waitAsync(sab, 0, 42));
// Output: { async: true, value: Promise {<pending>} }
console.log(Atomics.waitAsync(sab, 0, 42, /* timeout */ 0));
// Output: { async: false, value: "timed-out" } }

Note that in the async case, the Promise can only be resolved with "ok" or "timed out".

By wrapping, users that don't care about the optimization opportunity of failing fast on "not-equal" can await .value:

await Atomics.waitAsync(sab, index, value, timeout).value;

Since it'll no longer always return a Promise, argument validation errors will remain synchronous.

Why It Should Be a Decision We Make Now

The alternatives are:

Always return a Promise, live with workaround since it'll work almost all the time.
Always return a Promise, maybe add a fail-fast version in the future.
Return a wrapped result object.

(1) is undesirable because it goes against the point of the API for the primary use case: writing mutexes, where we know the performance does matter. I'd rather not ship an API that requires a workaround for the primary use case.

(2) is undesirable because Atomics.waitAsync always returning a Promise precludes amending the method with flags in the future that can fail fast. That points to that if the performance use case becomes apparently important after Atomics.waitAsync ships, the only way to accommodate is to add a 3rd method in addition to both Atomics.wait and Atomics.waitAsync.

(3) is this PR. It's undesirable because that makes this API more complicated and harder to use, though it is a more niche and power-user API.

ljharb commented 4 years ago

I'm not even sure you need the async property if value will always either be a string or a Promise?

bakkot commented 4 years ago

@ljharb, did you see:

Last time this came up in SYNC-RESOLVE, the decision was that returning sometimes a Promise and sometimes a string is bad API design and is inconsistent with both JS APIs like Promise.all and all WebIDL APIs, which either don't return a Promise or always return a Promise.

(which is also my position, though not held all that strongly)

bakkot commented 4 years ago

Or are you suggesting having just { value: string } and { value: Promise }, and making people do the typeof test themselves? Personally I would strongly prefer we avoid returning untagged unions, as a rule.

ljharb commented 4 years ago

Yes, that was my suggestion.

syg commented 4 years ago

Personally I would strongly prefer we avoid returning untagged unions, as a rule.

That's my opinion as well.

anba commented 4 years ago

FWIW, to a certain extent engines can optimise this to skip the extra microtask tick.

(I don't want to imply that other engines are required to implement a similar optimisation. And "to a certain extent" also means that this optimisation isn't always possible, so there can be performance cliffs.)

syg commented 4 years ago

FWIW, to a certain extent engines can optimise this to skip the extra microtask tick.

Indeed. The operative condition there is that the microtask queue is empty at the point of the await -- a good optimization when we can get it, but still prone to the performance footgun in the performance sensitive code that Atomics is likely to be used for.

syg commented 4 years ago

This received consensus in the April 2020 TC39, so merging.

tc39 / proposal-atomics-wait-async