WICG / observable

Observable API proposal
https://wicg.github.io/observable/
Other
546 stars 12 forks source link

Timing design options for `Promise`-returning operator abortion #96

Closed domfarolino closed 5 days ago

domfarolino commented 6 months ago

Observables have a number of Promise-returning operators, and consequently there are a few subtle timing decisions we could make when it comes to:

The timing of this is observable (pun intended) in the case where the AbortSignal passed into a Promise-returning operator gets aborted, and other Promises get scheduled during subscription cancellation. Consider:

const observable = new Observable(subscriber => {
  subscriber.signal.addEventListener('abort', e => {
    console.log('inner signal abort event');
    Promise.reject().catch(e => console.log('inner signal abort promise rejection'));
  });

  subscriber.addTeardown(() => {
    console.log('teardown');
    Promise.reject().catch(e => console.log('teardown promise rejection'));
  });
});

const controller = new AbortController();
controller.signal.addEventListener('abort', e => {
  console.log('outer signal abort event');
  Promise.reject().catch(e => console.log('outer signal abort promise rejection'));
});
const promise = observable.toArray({signal: controller.signal});
promise.catch(e => console.log('toArray() promise rejection'));

controller.abort();

Tie Promise rejection to abortion of the passed-in SubscribeOptions's signal

Concretely, this means scheduling the rejection of the returned Promise in an abort algorithm added to the "outer" signal passed into the operator. This would produce the following log:

<Schedule toArray() promise rejection, via an outer abort signal algorithm>
outer signal abort event
<Schedule 'outer signal abort' promise rejection>
teardown
<Schedule 'teardown' promise rejection>
inner signal abort event
<Schedule 'inner signal abort' promise rejection>

toArray() promise rejection
outer signal abort promise rejection
teardown promise rejection
inner signal abort promise rejection

Tie Promise rejection to the abortion of the signal's subscription

Concretely, this approach would mean scheduling the rejection of the returned Promise from within an abort algorithm added to the "inner" signal, that is a dependent signal on the one passed in. This would produce the following log:

<All abort algorithms on the outer signal run: nothing happens>
outer signal abort event
<Schedule 'outer signal abort' promise rejection>
teardown
<Schedule 'teardown' promise rejection>
<Schedule toArray() promise rejection, via an inner abort signal algorithm>
inner signal abort event
<Schedule 'inner signal abort' promise rejection>

outer signal abort promise rejection
teardown promise rejection
toArray() promise rejection
inner signal abort promise rejection

I find the first option more consistent, where the Promise returned from the operator rejects before any of the other inner promises. I think this is consistent with the fact that the passed-in signal is mentally at the same level of hierarchy as the returned Promise it controls (they are both "outer" things, not "inner" subscription internals). Given that, I would expect all things to happen to the outer signal/Promise pair before things associated with the inner subscription start firing and happening.

This is probably the route I'll pursue in spec/implementation/WPTs for now, but I at least wanted to document this subtle design decision here in the form of a discussion in case there are any strong opinions about the ordering. /cc @benlesh

benlesh commented 6 months ago

I modelled this in the reference impl, but basically, I'd prioritized teardowns THEN abort. As one is for cancellation, and the other is for the more important task of clean up. I'd also execute teardowns in LIFO order. Because usually things need to be torn down in the opposite order they were set up in (for example: end transaction, then close DB). This also mirrors DisposableStack.

domfarolino commented 6 months ago

What do you mean by:

I'd prioritized teardowns THEN abort.

We don't have any control over this. Per DOM abort signal "algorithms" always run before abort signal "events" are fired. This means teardowns (which are run as a part of an abort signal "algorithm") will always run before abort events are fired.

But that's not the question at hand. The question at hand is about the scheduling of Promise rejection, for Promises returned from the operators.

I'd also execute teardowns in LIFO order. Because usually things need to be torn down in the opposite order they were set up in (for example: end transaction, then close DB). This also mirrors DisposableStack.

This is not relevant to this issue.

benlesh commented 5 months ago

What I mean, more specifically, is that if subscriber.addTeardown is inherently wired to the same signal that is on the subscriber at subscriber.signal, and handler added to subscriber.signal should fire after any handler added to addTeardown.

Also, handlers added to addTeardown should be handled in LIFO order. This is because teardown use cases are commonly associated to setup ordering in reverse. "Connect to the database, then create a transaction, then create a query, then create a recordset" etc... you tear those down in reverse order, usually. RxJS doesn't do this yet, and it's been a big mistake, IMO.

const source = new Observable(subscriber => {
  subscriber.signal.addEventListener('abort', () => console.log('cancellation 1'));
  subscriber.addTeardown(() => console.log('teardown 1'));
  subscriber.addTeardown(() => console.log('teardown 2'));
  subscriber.signal.addEventListener('abort', () => console.log('cancellation 2'));
});

const ac = new AbortController();
ac.signal.addEventListener('abort', () => console.log('outer cancellation'));

source.subscribe(() => {}, { signal: ac.signal });

ac.abort();

// Logs
// outer cancellation
// teardown 2
// teardown 1
// cancellation 1
// cancellation 2

I think your WPT cover this, but I want to make sure it's understood to anyone that comes to read this.

benlesh commented 5 months ago

And just to clarify for relevance to the issue... I would expect all of the above logs to show up synchronously when you call ac.abort(). How that plays out WRT the rejected promise seems irrelevant, I suppose, but I given your examples above, I just wanted to make sure the algorithm was clear.

So given your example, which I've annotated a bit:

const observable = new Observable(subscriber => {
  // When we abort the subscription, this handler will
  // synchronously fire SECOND
  subscriber.signal.addEventListener('abort', e => {
    console.log('inner signal abort event');

    // This microtask is scheduled SECOND
    Promise.reject().catch(e => console.log('inner signal abort promise rejection'));
  });

  // When we abort the subscription, this handler will
  // synchronously fire FIRST
  subscriber.addTeardown(() => {
    console.log('teardown');

    // This microtask is scheduled FIRST
    Promise.reject().catch(e => console.log('teardown promise rejection'));
  });
});

const controller = new AbortController();

// This handler will be fired synchronously when abort()
// is called. It's wired up before the `toArray` is called,
// so I'd expect this to fire before everything.
controller.signal.addEventListener('abort', e => {
  console.log('outer signal abort event');

  // This microtask will be scheduled before the others, because
  // the handler it was created it was fired first.
  Promise.reject().catch(e => console.log('outer signal abort promise rejection'));
});
const promise = observable.toArray({signal: controller.signal});

promise.catch(e => console.log('toArray() promise rejection'));

controller.abort();

I'd expect:

"outer signal abort event"
"teardown"
"inner signal abort event"

"outer signal abort promise rejection"
"teardown promise rejection"
"inner signal abort promise rejection"
"toArray() promise rejection"
domfarolino commented 5 months ago

What I mean, more specifically, is that if subscriber.addTeardown is inherently wired to the same signal that is on the subscriber at subscriber.signal, and handler added to subscriber.signal should fire after any handler added to addTeardown.

This sentence doesn't quite make sense to me, but I think I know what you mean, but if I misinterpret, please correct me. It is true that teardowns will always execute before the abort event is fired on any given subscriber.signal; that's just how AbortSignals work (see that the firing of the abort event is the very last thing done, modulo aborting dependent signals).

But that's not what this issue is about. The issue is about what signal the outer promise's rejection is attached to — the edit in your most recent comment covers this a bit more. Basically, the rejection of the returned Promise will be done as an algorithm associated with an AbortSignal, its just that there are two signals involved for any Promise-returning operator: (1) the signal passed into the API, and (2) the signal that lives on the "inner" Subscriber.

If the rejection of the returned Promise is an algorithm on the outer signal, then when the outer signal aborts, we will do the following:

  1. Outer signal "algorithms" run:
    1. Reject the returned promise, which queues a microtask to fire the rejection handler
  2. Fire an abort event at the outer signal, which invokes any event handlers
    1. Those abort event handlers might queue microtasks themselves, which will of course run after any already-queued microtasks (i.e., run after the promise rejection handler)
  3. Abort any dependent signals (i.e., the inner signal, that is subscriber.signal)

If instead rejection of the returned Promise is an algorithm on the inner signal, then when the outer signal aborts, we will do the following:

  1. Outer signal "algorithms" run. There are none; do nothing
  2. Fire an abort event handler at the outer signal, which invokes any event handlers
  3. Those abort event handlers might queue microtasks themselves. These would be the first microtasks in the microtask queue
  4. Abort any dependent signals (i.e., the inner signal, that is subscriber.signal)
  5. Inner signal "algorithms" run:
    1. Reject the returned promise, which queues a microtask to fire the rejection handler; the rejection handler microtask will run after any microtasks queued above
  6. Fire an abort event handler at the inner signal, which invokes any event handlers
    1. Those abort event handlers might queue microtasks themselves, which will of course run last, since they are the final ones to be queued

That's why I think the two options I provided in the OP are the only possible choices we have. I don't believe the final output provided in https://github.com/WICG/observable/issues/96#issuecomment-1932712512 after "I'd expect" is valid, since the Promise rejection microtask would never run after microtasks queued in the last abort event handler.

domfarolino commented 5 days ago

After https://github.com/WICG/observable/pull/154, much of the discussion above is mostly irrelevant since we will no longer be using dependent AbortSignals to signal abort in a chain of Observables.

The timing issue of which AbortSignal to tie the rejection of the returned Promise to is still relevant, but I think in light of the timing decisions made in #154, it only makes sense to tie rejection to the "downstream" (i.e., outer, developer-supplied) SubscribeOptions#signal. This is what the spec, Chromium implementation, and tests do/expect. So I'll close this issue.