ably / specification

The Ably features spec for client library SDKs.
Apache License 2.0
0 stars 4 forks source link

Reconsider criteria for a channel `ATTACHED` to be considered a contributor discontinuity #239

Open lawrence-forooghian opened 1 day ago

lawrence-forooghian commented 1 day ago

Tweaking the existing criteria…

*** @(CHA-RL4a2)@ @[Testable]@ If the given contributor has not yet successfully managed to attach its Realtime Channel (i.e. no call to @attach()@ on the channel, per @CHA-RL1f@, has succeeded), then no action should be taken (i.e. @CHA-RL4a3@ and @CHA-RL4a4@ behaviours are not performed).
*** @(CHA-RL4a3)@ @[Testable]@ If a room lifecycle operation is in progress, then a pending discontinuity event shall be recorded for this contributor - though it must not overwrite any existing discontinuity event. The @ErrorInfo@ associated with the discontinuity shall be the @reason@ for the underlying channel state change. The event will be notified to the contributor at a later point, as noted in this specification.
*** @(CHA-RL4a4)@ @[Testable]@ If a room lifecycle operation is not in progress, then a discontinuity event will immediately be emitted to the contributor. The @ErrorInfo@ associated with the discontinuity shall be the @reason@ for the underlying channel state change.

Here, I think that we’re trying to deal with the fact that a channel’s initial ATTACHED state change has resumed == false even though this state change does not represent a discontinuity in channel messages.

However, CHA-RL4a2’s criterion for deciding which ATTACHED state change is the initial one assumes that the room will process the channel’s first ATTACHED status change before the caller of attach() regains control and updates the room’s internal state to mark the contributor as successfully-attached. I don't think that this is necessarily a valid assumption. Hence, we might end up thinking that the initial ATTACHED actually represents a discontinuity.

So, I think that the lightest-touch approach to fix CHA-RL4a2 is to use a different criterion for deciding whether the channel has already been attached; namely whether an ATTACHED state change has previously been received on that channel.

…or reconsidering the criteria

But, I wonder if there’s an easier way to decide which ATTACHED channel state changes represent a discontinuity. My reading of RTN15c is that all discontinuity state changes will be accompanied by an error, so couldn’t we instead say that an ATTACHED state change is a discontinuity if it has resumed == false and reason != nil? This would also allow us to tighten up the JS discontinuityDetected(…) API to make its argument non-optional.

AndyTWF commented 1 day ago

Here, I think that we’re trying to deal with the fact that a channel’s initial ATTACHED state change has resumed == false even though this state change does not represent a discontinuity in channel messages.

The other aspect is: do we want to register a discontinuity on a feature if the channel that powers it has never successfully completed an attach operation?

My reading of RTN15c is that all discontinuity state changes will be accompanied by an error, so couldn’t we instead say that an

RTL12 unfortunately is laxer on this - says "if any". So whilst I'd hope that any discontinuity event would have an error, the protocol doesn't guarantee it.

before the caller of attach() regains control and updates the room’s internal state to mark the contributor as successfully-attached

It is a guarantee in JS - as the event listeners are executed in order, meaning that the state monitor will be called before the listener registered in relation to the attach call. But I appreciate that probably doesn't hold in other languages where concurrency is a thing - or does it?

so couldn’t we instead say that an ATTACHED state change is a discontinuity if it has resumed == false and reason != nil

At the moment at least (I am thinking about this more broadly atm) - we raise discontinuities if you explicitly detach a room and then bring it back again. I want to revisit this, but at the moment this wouldn't work for that reason (as reason would be nil in these cases).

That said, I think a rewording of CHA-RL4a2 to get it from the channel state change and not the attach call is reasonable.

WDYT?

lawrence-forooghian commented 1 day ago

The other aspect is: do we want to register a discontinuity on a feature if the channel that powers it has never successfully completed an attach operation?

Yeah, that logic makes sense.

RTL12 unfortunately is laxer on this - says "if any". So whilst I'd hope that any discontinuity event would have an error, the protocol doesn't guarantee it.

we raise discontinuities if you explicitly detach a room and then bring it back again. I want to revisit this, but at the moment this wouldn't work for that reason (as reason would be nil in these cases).

Thanks for pointing these out; sounds like I need to remove the assertion failure that the Swift SDK currently has if you are meant to emit a discontinuity but there isn't an accompanying error.

It is a guarantee in JS - as the event listeners are executed in order, meaning that the state monitor will be called before the listener registered in relation to the attach call. But I appreciate that probably doesn't hold in other languages where concurrency is a thing - or does it?

So, the way this is currently implemented in Swift, at least, is that to a rough approximation there are two separate threads; one to process state changes and one that calls attach() on the contributor. So, roughly, imagine that:

  1. we call attach() on the latter thread
  2. attach() attaches and emits state changes
  3. attach()returns, and we mark the contributor as having attached
  4. the state-handling thread gets scheduled and receives the initial ATTACHED state change

(I’m sure there are ways of writing this where you don't run into this issue, but the pattern I’ve described is the one that Swift’s concurrency features kind of naturally lead you — well, naturally led me — to employ.)

That said, I think a rewording of CHA-RL4a2 to get it from the channel state change and not the attach call is reasonable.

That would make my life easier 🙂