moq-wg / moq-transport

draft-ietf-moq-transport

Error codes for streams being reset #481

Open vasilvv opened 4 months ago

vasilvv commented 4 months ago

We currently do not define error codes with which streams are supposed to be reset.

I can think of at least four of those:

  1. GENERIC -- for unknown reasons
  2. SUBSCRIPTION_GONE -- the subscription associated with the stream has been terminated.
  3. TIMEOUT -- for stream timeouts (possibly defer until we have a PR ready for that).
  4. STREAM_LIMIT_EXCEEDED -- the associated subscription has too many streams associated with it, so we discarded the lowest priority one (probably should be a part of #462).
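
For illustration, the four proposed codes could be written as an enumeration. This is only a sketch: the numeric values are placeholders picked for the example, not values assigned by the draft or proposed in this issue, and `StreamResetCode` and `describe` are hypothetical names.

```python
from enum import IntEnum


class StreamResetCode(IntEnum):
    """Hypothetical application error codes for resetting a data stream.

    The names come from the list above; the numeric values are
    placeholders, not assignments from any draft.
    """

    GENERIC = 0x0                # unknown reason
    SUBSCRIPTION_GONE = 0x1      # subscription for this stream was terminated
    TIMEOUT = 0x2                # stream-level timeout expired
    STREAM_LIMIT_EXCEEDED = 0x3  # too many streams; lowest priority one discarded


def describe(code: int) -> str:
    """Return the symbolic name for a received code, tolerating unknown values."""
    try:
        return StreamResetCode(code).name
    except ValueError:
        return f"UNKNOWN(0x{code:x})"
```

A receiver would probably want the tolerant lookup in `describe`, since a peer running a newer version may send codes this endpoint does not know.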

fluffy commented 4 months ago

+1

The only thing I would suggest is to get rid of GENERIC and make error codes for whatever needs one.

TimEvens commented 4 months ago

We need to discuss stream transition, and whether we use FIN or RESET for it. Stream transition is when we move to a new group. Maybe we have an error code for transition to a new stream, but right now we use zero as the application error code. This was an interop issue: because draft-05 doesn't indicate otherwise, some folks assumed that upon transition to a new group you close with FIN. That doesn't work because data gets stuck in flight with FIN. RESET allows us to move to a new group immediately, which might be done specifically to mitigate head-of-line blocking caused by, say, retransmissions.

Regardless, the above reads as informational. We need to indicate that it's not just informational. If you receive a timeout, shouldn't you clean up the state of the subscription? Do you still send an UNSUBSCRIBE, or is it enough that the timeout means the relay/pub/server has cleaned up the subscription, so UNSUBSCRIBE isn't needed anymore?

vasilvv commented 4 months ago

> [...] upon transition to a new group you close by FIN. That doesn't work because stuff gets stuck inflight with FIN.

It is possible to reset a stream that has already been closed with a FIN (in fact, WebTransport WG went out of its way to support that). That's why the per-stream timeouts (like the ones defined in #449) should be kept around even after FIN is sent.
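
The point about resetting after FIN can be illustrated with a small model of the QUIC sending-side stream states. This is a minimal sketch that borrows the state names from RFC 9000, Section 3.1; the `SendStream` class and its methods are hypothetical and do not correspond to any real library API.

```python
from enum import Enum, auto
from typing import Optional


class SendState(Enum):
    """Sending-side stream states, named after RFC 9000, Section 3.1."""

    READY = auto()
    SEND = auto()
    DATA_SENT = auto()   # FIN sent, but not everything is acknowledged yet
    DATA_RECVD = auto()  # everything, including the FIN, is acknowledged
    RESET_SENT = auto()


class SendStream:
    """Hypothetical model of one outgoing stream."""

    def __init__(self) -> None:
        self.state = SendState.READY
        self.reset_code: Optional[int] = None

    def write(self, fin: bool = False) -> None:
        if self.state not in (SendState.READY, SendState.SEND):
            raise RuntimeError(f"cannot write in {self.state.name}")
        self.state = SendState.DATA_SENT if fin else SendState.SEND

    def on_all_data_acked(self) -> None:
        if self.state is SendState.DATA_SENT:
            self.state = SendState.DATA_RECVD

    def reset(self, code: int) -> bool:
        """Abandon the stream; returns False if nothing is left to abandon."""
        if self.state in (SendState.READY, SendState.SEND, SendState.DATA_SENT):
            self.state = SendState.RESET_SENT
            self.reset_code = code
            return True
        return False
```

In this model a reset is still accepted in `DATA_SENT`, that is, after the FIN went out but before the peer acknowledged everything, which is the window in which a per-stream timeout remains useful.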

> If you receive a timeout, shouldn't you clean up the state of the subscription?

The TIMEOUT I was referencing above is timeout for a single group/object, not necessarily the track as a whole.

afrind commented 4 months ago

Individual Comment:

> If you receive a timeout, shouldn't you clean up the state of the subscription

If I squint I think Victor was suggesting the TIMEOUT code would be sent on a stream where you started sending an object but then you passed the delivery timeout (which is in the process of being defined)? I don't think resetting any particular stream should affect subscription state, except perhaps stream-per-track implicitly?

Also, is there a different set of codes we need for STOP_SENDING?

TimEvens commented 4 months ago

> The TIMEOUT I was referencing above is timeout for a single group/object, not necessarily the track as a whole.

We need to be very clear on the error codes: when to use them and for which stream (e.g., a data uni stream and/or the control stream), and what receiving one means given the mode of the data uni stream. As you mentioned above, with per-group you should only close out that group and wait for the next one in the subscription, but with per-track we should likely close out the subscription, as there is only one way to recover from that.
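
The mode-dependent handling described above could look roughly like the following. This is a sketch of one possible rule raised in the discussion, not agreed behavior; `StreamMode`, `Action`, and `on_data_stream_reset` are hypothetical names.

```python
from enum import Enum


class StreamMode(Enum):
    PER_GROUP = "stream-per-group"
    PER_TRACK = "stream-per-track"


class Action(Enum):
    DROP_GROUP_KEEP_SUBSCRIPTION = "drop this group, wait for the next one"
    END_SUBSCRIPTION = "clean up subscription state"


def on_data_stream_reset(mode: StreamMode) -> Action:
    """Decide what a subscriber does when a data uni stream is reset.

    Illustrates one possible rule from the thread, not settled behavior.
    """
    if mode is StreamMode.PER_TRACK:
        # The single stream carried the whole track, so nothing else will arrive on it.
        return Action.END_SUBSCRIPTION
    # Only one group was lost; later groups arrive on their own streams.
    return Action.DROP_GROUP_KEEP_SUBSCRIPTION
```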

fluffy commented 4 months ago

We should open a separate issue on when to FIN / RST a stream. I feel pretty strongly that when you go to a new group, you should not RST the previous group, because this would result in loss of data when there was no need to lose data. Experiments back this up. Instead, when doing stream per group, just FIN the stream; then, when it hits the timeout of the last object, it would be appropriate to RST the group. I want to be clear that we have talked a lot about how to detect the end of a group, and the start of the next group does not necessarily mean the end of the previous group. That might be the case with H.264 but seems less likely with things like long-term reference frames in future codecs.
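
The FIN-then-RST policy proposed here can be sketched as a small sender-side model. Everything in it is an assumption made for illustration: the `GroupStream` class, the `delivery_timeout` parameter (the delivery timeout was still being defined at the time of this thread), and the placeholder value of `TIMEOUT_CODE`.

```python
from dataclasses import dataclass
from typing import Optional

TIMEOUT_CODE = 0x2  # placeholder value, not assigned by any draft


@dataclass
class GroupStream:
    """Hypothetical sender-side state for one stream-per-group stream."""

    group_id: int
    last_object_deadline: Optional[float] = None  # absolute time, in seconds
    fin_sent: bool = False
    fully_acked: bool = False
    reset_code: Optional[int] = None

    def send_object(self, now: float, delivery_timeout: float) -> None:
        # Each object moves the deadline; the last one sent decides when to give up.
        self.last_object_deadline = now + delivery_timeout

    def finish(self) -> None:
        # The group is complete: close with FIN instead of resetting.
        self.fin_sent = True

    def on_tick(self, now: float) -> None:
        # Reset only if data is still unacknowledged once the last deadline has passed.
        if self.reset_code is not None or self.fully_acked:
            return
        if self.last_object_deadline is not None and now >= self.last_object_deadline:
            self.reset_code = TIMEOUT_CODE
```

With this shape, starting a new group never triggers a reset by itself; the previous stream is reset only if its data is still outstanding when the last object's deadline expires.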