Unordered, unreliable mode: Surprising behavior of packet dropping

alvestrand commented 3 years ago

Please try the following:

Open an usrsctp connection
Turn off the network
Send 200 packets with ordered=false, maxRetransmits = 0
Turn on the network

Expected result (from a naive reading of the API): All packets disappear. Observed result (in a webrtc test): The first 110 packets get dropped, the rest get delivered.

Same result is observed when setting maxPacketLifetime = 1, but when maxPacketLifetime = 0, all packets get delivered.

This behavior causes problems for one of our uses, who is sending a fast stream of events, and expected usrsctp to take care of the dropping of packets when network dropouts happen; instead, there's a massive influx of packets when the network reestablishes.

Thoughts? Is this according to spec? Is this according to what the user should expect?

alvestrand commented 3 years ago

tag @orphis for interest

tuexen commented 3 years ago

Please try the following:

Open an usrsctp connection

Turn off the network

Send 200 packets with ordered=false, maxRetransmits = 0

Turn on the network

Expected result (from a naive reading of the API): All packets disappear.

I guess you expect that SCTP sends the 200 packets out, if the upper layer provides them by calling 200 times send(). The crucial point here is that when SCTP can actually send out is limited by the congestion control, the flow control, and there is also a burst mitigation.

So I guess what happens is that congestion control and flow control allow that the first 110 packets get sent when the network connectivity does not exist. Once the network connectivity is re-established, the first 110 do not get retransmitted, since maxRetransmits is 0, but the remaining ones now can get transmitted for the first time, since congestion and flow control allows it now.

I think this is completely as described in specifications.

Observed result (in a webrtc test): The first 110 packets get dropped, the rest get delivered.

Same result is observed when setting maxPacketLifetime = 1, but when maxPacketLifetime = 0, all packets get delivered.

I guess you expect that when using maxPacketLifetime of 1ms, no packets go on the wire (either by transmitting them or by re-transmitting them) after 1ms of the point of time they were given to the SCTP stack by the send() call. I would agree with this view, but the code as it is right now only applies the partial reliability policy for retransmissions. Therefore you observe the same behaviour as in the case where you specify maxRetransmits as 0. This can be, and I think it should be fixed.

I guess you expect that when using maxPacketLifetime of 0ms, packets only go on the wire if the SCTP stack can send them out directly when they are given to the SCTP stack without any buffering or delaying. I just checked the W3C specification and I would agree with this expectation. However, I guess your implementation just maps the value provided for maxPacketLifetime to a corresponding in the SCTP socket API. In some fields (for example sinfo_timetolive) a value of 0 has the special meaning of reliable. The reason for this special case is that for a long time timed reliability was the only PR-SCTP policy and one needed to way to the timout or indicate that the message is reliable. However, this behaviour is also applied in other places where such a timeout is specified, but the specification does not explicitly require this special handling. I think the API you are used does not require the special handling.

This behavior causes problems for one of our uses, who is sending a fast stream of events, and expected usrsctp to take care of the dropping of packets when network dropouts happen; instead, there's a massive influx of packets when the network reestablishes.

OK, I see. So you actually want to limit the lifetime of the packets and would need that the PR-SCTP policies are also applied to retransmissions.

Thoughts? Is this according to spec? Is this according to what the user should expect?

Do the above explanations make things clearer? I do understand that you are not happy with the current state of the implementation...

alvestrand commented 3 years ago

thanks - that's more or less what I had intuited!

Can you cite spec chapter and verse on retransmissions only being counted when there's actually been a transmission? We're going to be asked about whether we can do a different behavior, and it's good to be able to cite a reference.

The special treatment of 0 seems like something that we can either warn the users about or map 0 -> 1 when the user specifies 0. The API used is here:

  struct sctp_sendv_spa spa = {0};
...
  // Ordered implies reliable.
  if (!params.ordered) {
    spa.sendv_sndinfo.snd_flags |= SCTP_UNORDERED;
    if (params.max_rtx_count >= 0 || params.max_rtx_ms == 0) {
      spa.sendv_flags |= SCTP_SEND_PRINFO_VALID;
      spa.sendv_prinfo.pr_policy = SCTP_PR_SCTP_RTX;
      spa.sendv_prinfo.pr_value = params.max_rtx_count;
    } else {
      spa.sendv_flags |= SCTP_SEND_PRINFO_VALID;
      spa.sendv_prinfo.pr_policy = SCTP_PR_SCTP_TTL;
      spa.sendv_prinfo.pr_value = params.max_rtx_ms;
    }
  }

So max_rtx_ms == 0 is supposed to be mapped to max_rtx_count == 0, which is not the behavior I observe. Worth debugging on our side.

So I read this as:

Behavior for max_rtx_count is WAI, and will not change
Behavior for max_rtx_ms should and will be changed to drop packets that are > x ms old (measured from the time of the send() call) even when they have never been transmitted before.

Am I reading this right?

tuexen commented 3 years ago

thanks - that's more or less what I had intuited!

Can you cite spec chapter and verse on retransmissions only being counted when there's actually been a transmission? We're going to be asked about whether we can do a different behavior, and it's good to be able to cite a reference.

Limited Retransmissions Policy is the text specifying the PR-SCTP policy related to maxRetransmits. It clearly only talks about retransmissions, not transmissions. Is that the text you are looking for?

The special treatment of 0 seems like something that we can either warn the users about or map 0 -> 1 when the user specifies 0. The API used is here:

  struct sctp_sendv_spa spa = {0};
...
  // Ordered implies reliable.
  if (!params.ordered) {
    spa.sendv_sndinfo.snd_flags |= SCTP_UNORDERED;
    if (params.max_rtx_count >= 0 || params.max_rtx_ms == 0) {
      spa.sendv_flags |= SCTP_SEND_PRINFO_VALID;
      spa.sendv_prinfo.pr_policy = SCTP_PR_SCTP_RTX;
      spa.sendv_prinfo.pr_value = params.max_rtx_count;
    } else {
      spa.sendv_flags |= SCTP_SEND_PRINFO_VALID;
      spa.sendv_prinfo.pr_policy = SCTP_PR_SCTP_TTL;
      spa.sendv_prinfo.pr_value = params.max_rtx_ms;
    }
  }

I don't understand the "Ordered implies reliable" comment. You can have an ordered and unreliable data channel, at least this is my understanding for reading the W3C specification and this is also what SCTP would support.

Until the usrsctp library is improved, you might what to map 0 -> 1 if the user specifies 0.

The code you cite reads strange. I would expect something like:

spa.sendv_sndinfo.snd_flags = 0;
if (!params.ordered) {
    spa.sendv_sndinfo.snd_flags |= SCTP_UNORDERED;
}
/* At most one of params.max_rtx_count and params.max_rtx_ms can be non-negative. */
if (params.max_rtx_count >= 0) {
      spa.sendv_flags |= SCTP_SEND_PRINFO_VALID;
      spa.sendv_prinfo.pr_policy = SCTP_PR_SCTP_RTX;
      spa.sendv_prinfo.pr_value = params.max_rtx_count;
}
if (params.max_rtx_ms >= 0) {
    /* Temporary work around a bug in usrsctp interpreting 0 a reliable. */
    if (params.max_rtx_ms == 0) {
        params.max_rtx_ms = 1;
    }
    spa.sendv_flags |= SCTP_SEND_PRINFO_VALID;
    spa.sendv_prinfo.pr_policy = SCTP_PR_SCTP_TTL;
    spa.sendv_prinfo.pr_value = params.max_rtx_ms;
}

So max_rtx_ms == 0 is supposed to be mapped to max_rtx_count == 0, which is not the behavior I observe. Worth debugging on our side.

So I read this as:

Behavior for max_rtx_count is WAI, and will not change

Correct.

Behavior for max_rtx_ms should and will be changed to drop packets that are > x ms old (measured from the time of the send() call) even when they have never been transmitted before.

Correct.

Am I reading this right?

Yes. Please have a look at the code you cited...

alvestrand commented 3 years ago

I filed https://crbug.com/webrtc/12730 on the "ordered" issue. We'll fix that on our side.

tuexen commented 3 years ago

I filed https://crbug.com/webrtc/12730 on the "ordered" issue. We'll fix that on our side.

Thanks. But please take also a look at

    if (params.max_rtx_count >= 0 || params.max_rtx_ms == 0) {
      spa.sendv_flags |= SCTP_SEND_PRINFO_VALID;
      spa.sendv_prinfo.pr_policy = SCTP_PR_SCTP_RTX;
      spa.sendv_prinfo.pr_value = params.max_rtx_count;
    } else {
      spa.sendv_flags |= SCTP_SEND_PRINFO_VALID;
      spa.sendv_prinfo.pr_policy = SCTP_PR_SCTP_TTL;
      spa.sendv_prinfo.pr_value = params.max_rtx_ms;
    }

If params.max_rtx_count is negative and params.max_rtx_ms is zero, the code selects the limited retransmissions policy and sets spa.sendv_prinfo.pr_value to a large value, since pr_value has uint32_t as its type and negative numbers are large when considered as unsigned numbers.

Please note that even if usrsctp would interpret maxPacketLifetime of zero as you (and I) would expect, the code above transforms it into a lot of retransmissions, which actually would mean reliable transport. So you are not even hitting the code in usrsctp I was talking about. Or am I missing something?

alvestrand commented 3 years ago

Good catch! The source type is "unsigned int" a few abstraction layers above, with "-1" being used for "not set". So if max_rtx_count is not set, and max_rtx_ms is set to zero .... reliable (sort of)!

Will fix this code.

tuexen commented 3 years ago

Thanks. Let me double check if the SCTP stack actually handles pr_value = 0 for pr_policy = SCTP_PR_SCTP_TTL. Maybe I fixed it already sometime ago. A quick look at the code seems to indicate that. Then the temporary workaround:

    /* Temporary work around a bug in usrsctp interpreting 0 a reliable. */
    if (params.max_rtx_ms == 0) {
        params.max_rtx_ms = 1;
    }

would not be necessary. Will report here...

tuexen commented 3 years ago

OK, I did some testing. The following packetdrill script runs successfully:

 0.000 `sysctl -i net.inet.sctp.ecn_enable=0`
+0.000 `sysctl -i net.inet.sctp.pr_enable=1`
+0.000 `sysctl -i net.inet.sctp.asconf_enable=0`
+0.000 `sysctl -i net.inet.sctp.pktdrop_enable=0`
+0.000 `sysctl -i net.inet.sctp.reconfig_enable=0`
+0.000 `sysctl -i net.inet.sctp.auth_enable=0`
+0.000 `sysctl -i net.inet.sctp.nrsack_enable=0`
+0.000 socket(..., SOCK_STREAM, IPPROTO_SCTP) = 3
+0.000 fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
+0.000 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
+0.000 bind(3, ..., ...) = 0
+0.000 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
+0.000 > sctp: INIT[flgs=0, tag=1, a_rwnd=..., os=..., is=..., tsn=1, FORWARD_TSN_SUPPORTED[], 
                                                                      SUPPORTED_EXTENSIONS[types=[FORWARD_TSN]],
                                                                      SUPPORTED_ADDRESS_TYPES[types=[IPv4]]]
+0.010 < sctp: INIT_ACK[flgs=0, tag=2, a_rwnd=1500, os=16, is=16, tsn=1, STATE_COOKIE[len=4, val=...],
                                                                         FORWARD_TSN_SUPPORTED[]]
+0.000 > sctp: COOKIE_ECHO[flgs=0, len=4, val=...]
+0.010 < sctp: COOKIE_ACK[flgs=0]
+0.000 sctp_sendv(3, [{..., 1000}], 1, ..., 1, {sendv_flags=SCTP_SEND_SNDINFO_VALID|SCTP_SEND_PRINFO_VALID,
                                                sendv_sndinfo={snd_sid=0,
                                                               snd_flags=SCTP_UNORDERED,
                                                               snd_ppid=htonl(1234),
                                                               snd_context=0,
                                                               snd_assoc_id=0},
                                                sendv_prinfo={pr_policy=SCTP_PR_SCTP_TTL,
                                                              pr_value=0},
                                                sendv_authinfo={auth_keynumber=0}},
                                               32, SCTP_SENDV_SPA, 0) = 1000
+0.000 > sctp: DATA[flgs=UBE, len=1016, tsn=1, sid=0, ssn=0, ppid=1234]
*      > sctp: FORWARD_TSN[cum_tsn=1, ids=[]]
*      > sctp: FORWARD_TSN[cum_tsn=1, ids=[]]
+0.100 < sctp: SACK[flgs=0, cum_tsn=1, a_rwnd=10000, gaps=[], dups=[]]
+1.000 close(3) = 0
+0.000 > sctp: SHUTDOWN[flgs=0, cum_tsn=0]
+0.010 < sctp: SHUTDOWN_ACK[flgs=0]
+0.000 > sctp: SHUTDOWN_COMPLETE[flgs=0]

This means that using the parameters pr_value = 0 for pr_policy = SCTP_PR_SCTP_TTL is handled correctly in the sense, that the PR-SCTP policy is applied. Therefore:

The temporary workaround is not required.
The bug reported above that using pr_value = 0 for pr_policy = SCTP_PR_SCTP_TTL resulting in reliable transmission is caused by the bug described above in the code running as an upper layer of SCTP and not by a bug in the SCTP implementation.

Sorry for the confusion on my side.

sctplab / usrsctp

Unordered, unreliable mode: Surprising behavior of packet dropping #584