gloinul / draft-westerlund-tsvwg-dtls-over-sctp-bis

DTLS/SCTP interaction with SCTP shutdown needs to be clarified #87

Closed gloinul closed 2 years ago

gloinul commented 2 years ago

Daria Ivanova noted that the procedures around shutdown are unclear. This needs to be clarified, and the implementation implications are also relevant. A worst-case time that might be acceptable for rekeying with parallel DTLS connections is not acceptable during graceful shutdown, as the shutdown would then take 10 minutes.

The application initiates a shutdown request to the DTLS/SCTP layer.

  1. DTLS/SCTP flushes its buffer so that any user data buffered in the DTLS/SCTP layer is protected and sent to SCTP.
  2. SCTP sends the data and awaits SACKs from the peer to confirm that everything has been delivered. The DTLS/SCTP layer can then issue the DTLS close_notify.
  3. The DTLS/SCTP stack sends close_notify.
  4. SCTP sends the DTLS close_notify to the peer. SCTP shutdown can now be initiated to move the SCTP stack into the SHUTDOWN-PENDING state.
  5. When the peer receives the DTLS close_notify on the last DTLS connection, that is its signal to start the shutdown procedure.
  6. The peer then performs the equivalent of steps 2-4.
  7. The SCTP association is shut down.
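
A minimal sketch of the gating in steps 1-3 on the initiating side, as a toy Python model. The class, its fields, and the per-message "SACK" callback are all hypothetical simplifications for illustration; a real SCTP stack acknowledges TSNs, and nothing here reflects an actual DTLS or SCTP API.

```python
from dataclasses import dataclass, field


@dataclass
class InitiatorShutdown:
    """Toy model of the initiating endpoint; every name here is hypothetical."""
    buffered: list = field(default_factory=list)  # user data still held by DTLS/SCTP
    unacked: set = field(default_factory=set)     # handed to SCTP, not yet SACKed
    shutdown_requested: bool = False
    close_notify_sent: bool = False
    log: list = field(default_factory=list)

    def request_shutdown(self):
        # Step 1: protect all buffered user data and hand it to SCTP.
        self.shutdown_requested = True
        for msg in self.buffered:
            self.unacked.add(msg)
            self.log.append(f"send {msg}")
        self.buffered.clear()
        self._maybe_close()

    def on_sack(self, msg):
        # Step 2: a SACK from the peer confirms delivery of one message.
        self.unacked.discard(msg)
        self._maybe_close()

    def _maybe_close(self):
        # Step 3: close_notify is sent only after shutdown was requested
        # and no user data is left buffered or unacknowledged.
        if (self.shutdown_requested and not self.buffered
                and not self.unacked and not self.close_notify_sent):
            self.close_notify_sent = True
            self.log.append("send close_notify")
```

In this model, a lost packet simply delays `on_sack`, which in turn delays the close_notify; that is the asynchronous behavior that makes the fastest possible controlled shutdown hard to achieve.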

So this needs to be clarified in the document.

teiclap commented 2 years ago

Do we mean that peer A initiates shutdown, it flushes data, closes DTLS and then waits for peer B to do the same and at the end it will be peer B doing the same and finally asking for SCTP shutdown?

gloinul commented 2 years ago

> Do we mean that peer A initiates shutdown, it flushes data, closes DTLS and then waits for peer B to do the same and at the end it will be peer B doing the same and finally asking for SCTP shutdown?

Yes. For a controlled shutdown, the DTLS stack must have received an indication that all its data has been delivered, i.e. SCTP SACKs must have been received at A to verify delivery of all user messages before DTLS can be closed. Only when that has been achieved and the DTLS close_notify has been sent can SHUTDOWN-PENDING be entered. There is some asynchronous behavior here, and depending on the SCTP API, knowing that the data has been delivered can be challenging if the goal is the fastest possible controlled shutdown.

teiclap commented 2 years ago

I think that since DTLS/SCTP and SCTP are not tied, it would be better for peer A to shut down the SCTP association once it has sent close_notify, as it is the actual initiator of the shutdown procedure.

gloinul commented 2 years ago

> I think that since DTLS/SCTP and SCTP are not tied, it would be better for peer A to shut down the SCTP association once it has sent close_notify, as it is the actual initiator of the shutdown procedure.

Yes. I fail to see what in my description does not indicate that.

For peer A's DTLS/SCTP layer to send its close_notify, it needs to know, at a minimum, that all data has been sent by SCTP in peer A. I also think that a safe controlled close requires knowing that peer B has actually received the data. Otherwise, peer B's stack could be closed before it can process data that was delayed by packet loss on the SCTP association and arrives after the close_notify.

emanjon commented 2 years ago

I opened the following issue on the DTLS 1.3 draft:

https://github.com/tlswg/dtls13-spec/issues/268

gloinul commented 2 years ago

An error in the above is that the shutdown initiator needs to wait for the DTLS close_notify from the peer before it can enter SHUTDOWN-PENDING. Otherwise, the peer's end of the SCTP association will be blocked from sending before it has had a chance to flush its DTLS/SCTP layer data to SCTP.

@emanjon also realized that the DTLS 1.3 specification is missing text stating that out-of-order delivery, such as between streams, does not work well with the reception of the DTLS close_notify. So from the DTLS/SCTP perspective, what should happen is that the DTLS stack processes the DTLS close_notify and tells DTLS/SCTP of its reception. DTLS/SCTP then stops accepting more data from the ULP, flushes any pending data, and sends its own close_notify. However, the DTLS stack should not be closed down until the SCTP shutdown has completed; only then can the keys for the DTLS session be flushed.
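
The corrected ordering can be sketched as a toy two-endpoint model. All class, method, and field names are hypothetical, message delivery is modeled as a direct method call, and the SACK wait before each close_notify is deliberately left out to keep the ordering visible.

```python
class Endpoint:
    """Toy endpoint; states and method names are hypothetical."""

    def __init__(self, name):
        self.name = name
        self.accepting_ulp_data = True
        self.sent_close_notify = False
        self.got_close_notify = False
        self.sctp_shutdown_started = False
        self.dtls_keys_present = True
        self.pending = []  # user data not yet handed to SCTP
        self.peer = None
        self.trace = []

    def initiate_shutdown(self):
        # Initiator: stop taking ULP data, flush, then send close_notify.
        self.accepting_ulp_data = False
        self._flush_and_close()

    def _flush_and_close(self):
        self.trace.extend(f"data:{m}" for m in self.pending)
        self.pending.clear()
        self.sent_close_notify = True
        self.trace.append("close_notify")
        self.peer.on_close_notify()

    def on_close_notify(self):
        self.got_close_notify = True
        self.accepting_ulp_data = False
        if not self.sent_close_notify:
            # Responder: flush its own data, then answer with close_notify.
            self._flush_and_close()
        else:
            # Initiator: only now may SCTP enter SHUTDOWN-PENDING.
            self.sctp_shutdown_started = True
            self.trace.append("sctp_shutdown")

    def on_sctp_shutdown_complete(self):
        # Keys are discarded only after the association is gone.
        self.dtls_keys_present = False
```

With endpoints `a` and `b` wired as each other's `peer`, calling `a.initiate_shutdown()` lets `b` flush its pending data and send its own close_notify before `a` starts the SCTP shutdown, and `a` keeps its DTLS keys until `on_sctp_shutdown_complete()` is called.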

gloinul commented 2 years ago

@tuexen I think your question in the https://github.com/tlswg/dtls13-spec/issues/268 discussion may be better answered here: "Just wondering... On the sender side you already have reference counting for keys to implement SCTP_AUTH_FREE_KEY in the SCTP_AUTHENTICATION_EVENT. So wouldn't it be a simple extension to add a SCTP_AUTH_DRY_KEY way?"

I don't see how this helps the implementation in any way. The DTLS/SCTP layer is the one owning the DTLS connections. It will know when these have been closed and thus know when it can do SCTP shutdown.

I may be misunderstanding what you propose here, so could you elaborate?

gloinul commented 2 years ago

Daria pointed out an issue that needs clarifying when the endpoints have two DTLS connections. There are two cases:

  1. Switchover to the newer DTLS connection has occurred, and the older one is only being drained.

  2. A second DTLS connection is being established, and the switchover has not yet happened.

In case 1, an endpoint that receives a close_notify on the older DTLS connection does not yet know whether this is a shutdown procedure or simply the removal of the older parallel DTLS connection. Thus, it must for now assume that only the older connection is being closed. Only when a close_notify is also received on the newer connection does the endpoint enter the shutdown procedure. If the close_notify arrives on the newer connection but not yet on the older one, I would still not enter shutdown on the remote side, although currently it looks like the "local" side has no way of sending additional data. The close_notify on the older DTLS connection must be on its way.

For case 2, I think we need to clarify whether one can abort the DTLS connection establishment or must complete the establishment and then immediately close the connection.
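
For case 1, the rule described above can be sketched as a small decision function. The function name and the connection identifiers are hypothetical; the only point is that shutdown is entered once a close_notify has been received on every open DTLS connection, regardless of arrival order.

```python
def on_close_notify(open_connections, closed_id):
    """Return (remaining_connections, enter_shutdown).

    Hypothetical helper: a close_notify on one connection removes only
    that connection; shutdown starts when no connection remains open.
    """
    remaining = set(open_connections) - {closed_id}
    return remaining, len(remaining) == 0


# close_notify on the older connection alone: treat it as draining only.
remaining, shutdown = on_close_notify({"old", "new"}, "old")
# A later close_notify on the newer connection triggers shutdown.
remaining, shutdown = on_close_notify(remaining, "new")
```

The same function gives the behavior proposed for the reverse order: a close_notify on the newer connection first leaves the older one open, so shutdown is not yet entered.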

gloinul commented 2 years ago

The PR for this issue is #94.

tuexen commented 2 years ago

> @tuexen I think your question in the tlswg/dtls13-spec#268 discussion may be better answered here: "Just wondering... On the sender side you already have reference counting for keys to implement SCTP_AUTH_FREE_KEY in the SCTP_AUTHENTICATION_EVENT. So wouldn't it be a simple extension to add a SCTP_AUTH_DRY_KEY way?"
>
> I don't see how this helps the implementation in any way. The DTLS/SCTP layer is the one owning the DTLS connections. It will know when these have been closed and thus know when it can do SCTP shutdown.
>
> I may be misunderstanding what you propose here, so could you elaborate?

I was thinking about when to send the CloseNotify message. If you are trying to shut down the last DTLS connection on an SCTP association, you could use the sender dry event to make sure all other user messages have been received by the peer. But if you want to shut down one DTLS connection while other DTLS connections are still active (and you don't want to affect them), you would need an indication that all user messages belonging to the DTLS connection you are trying to shut down have been received by the peer. For that, you could use a (to be defined) key dry event.

I want to delay the sending of the CloseNotify message until all messages belonging to the DTLS connection about to be shut down have been received by the peer.

Does that make things clearer? Or am I misunderstanding something?

gloinul commented 2 years ago

Thanks @tuexen, I now understand. I am not used to thinking in terms of this API. But to paraphrase: one way is to define an event that fires once all user messages with a specific SCTP-AUTH key ID have been sent.

tuexen commented 2 years ago

Yes. To be more precise: an event that fires when no user message using this key ID remains in the SCTP stack.

It is not about "has been sent" but more "has been acked in a non-renegable way".
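
A rough sketch of that idea as per-key reference counting, written as a toy Python model rather than against any real socket API. The class, its methods, and the callback are hypothetical; the key dry event itself is still "to be defined", so this only illustrates the intended semantics.

```python
from collections import Counter


class KeyDryTracker:
    """Counts user messages per SCTP-AUTH key ID still held by the stack."""

    def __init__(self, on_dry):
        self._in_stack = Counter()
        self._on_dry = on_dry  # hypothetical stand-in for the key dry event

    def queued(self, key_id):
        # A user message using this key ID entered the SCTP stack.
        self._in_stack[key_id] += 1

    def acked_non_renegable(self, key_id):
        # The peer acknowledged one message in a way it cannot take back.
        if self._in_stack[key_id] == 0:
            raise ValueError(f"no outstanding message for key {key_id}")
        self._in_stack[key_id] -= 1
        if self._in_stack[key_id] == 0:
            # Nothing using this key remains: the key is "dry".
            self._on_dry(key_id)
```

A DTLS/SCTP layer built on such an event would send the CloseNotify for a given DTLS connection from the `on_dry` callback for that connection's key ID, leaving the other connections unaffected.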

If you are willing to include this in the document, I'm willing to provide text, as long as it will not be covered by IPR owned by Ericsson. This would allow me to implement it in the FreeBSD stack.

gloinul commented 2 years ago

PR #94 has been updated to address the case of an ongoing DTLS handshake.