obgm / libcoap

A CoAP (RFC 7252) implementation in C

OSCORE response for CON request is always Non-piggybacked #1494

Open magdalenaszumny opened 2 months ago

magdalenaszumny commented 2 months ago

OSCORE responses to Confirmable requests are always sent as Separate responses (an Empty ACK first, then the response with data). This generates additional traffic on the network.

This is caused by the following code in coap_oscore.c (lines 670 to 688):

  /*
   * If this is a response ACK with data, make it a separate response
   * by sending an Empty ACK and changing osc_pdu's MID and type.  This
   * then allows lost response ACK (now CON) with data to be recovered.
   */
  if (coap_request == 0 && osc_pdu->type == COAP_MESSAGE_ACK &&
      COAP_RESPONSE_CLASS(pdu->code) == 2 &&
      COAP_PROTO_NOT_RELIABLE(session->proto)) {
    coap_pdu_t *empty = coap_pdu_init(COAP_MESSAGE_ACK,
                                      0,
                                      osc_pdu->mid,
                                      0);
    if (empty) {
      if (coap_send_internal(session, empty) != COAP_INVALID_MID) {
        osc_pdu->mid = coap_new_message_id_lkd(session);
        osc_pdu->type = COAP_MESSAGE_CON;
      }
    }
  }

There is a comment explaining why it is implemented this way, but maybe a better solution would be to make it optional? What is your opinion?

mrdeep1 commented 2 months ago

This is an interesting one.

If, when using a piggybacked response, the OSCORE response gets "lost" in transmission, the client re-transmits the CON request because it has not seen the returning ACK. OSCORE treats this re-transmission as a duplicate request and ignores it (or optionally sends back a 4.01 Unauthorized response). See RFC 8613, Section 7.4, Replay Protection. So the client either eventually times out the request or sees the 4.01 response. At that point it is unclear whether the server has actually processed the request.

Hence the use of a Separate response at this point - which does add 2 empty ACK packets to the round trip time. It could be made optional, but then things will fail if any packet is dropped.

If the server does respond with a 4.01 (which is optional for the server OSCORE logic), then the client could assume that the server has received and processed the initial request and move on to the next request, but that is not guaranteed. I don't like that solution.

magdalenaszumny commented 2 months ago

Thank you for the quick feedback!

There may be situations where overall communication time matters and somebody would like faster communication on the happy path. In that case we can accept the risk of getting a 4.01 when the server actually received the request but the response was lost. We would then have to send the request again (with a new SSN) or move on to the next request. In happy path scenario we would save time we need now for sending additional 2 ACKs.

Could I also ask a question about the libcoap client? What does libcoap do in case ACK was received, but RSP is lost? In that case we know that a retransmission would trigger replay detection, so we should not retransmit the request. Will the libcoap client resend the request with a new SSN? Or somehow inform the application?

mrdeep1 commented 2 months ago

What does libcoap do in case ACK was received, but RSP is lost?

Not sure what you mean by RSP. If we are using Separate responses, then the sequence is

-> CON Request
<- Empty ACK   (A)
<- CON response
-> Empty ACK  (B)

If ACK (A) is dropped, then when the client receives the CON Response, the Request/Response Tokens match, so it is assumed that ACK (A) went missing and things continue.

If both ACK (A) and CON Response are sent and go missing, then based on the random retry timeout rules, either the Client or Server (libcoap library code) will retransmit their CON packet. If the Client re-transmits first, then we again have the duplicate packet issue. Hmm.

If ACK (B) goes missing, CON Response is re-transmitted (up to 4 times over a 93 second period) at which point the server gives up.

In happy path scenario we would save time we need now for sending additional 2 ACKs.

I will have a further think about this.

kkrentz commented 2 months ago

In order to avoid the separate response, you can add this to the server's request handler:

coap_pdu_set_code(response, COAP_RESPONSE_CODE_CONTENT);

As discussed here, retransmitted requests should actually be answered from the cache. However, this is impractical and insecure on low-power devices, which was one motivation behind our OSCORE-NG. The source code is there.

mrdeep1 commented 2 months ago

@kkrentz Thanks for this feedback. In the short term, I will look at caching the response until the next sequence number comes in, or timing it out after a suitable period.
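One way such a cache could look is sketched below: a single slot that holds the last protected response, handed back only for a re-transmission of the same request, and dropped when a newer sequence number arrives or the entry expires. All names, the one-slot design, and the fixed buffer size are assumptions for illustration; this is not how libcoap implements it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_MAX_LEN 128u

struct cached_response {
  bool     valid;
  uint64_t request_seq;         /* sequence number of the answered request */
  uint64_t expires_at;          /* absolute time, in seconds */
  size_t   len;
  uint8_t  data[CACHE_MAX_LEN]; /* protected response, ready to resend */
};

/* Stores a response for request seq; returns false if it does not fit. */
static bool
cache_store(struct cached_response *c, uint64_t seq, const uint8_t *data,
            size_t len, uint64_t now, uint64_t lifetime) {
  assert(c != NULL && data != NULL);
  if (len > CACHE_MAX_LEN)
    return false;
  memcpy(c->data, data, len);
  c->len = len;
  c->request_seq = seq;
  c->expires_at = now + lifetime;
  c->valid = true;
  return true;
}

/*
 * Returns the entry if seq is a re-transmission of the cached request.
 * A newer sequence number or an expired entry invalidates the slot.
 */
static const struct cached_response *
cache_lookup(struct cached_response *c, uint64_t seq, uint64_t now) {
  assert(c != NULL);
  if (!c->valid)
    return NULL;
  if (now >= c->expires_at || seq > c->request_seq) {
    c->valid = false;
    return NULL;
  }
  return (seq == c->request_seq) ? c : NULL;
}
```

A server would presumably keep one such slot per peer or security context, which is part of the memory cost on low-power devices mentioned above.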

As an FYI for your OSCORE-NG code, with the introduction of #1488, the COAP_ #defines in cmake_coap_config.h.in have been moved to cmake_coap_defines.h.in.