how to implement an async client?

Grabber commented 3 years ago

First of all, thank you so much for the CoAP protocol and implementation. I knew nothing about CoAP until 3 days ago, so I'm still reading the code base and experimenting with it.

I'm trying to implement an async client in C and use go-coap as the base for the server side. The basic client requirements are:

send a POST request containing some data (raw JSON or msgpack encoded) to the server without blocking;
receive the response from server and if successful delete the sent data from a local persistent layer - the local data will be deleted only when the server side has persisted it to another persistent layer;
keep retrying until data is delivered;
it must be resilient to any connectivity disconnection or server unavailability (auto-retry connectivity);
run CoAP over TCP, since most of the 3/4G providers are nowadays blocking UDP connections from the client to somewhere (at least in Brazil);
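
The "delete only after a successful response" requirement above can be sketched as a small state transition. This is a minimal sketch in plain C, not libcoap code: the names rec_state_t, on_response() and response_is_success() are hypothetical and exist only for illustration. The one protocol fact it relies on is that a CoAP code is a single byte with a 3-bit class and a 5-bit detail (RFC 7252, section 3), where class 2 means success.

```c
#include <stdint.h>

/* A CoAP code is one byte: 3-bit class, 5-bit detail (RFC 7252, section 3).
 * Example: 2.04 Changed is COAP_CODE(2, 4) == 0x44. */
#define COAP_CODE(cls, detail) ((uint8_t)(((cls) << 5) | (detail)))

/* Class 2 (2.xx) means the server reported success. */
int response_is_success(uint8_t code) {
    return (code >> 5) == 2;
}

/* Hypothetical life cycle of one locally persisted record. */
typedef enum { REC_STORED, REC_IN_FLIGHT, REC_DELETED } rec_state_t;

/* Decide what happens to a record when its response arrives.
 * Only a 2.xx response deletes it; anything else keeps it for a retry. */
rec_state_t on_response(rec_state_t state, uint8_t code) {
    if (state != REC_IN_FLIGHT) {
        return state;              /* not waiting for a response: ignore */
    }
    return response_is_success(code) ? REC_DELETED : REC_STORED;
}
```

A record that receives, say, 5.00 goes back to REC_STORED and can be sent again later, which covers the "keep retrying until data is delivered" item at the application level.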

Is it possible to have two threads, one to send and the other to receive the responses at the client side?

I tried to call coap_send() consecutively, but it went wrong. Can I use the same context and session and call coap_send() infinite times? Do I need to create a context/session per coap_send() call?

How can I monitor the UDP/TCP socket states to detect disconnections and be able to force a reconnection? Does it work like ZeroMQ or NNG where you don't have to care about reconnections directly (yes, sometimes you must force-reconnect, but reconnections are usually automatic)?

As soon as my implementation advances and works, I plan to publish it as an example for the great libcoap!

Thank you so much for supporting!

mrdeep1 commented 3 years ago

send a POST request containing some data (raw JSON or msgpack encoded) to the server without blocking;

By without blocking, are you referring to NON Confirmable or CONfirmable PDU types? However, if you are planning on TCP, then this question is not relevant as they are not used for TCP.

receive the response from server and if successful delete the sent data from a local persistent layer - the local data will be deleted only when the server side has persisted it to another persistent layer;

OK.

keep retrying until data is delivered;

No need to do this using TCP as data is guaranteed to arrive unless the TCP layer session fails.

it must be resilient to any connectivity disconnection or server unavailability (auto-retry connectivity);

libcoap supports events through the event handler that indicates to the application that there are issues with the connection and that there has been a disconnection of some sort.

run CoAP over TCP, since most of the 3/4G providers are nowadays blocking UDP connections from the client to somewhere (at least in Brazil);

See previous comments.

Is it possible to have two threads, one to send and the other to receive the responses at the client side?

Multi-threading support is not available - a future TODO with no timescales.

I tried to call coap_send() consecutively, but it went wrong.

OK

Can I use the same context and session and call coap_send() infinite times?

Yes - see outstanding PR #701 for how to do this. You need to call coap_io_process() so that the responses are handled.

Do I need to create a context/session per coap_send() call?

No. Use the same session when talking to the same host. See #701.

How can I monitor the UDP/TCP socket states to detect disconnections and be able to force a reconnection? Does it work like ZeroMQ or NNG where you don't have to care about reconnections directly (yes, sometimes you must force-reconnect, but reconnections are usually automatic)?

As mentioned previously, you need to set up the event handler and monitor the disconnect type events - see man page coap_handler(3). Then create another client session to continue with the traffic (see coap_endpoint_client(3)).

Grabber commented 3 years ago

send a POST request containing some data (raw JSON or msgpack encoded) to the server without blocking;

By without blocking, are you referring to NON Confirmable or CONfirmable PDU types? However, if you are planning on TCP, then this question is not relevant as they are not used for TCP.

By confirmable I currently understand a message that gets a delivery confirmation response, while a non-confirmable one does not. I understand the differences between TCP and UDP, but what is new to me is why confirmable and non-confirmable PDU types are not applicable for TCP. Why? Am I wrong?

By 'without blocking' I mean: call coap_send() and don't block, but still be able to check later whether the message was sent.

receive the response from server and if successful delete the sent data from a local persistent layer - the local data will be deleted only when the server side has persisted it to another persistent layer;

OK.

keep retrying until data is delivered;

No need to do this using TCP as data is guaranteed to arrive unless the TCP layer session fails.

If I call coap_send() with a confirmable message, is it 100% certain that it will be delivered? Like QoS 1/2 in MQTT?

it must be resilient to any connectivity disconnection or server unavailability (auto-retry connectivity);

libcoap supports events through the event handler that indicates to the application that there are issues with the connection and that there has been a disconnection of some sort.

Got it.

run CoAP over TCP, since most of the 3/4G providers are nowadays blocking UDP connections from the client to somewhere (at least in Brazil);

See previous comments.

Got it.

Is it possible to have two threads, one to send and the other to receive the responses at the client side?

Multi-threading support is not available - a future TODO with no timescales.

Ok.

I tried to call coap_send() consecutively, but it went wrong.

OK

I tried the modified coap-client with the -G option to call coap_io_process(). I don't think it is the most efficient thing to do because I would have to split my data events into chunks... for example, if I have 1000 events to send, I would split them into 10 batches of 100 events each... coap_send() 100 times, coap_io_process(), coap_send() 100 more times... until the end. Is that the concept?

Can I use the same context and session and call coap_send() infinite times?

Yes - see outstanding PR #701 for how to do this. You need to call coap_io_process() so that the responses are handled.

Great.

Do I need to create a context/session per coap_send() call?

No. Use the same session when talking to the same host. See #701.

Ok.

How can I monitor the UDP/TCP socket states to detect disconnections and be able to force a reconnection? Does it work like ZeroMQ or NNG where you don't have to care about reconnections directly (yes, sometimes you must force-reconnect, but reconnections are usually automatic)?

As mentioned previously, you need to set up the event handler and monitor the disconnect type events - see man page coap_handler(3). Then create another client session to continue with the traffic (see coap_endpoint_client(3)).

I think a good advancement for libcoap, or the CoAP community in general, would be to work on making it more user friendly. If you compare the basic libmosquitto examples with libcoap's, libcoap is much more complex, and I felt that more examples are missing. Maybe I could help to improve it, while I'm learning about the protocol and getting more comfortable with the code base ;)

obgm commented 3 years ago

Thanks for using libcoap. The library was intentionally designed from the start to give application developers access to the lower-layer CoAP protocol. If you want to control certain behavior, it is therefore crucial to have a working knowledge of CoAP and the underlying transport protocols. For more basic usage, several example applications such as coap-client.c or libcoap-minimal exist.

Regarding "QoS": CoAP over TCP provides the same quality of service as TCP: if TCP fails, CoAP over TCP fails as well. CoAP cannot give a guarantee such as MQTT's QoS 1 because you need to have a working layer 4 transport connection with the remote peer (you could use a pub/sub service akin to MQTT if you want to increase the probability that a payload arrives at a peer even if it is not reachable at the time of sending).

The bottom line is: You can use CoAP (e.g. with libcoap) pretty much for everything that MQTT does. However, I prefer CoAP for what CoAP does. If I wanted MQTT, I would use MQTT.

mrdeep1 commented 3 years ago

I understand the differences between TCP and UDP, but what is new to me is why confirmable and non-confirmable PDU types are not applicable for TCP. Why? Am I wrong?

As per https://datatracker.ietf.org/doc/html/rfc8323#section-3.1 (CoAP over TCP)

As a result, both the Type and Message ID fields are no longer required and are
removed from the message format for CoAP over TCP.

So CON / NON / ACK / RST Types are not used because TCP is a reliable protocol.
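
To make the difference concrete, here is a minimal sketch in plain C (not libcoap code) that decodes the fixed 4-byte header used over UDP (RFC 7252, section 3) next to the first byte of a CoAP-over-TCP frame (RFC 8323, section 3.2). The struct and function names are hypothetical and only serve the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed header of CoAP over UDP (RFC 7252, section 3). */
typedef struct {
    uint8_t  version;     /* 2 bits, always 1 */
    uint8_t  type;        /* 2 bits: 0=CON, 1=NON, 2=ACK, 3=RST */
    uint8_t  token_len;   /* 4 bits */
    uint8_t  code;        /* 8 bits, e.g. 0x02 = POST */
    uint16_t message_id;  /* 16 bits, big-endian on the wire */
} udp_header_t;

/* First byte of a CoAP-over-TCP frame (RFC 8323, section 3.2).
 * There is no Type and no Message ID; a length nibble takes their place. */
typedef struct {
    uint8_t len;        /* 4 bits: 0-12 literal, 13-15 select an extended length */
    uint8_t token_len;  /* 4 bits */
} tcp_first_byte_t;

/* Decode the 4-byte UDP header; returns 0 on success, -1 if buf is too short. */
int parse_udp_header(const uint8_t *buf, size_t n, udp_header_t *out) {
    if (n < 4) {
        return -1;
    }
    out->version    = buf[0] >> 6;
    out->type       = (buf[0] >> 4) & 0x03;
    out->token_len  = buf[0] & 0x0f;
    out->code       = buf[1];
    out->message_id = (uint16_t)((buf[2] << 8) | buf[3]);
    return 0;
}

/* Split the first TCP frame byte into its two nibbles. */
tcp_first_byte_t parse_tcp_first_byte(uint8_t b) {
    tcp_first_byte_t f = { (uint8_t)(b >> 4), (uint8_t)(b & 0x0f) };
    return f;
}
```

For example, the UDP bytes 0x44 0x02 0x12 0x34 decode to version 1, type CON, a 4-byte token, code POST and message ID 0x1234, while a TCP frame has no field from which a CON or NON type could be read at all.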

By 'without blocking' I mean, call coap_send() and don't block.

If the PDU cannot immediately be sent, it will get queued for sending (subject to RAM limitations). Note that when using CON, only 1 CON can be in flight at a time, as controlled by NSTART (default of 1) - see https://datatracker.ietf.org/doc/html/rfc7252#section-4.7 . If there is a queuing or other failure, then COAP_INVALID_MID is returned.

If I call coap_send() with a confirmable message, is it 100% certain that it will be delivered?

No. Depending on the Congestion Control parameters it will get resent multiple times until a failure event is generated.
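
As a rough guide to how long "multiple times" can take, the worst-case figures can be derived from the default transmission parameters in RFC 7252, section 4.8. The sketch below is plain C using those RFC defaults; the values an actual libcoap session uses may have been changed by the application, so treat the numbers as the defaults only.

```c
/* Default transmission parameters from RFC 7252, section 4.8. */
#define ACK_TIMEOUT_S      2.0
#define ACK_RANDOM_FACTOR  1.5
#define MAX_RETRANSMIT     4

/* Worst-case time from the first transmission of a CON to its last
 * retransmission: ACK_TIMEOUT * (2^MAX_RETRANSMIT - 1) * ACK_RANDOM_FACTOR. */
double max_transmit_span(void) {
    return ACK_TIMEOUT_S * ((1 << MAX_RETRANSMIT) - 1) * ACK_RANDOM_FACTOR;
}

/* Worst-case time from the first transmission until the sender gives up
 * waiting for an ACK: ACK_TIMEOUT * (2^(MAX_RETRANSMIT+1) - 1) * ACK_RANDOM_FACTOR. */
double max_transmit_wait(void) {
    return ACK_TIMEOUT_S * ((1 << (MAX_RETRANSMIT + 1)) - 1) * ACK_RANDOM_FACTOR;
}
```

With the defaults this gives 45 seconds for the span and 93 seconds for the wait, so a failure for an undeliverable CON can take up to about a minute and a half to be reported.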

I have 1000 events to send, I would split into 10 batches of 100 events each... coap_send() 100 times, coap_io_process(), coap_send() 100 more times... until the end. Is that the concept?

You can do this if you want - just remember that with UDP and CON you will be queuing up 99 of the events (coap_send() will send the first one), and each call to coap_io_process() will check whether the previous CON has been acknowledged; only then (as the number of outstanding CONs has dropped below NSTART) will the next queued CON be sent by coap_io_process(). With TCP, you are subject to the TCP stack limitations.
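
The queuing behavior described above can be modeled in a few lines. This is a toy model in plain C, not libcoap internals: con_queue_t, model_send() and model_io_process() are hypothetical names, and the real library also handles timeouts and retransmissions that are left out here.

```c
#include <stddef.h>

#define NSTART 1u  /* RFC 7252 default: one outstanding interaction per peer */

typedef struct {
    size_t queued;    /* accepted by the send call, not yet on the wire */
    size_t inflight;  /* on the wire, waiting for an ACK */
    size_t done;      /* acknowledged */
} con_queue_t;

/* Model of a send call: transmit at once if allowed, otherwise queue.
 * It never blocks. */
void model_send(con_queue_t *q) {
    if (q->inflight < NSTART) {
        q->inflight++;
    } else {
        q->queued++;
    }
}

/* Model of one I/O pass: consume an ACK if one arrived, then release
 * queued messages while fewer than NSTART are in flight. */
void model_io_process(con_queue_t *q, int ack_arrived) {
    if (ack_arrived && q->inflight > 0) {
        q->inflight--;
        q->done++;
    }
    while (q->inflight < NSTART && q->queued > 0) {
        q->queued--;
        q->inflight++;
    }
}
```

Calling model_send() 100 times leaves 1 message in flight and 99 queued, and each acknowledged I/O pass releases exactly one more, so a batch of 100 CONs costs about 100 round trips regardless of how the send calls are grouped.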

Grabber commented 3 years ago

@obgm @mrdeep1

Thank you so much for such a rich and detailed explanation!

About the PDU and queueing:

for (int i = 0; i < 100 && !quit; i++) {
   if (! (pdu = coap_new_request(ctx, session, cmdline_method("post"), &optlist, payload.s, payload.length))) {
      goto finish;
   }

   coap_show_pdu(log_level, pdu);

   coap_log(log_level, "sending CoAP request:\n");
   if (coap_get_log_level() < LOG_DEBUG)
      coap_show_pdu(log_level, pdu);

   coap_send(session, pdu);
   res = coap_io_process(ctx, COAP_IO_NO_WAIT);
   if ( res >= 0 ) {
      fprintf(stdout, "res >= 0\n");
   }

   fprintf(stdout, "coap_io_process.1\n");
}

while(!quit && !coap_can_exit(ctx)) {
   fprintf(stdout, "coap_io_process.2\n");

   res = coap_io_process(ctx, COAP_IO_WAIT);
   if ( res >= 0 ) {
      fprintf(stdout, "res >= 0\n");
   }

   fprintf(stdout, "coap_io_process.3\n");
}

Why am I losing the last (tail) messages in this code snippet, and getting "udp: cannot write response: timeout: retransmission(4) was exhausted" on the server side (coap-server)?

mrdeep1 commented 3 years ago

I think the culprit here is coap_can_exit(). This function needs to be properly documented, but what it does is test whether there is any outstanding data to transmit (it does not check for outstanding data to receive) and, if so, return 0. So in your case, it is causing the coap-client to exit when everything has been transmitted, even though the coap-server has not finished transmitting back.

Removing && !coap_can_exit(ctx) will confirm this (it makes sense to count all the responses in the response handler (?message_handler()?) and then set the quit flag when all is done).

Grabber commented 3 years ago

I think the culprit here is coap_can_exit(). This function needs to be properly documented, but what it does is test whether there is any outstanding data to transmit (it does not check for outstanding data to receive) and, if so, return 0. So in your case, it is causing the coap-client to exit when everything has been transmitted, even though the coap-server has not finished transmitting back.

Removing && !coap_can_exit(ctx) will confirm this (it makes sense to count all the responses in the response handler (?message_handler()?) and then set the quit flag when all is done).

Is there any way to check how many requests have been enqueued? To exit as soon as there are no more responses, maybe I will have to keep track of the sent requests and check how many responses came back.

mrdeep1 commented 3 years ago

Is there any way to check how many requests have been enqueued?

No. You need to keep a tally of what was sent and what has been responded to. (Each request should have a unique token, and the response's token is used to match the appropriate request.)
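
One possible shape for such a tally is a small table of outstanding tokens. The sketch below is plain C with hypothetical names (tally_t, tally_add(), tally_complete()); it assumes the application already has the token bytes of each request and response at hand and leaves out how they are obtained from the PDUs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_OUTSTANDING 128
#define MAX_TOKEN_LEN   8   /* CoAP tokens are 0 to 8 bytes long (RFC 7252) */

typedef struct {
    uint8_t token[MAX_TOKEN_LEN];
    size_t  len;
    int     used;
} pending_t;

typedef struct {
    pending_t slots[MAX_OUTSTANDING];
    size_t    outstanding;   /* requests sent but not yet answered */
} tally_t;

/* Record a request that was just sent. Returns 0 on success, -1 if the
 * token is too long or the table is full. */
int tally_add(tally_t *t, const uint8_t *token, size_t len) {
    if (len > MAX_TOKEN_LEN) {
        return -1;
    }
    for (size_t i = 0; i < MAX_OUTSTANDING; i++) {
        if (!t->slots[i].used) {
            if (len > 0) {
                memcpy(t->slots[i].token, token, len);
            }
            t->slots[i].len = len;
            t->slots[i].used = 1;
            t->outstanding++;
            return 0;
        }
    }
    return -1;
}

/* Match a response token against the outstanding requests. Returns 0 and
 * frees the slot on a match, -1 for an unknown or duplicate response. */
int tally_complete(tally_t *t, const uint8_t *token, size_t len) {
    for (size_t i = 0; i < MAX_OUTSTANDING; i++) {
        pending_t *p = &t->slots[i];
        if (p->used && p->len == len &&
            (len == 0 || memcmp(p->token, token, len) == 0)) {
            p->used = 0;
            t->outstanding--;
            return 0;
        }
    }
    return -1;
}
```

The client can then stop its I/O loop once outstanding reaches zero, and because a second response with the same token returns -1, a duplicate does not skew the count the way a bare counter would.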

Grabber commented 3 years ago

Wait for up to wait_ms (in case any response is lost), polling every 250 ms (adjustable), and exit as soon as req_rep is zero (all sent requests got a response).

static coap_response_t
response_handler(coap_session_t *session COAP_UNUSED,
                const coap_pdu_t *sent,
                const coap_pdu_t *received,
                const coap_mid_t id COAP_UNUSED) {

   fprintf(stdout, "response_handler\n");

   if (received != NULL) {
      fprintf(stdout, "response_handler, pdu, received\n");
      coap_show_pdu(LOG_INFO, received);

      req_rep--;
   }

   return COAP_RESPONSE_OK;
}

for (int i = 0; i < 1000 && !quit; i++) {
   if (! (pdu = coap_new_request(ctx, session, cmdline_method("post"), &optlist, payload.s, payload.length))) {
      goto finish;
   }

   coap_show_pdu(log_level, pdu);

   coap_log(log_level, "sending CoAP request:\n");
   if (coap_get_log_level() < LOG_DEBUG)
      coap_show_pdu(log_level, pdu);

   coap_mid_t sent = coap_send(session, pdu);
   if (sent != COAP_INVALID_MID) {
      req_rep++;
   }

   res = coap_io_process(ctx, COAP_IO_NO_WAIT);
   if (res >= 0) {
      fprintf(stdout, "res >= 0\n");
   }
}

unsigned int wait_seconds = 5;
unsigned int wait_ms = wait_seconds * 1000;

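do {
   /* coap_io_process() returns the time spent in ms, or -1 on error */
   res = coap_io_process(ctx, 250);

   if (res >= 0) {
      if (wait_ms > 0) {
         if ((unsigned int) res >= wait_ms) {
            /* remaining time budget used up: give up waiting */
            coap_log(LOG_INFO, "timeout\n");
            break;
         }

         wait_ms -= res;
      }
   }

   /* every request that was sent has been answered */
   if (req_rep == 0) break;
} while (!quit);

fprintf(stdout, "wait_ms=%u\n", wait_ms);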
fprintf(stdout, "req_rep=%d\n", req_rep);

It seems to work!

mrdeep1 commented 3 years ago

Excellent!

Grabber commented 3 years ago

@mrdeep1

Sorry about asking so many things, I'm still learning. Why do you think CoAP has no "enterprise" servers like MQTT does (emq, mosquitto, etc.)?

I think a huge boost for CoAP would be to look at it more as a final platform, instead of as a specification and base implementation only. What do you think about it?

I think the request/response model fits very well when you have to ensure that data has been transmitted from client to server, while only deleting data from the client once it's persisted on the server (like on Kafka). This is a pattern where having a broker plus a subscriber publishing to Kafka is unreliable: if the subscriber goes down, messages are no longer persisted.

CoAP fits very well for end-to-end persistence!

mrdeep1 commented 3 years ago

I am aware of (enterprise level) bespoke applications that are using CoAP as a transport layer which is provided for by libcoap. These bespoke applications are not a generic enterprise server with lots of configuration options defining the required functionality that I think you are looking for. That said, there is no reason as to why someone could not do this as a project.

As obgm mentioned previously, the roots of libcoap came from different intentions, but the product has evolved, making the libcoap layer more robust as well as internally handling some specific CoAP functionality should the application layer not want to handle things - e.g. RFC7959 block handling.

boaks commented 3 years ago

In the first comment it was said that CoAP over TCP is mandatory. If that weren't the case, then at least from a performance, scalability and availability perspective, Eclipse/Californium does the job for many larger server-side deployments.

The upcoming 3.0 release provides, in combination with DTLS CID:

mid term DTLS associations even with sleeping devices (DTLS CID decouples from the ip-address and NATing) reducing handshakes to a minimum (e.g. once a week).
scaling with dtls-cid-built-in-load-balancer, supported on k8s (DTLS CID is used to forward the message to the right pod (DTLS endpoint)). Scale-up as easy as kubectl scale ... --replicas=n.
graceful restart, supported on k8s (requires DTLS CID as well, in order to process the records on the update corresponding pod).

If one day the providers in Brazil support UDP, you may try it out.

obgm / libcoap

how to implement an async client? #738