Reconnecting subscriber client core dumps if the broker is not present

sbovbjerg commented 4 years ago

This happens when using BIO sockets (encrypted or not) and the broker is not present at startup (or is stopped): Inmqtt_sync(), __mqtt_recv() is called regardless of the state of the connection client->socketfd, causing a segmentation fault due to a call toBIO_should_retry() in mqtt_pal_recvall() with an illegal fd value.

This can be avoided by returning before the call to __mqtt_recv() in mqtt_sync() if the socketfd is 0, e.g:

   /* Added: */
    if (!client->socketfd) {
        client->error = MQTT_ERROR_SOCKET_ERROR;
        return MQTT_ERROR_SOCKET_ERROR;
    } 

    /* Call receive */
    err = __mqtt_recv(client);

This also allows the user to implement logic in the user space sync function that handles the error state, and e.g. changes to a longer sleep interval to avoid trying to reconnect at 100msec intervals (which may spam a log file that outputs connection status and attempts)

LiamBindle commented 4 years ago

Hi @sbovbjerg, thanks for opening this issue. I see what you mean—good point! Would you mind submiting a PR with your proposed change?

sbovbjerg commented 4 years ago

Hi Liam, For some reason I cannot reproduce the issue with the MQTT examples (I created a bio_reconnecting_subscriber for the purpose but "unfortunately" it works just fine).

I my implementation, I use almost identical copies of bio_sockets.h and openssl_sockets.h from examples/templates. However, when I call the open_nb_socket function (and the broker is stopped), I end up in the error handling code after calling BIO_do_connect and get a NULL BIO handle returned, which causes a NULL BIO socketfd to be set in the client, and the subsequent core dump during mqtt_sync.

It seems like I cannot get BIO_do_connect to fail in the example code.... but reconnect_client is called with a client in state MQTT_ERROR_SOCKET_ERROR.

I will try to figure out why it behaves differently and create a pull request if I manage re-create the issue.

Btw, you can force the behavior by returning NULL from open_nb_socket.

Cheers, Søren

sbovbjerg commented 4 years ago

Figure out how to make it "fail" properly.

PR: https://github.com/LiamBindle/MQTT-C/pull/64

LiamBindle / MQTT-C

Reconnecting subscriber client core dumps if the broker is not present #60