open62541 / open62541

Open source implementation of OPC UA (OPC Unified Architecture) aka IEC 62541 licensed under Mozilla Public License v2.0
http://open62541.org
Mozilla Public License 2.0

mqtt_ping() - call to MQTT_PAL_MUTEX_LOCK gets stuck #4646

Open jackybek opened 3 years ago

jackybek commented 3 years ago

Description

Background Information / Reproduction Steps

From what I observe, the sequence in the program is as follows:

  1. Call mqtt_init().
     1a. Within mqtt_init(), MQTT_PAL_MUTEX_INIT() and MQTT_PAL_MUTEX_LOCK() are called.

  2. Call mqtt_connect().
     2a. At the end of mqtt_connect(), MQTT_PAL_MUTEX_UNLOCK() is called.

  3. Call mqtt_send().
     3a. Within mqtt_send(), mqtt_ping() is called.

  4. In mqtt_ping(), the program gets stuck when MQTT_PAL_MUTEX_LOCK(&client->mutex) is called.

My assumption is that this is caused by step 1a: the lock taken in step 1a is only released by the unlock in step 2a. If that unlock has not happened, the situation in step 4 occurs.
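To illustrate the assumption above, here is a minimal sketch using plain POSIX threads rather than the MQTT_PAL_MUTEX_* macros (whether those macros map to pthread mutexes in this build is an assumption). The helper second_lock_result() is hypothetical and not part of MQTT-C or open62541. It locks a mutex twice from the same thread without an unlock in between. With an error-checking mutex the second lock returns EDEADLK instead of blocking; with a default mutex, the same sequence may simply hang, which would match what step 4 looks like.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_ERRORCHECK under strict -std modes */
#endif
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper (not part of MQTT-C): locks a mutex, then tries to
 * lock it again from the same thread without unlocking in between.
 * Returns the result of the second lock attempt. */
int second_lock_result(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc;

    pthread_mutexattr_init(&attr);
    /* An error-checking mutex reports the relock instead of hanging. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);       /* corresponds to the lock in step 1a */
    rc = pthread_mutex_lock(&m);  /* corresponds to step 4, with no unlock (2a) in between */

    pthread_mutex_unlock(&m);     /* release the one lock that was actually taken */
    pthread_mutex_destroy(&m);
    return rc;
}
```

If the MQTT_PAL mutex in this build is a pthread mutex, temporarily creating it as an error-checking mutex could be one way to confirm whether step 4 is a relock by the same thread.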

To test my assumption, I removed the MQTT_PAL_MUTEX_LOCK call (step 4) and voilà, mqtt_ping() is called successfully. However, it only gets called once. How can I make the program call mqtt_ping() periodically, so that the connection to the MQTT server is not broken? A broken connection currently results in the following error:
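One possible way to think about the periodic ping is a small time check that the application loop evaluates on every iteration. The sketch below is only an illustration: ping_is_due() is a hypothetical helper, not part of MQTT-C or open62541, and the 75% threshold is an arbitrary assumption chosen to leave some margin before the keep-alive interval expires.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper (not part of MQTT-C or open62541).
 * now          : current time
 * last_send    : time the last packet was sent to the broker
 * keep_alive_s : keep-alive interval negotiated at connect time, in seconds
 * Returns true once 75% of the keep-alive interval has passed without
 * any outgoing traffic (the 75% margin is an assumption). */
bool ping_is_due(time_t now, time_t last_send, unsigned keep_alive_s)
{
    double threshold = 0.75 * (double)keep_alive_s;
    return difftime(now, last_send) >= threshold;
}
```

In a main loop, the idea would be to call mqtt_ping() whenever ping_is_due() returns true and then record the current time as the new last_send value.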

mqtt_sync return value : 0x80000011
[2021-09-15 11:45:23.093 (UTC+0800)] error/server open62541.c : yieldMqtt() 214841 (/plugins/ua_network_pubsub_mqtt.c) : error: MQTT_ERROR_SOCKET_ERROR

Can someone on the development team take a close look at this logic?

Used CMake options:

cmake -DUA_NAMESPACE_ZERO=<YOUR_OPTION> <ANY_OTHER_OPTIONS> ..


jackybek commented 3 years ago

I made two more changes, adding a case for SSL_ERROR_ZERO_RETURN as follows:

  mqtt_pal_recvall()
  {
      /* ... */
      switch (error) {
      case SSL_ERROR_SSL:
          printf("error returned from SSL_get_error() is SSL_ERROR_SSL\n");
          return MQTT_ERROR_SOCKET_ERROR;

      case SSL_ERROR_WANT_READ:
      case SSL_ERROR_WANT_WRITE:
      case SSL_ERROR_WANT_ACCEPT:
      case SSL_ERROR_WANT_CONNECT:
      case SSL_ERROR_WANT_X509_LOOKUP:
      case SSL_ERROR_ZERO_RETURN:    /* added */
          return read;
      /* ... */
      }
  }

  mqtt_pal_sendall()
  {
      /* ... */
      switch (error) {
      case SSL_ERROR_WANT_READ:
      case SSL_ERROR_WANT_WRITE:
      case SSL_ERROR_WANT_ACCEPT:
      case SSL_ERROR_WANT_CONNECT:
      case SSL_ERROR_WANT_X509_LOOKUP:
      case SSL_ERROR_ZERO_RETURN:    /* added */
          return written;
      /* ... */
      }
  }

With the above changes, the socket error (MQTT_ERROR_SOCKET_ERROR) is no longer reported, and the program runs until it hits another error:

[2021-09-15 22:06:57.683 (UTC+0800)] error/server open62541.c : yieldMqtt() 214841 (/plugins/ua_network_pubsub_mqtt.c) : error: MQTT_ERROR_SEND_BUFFER_IS_FULL