eclipse-threadx / netxduo

Eclipse ThreadX - NetXDuo is an advanced, industrial-grade TCP/IP network stack designed specifically for deeply embedded real-time and IoT applications
https://github.com/eclipse-threadx/rtos-docs/blob/main/rtos-docs/netx-duo/index.md
MIT License

MQTT publish crashes #80

Closed pawelkub closed 2 years ago

pawelkub commented 2 years ago

I'm working on a project that uses ThreadX and NetX Duo with the MQTT client. One thread (M) receives MQTT data and sends responses to it. Two other threads (X and Y) send telemetry data via MQTT.

All threads publish through the same MQTT publish API function, which uses a mutex to protect access to the MQTT client object for the whole system.

Everything works correctly for a while (for example, 50k published messages, though the number of messages exchanged before the failure is not constant), but after that I get a communication error. One of the threads, e.g. X, gets stuck in a forever loop while publishing MQTT telemetry.

It gets stuck in the function _nx_ip_header_add() at line 111:

/* Assert prepend pointer is no less than data start pointer.  */
/*lint -e{946} suppress pointer subtraction, since it is necessary. */
NX_ASSERT(packet_ptr -> nx_packet_prepend_ptr >= packet_ptr -> nx_packet_data_start);

After that the IP instance thread gets stuck, and the MQTT client disconnects.
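To make the asserted condition concrete: judging by the function name and the assertion, the header is added by moving the prepend pointer backwards, and the assert guards against that pointer ending up before the start of the packet buffer. The following is a minimal sketch of that check, not NetX Duo code. The `demo_packet` struct and `demo_prepend_header` function are hypothetical stand-ins that only mirror the two fields named in the assert.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two NX_PACKET fields named in the assert.
   This is NOT the real NetX Duo type. */
struct demo_packet {
    unsigned char *data_start;   /* first byte of the packet's payload buffer */
    unsigned char *prepend_ptr;  /* current start of valid data in the buffer */
};

/* Move prepend_ptr back by header_len bytes to make room for a header.
   Returns 1 on success, or 0 (leaving the packet untouched) when the
   headroom is insufficient or the pointer is already out of range. */
int demo_prepend_header(struct demo_packet *p, size_t header_len)
{
    if (p->prepend_ptr < p->data_start) {
        return 0;                /* pointer already before the buffer */
    }
    if ((size_t)(p->prepend_ptr - p->data_start) < header_len) {
        return 0;                /* not enough headroom for the header */
    }
    p->prepend_ptr -= header_len;
    return 1;
}
```

With a 20-byte header (the minimum IPv4 header size), a packet whose prepend pointer sits 40 bytes into its buffer can take two such headers and rejects the third. A packet that fails this check in the real stack either never had enough headroom reserved or had its pointers overwritten.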

image

TiejunMS commented 2 years ago

Hi @pawelkub, could you provide the version number from nx_api.h? It usually looks like 6.1.x. The call stack shows the packet struct is corrupted. Here are some quick tests you can do.

  1. Increase the stack size.
  2. Upgrade Azure RTOS to the latest version.

For the thread that is suspended in _nx_ip_header_add, could you provide the values of nx_packet_prepend_ptr, nx_packet_data_start, nx_packet_append_ptr, nx_packet_data_end and nx_packet_length from packet_ptr?
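As an illustration of how the requested values can be read once they are collected, here is a minimal sketch that checks the expected ordering of the four pointers and prints them together with the length. The `demo_packet_view` struct and both functions are hypothetical helpers, not part of NetX Duo. The struct only mirrors the five field names listed above, and the ordering rule (data start, then prepend, then append, then data end) is an assumption based on those names and the assert.

```c
#include <stdio.h>

/* Hypothetical snapshot of the five fields requested above.
   This is NOT the real NX_PACKET type. */
struct demo_packet_view {
    const unsigned char *data_start;   /* nx_packet_data_start  */
    const unsigned char *prepend_ptr;  /* nx_packet_prepend_ptr */
    const unsigned char *append_ptr;   /* nx_packet_append_ptr  */
    const unsigned char *data_end;     /* nx_packet_data_end    */
    unsigned long length;              /* nx_packet_length      */
};

/* Return 0 when data_start <= prepend_ptr <= append_ptr <= data_end,
   otherwise a code identifying the first relation that is violated. */
int demo_packet_check(const struct demo_packet_view *v)
{
    if (v->prepend_ptr < v->data_start) return 1;  /* the asserted condition */
    if (v->append_ptr < v->prepend_ptr) return 2;  /* negative data span */
    if (v->data_end < v->append_ptr)    return 3;  /* data runs past buffer */
    return 0;
}

/* Print the snapshot on one line so it can be pasted into a report. */
void demo_packet_dump(const struct demo_packet_view *v)
{
    printf("data_start=%p prepend=%p append=%p data_end=%p length=%lu check=%d\n",
           (const void *)v->data_start, (const void *)v->prepend_ptr,
           (const void *)v->append_ptr, (const void *)v->data_end,
           v->length, demo_packet_check(v));
}
```

A result of 1 corresponds to the condition the assert rejects. Results of 2 or 3 would point to other fields of the same packet having been overwritten.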

pawelkub commented 2 years ago

Hi @TiejunMS, thanks for the reply,

I used version 6.1.7 of netxduo and threadx. Updating to the latest 6.1.10 version doesn't help, unfortunately.

Should I increase the stack of the thread that calls the MQTT publish function (in this case pms_thread), the MQTT client thread, or the IP instance thread?

packet_ptr data dump: image

UPDATE: I increased the stack for pms_thread but still have the same problem (though I think more messages get published before it fails). I also tried creating a new thread (app_mqtt_pub_thread) that is responsible for publishing data via the MQTT client. I modified the app_mqtt_publish API function used to publish MQTT telemetry: it now copies the pointers to the topic and payload to be sent into static pointers. Now only the new thread sends data via MQTT, but the same error still occurs.

/* publish data */
static const char *send_topic = NULL;
static const char *send_payload = NULL;
static uint16_t send_topic_len = 0;
static uint16_t send_payload_len = 0;
#define PUB_MQTT_FLAG_START     0x01
#define PUB_MQTT_FLAG_DONE      0x02

UINT app_mqtt_publish(const char *topic, uint16_t topic_len,
                      const char *msg, uint16_t msg_len)
{
    static uint32_t cnt = 0;
    UINT ret;
    ret = tx_mutex_get(&mutex_pub_mqtt, SLEEP_MS_TO_TICKS(100));
    if (TX_SUCCESS != ret)
    {
        LOG_ERROR("mqtt_pub_mutex %u [%u]", ret, cnt);
        return ret;
    }
    send_topic = topic;
    send_topic_len = topic_len;
    send_payload = msg;
    send_payload_len = msg_len;
    tx_event_flags_set(&event_flag_pub_mqtt, PUB_MQTT_FLAG_START, TX_OR);
    ULONG flag;
    tx_event_flags_get(&event_flag_pub_mqtt, PUB_MQTT_FLAG_DONE, TX_OR_CLEAR, &flag, TX_WAIT_FOREVER);
    cnt++;
    tx_mutex_put(&mutex_pub_mqtt);
    return ret;
}

static VOID app_mqtt_pub_thread_entry(ULONG thread_input)
{
    while(1)
    {
        ULONG flag;
        if (tx_event_flags_get(&event_flag_pub_mqtt, PUB_MQTT_FLAG_START,
                               TX_OR_CLEAR, &flag,
                               TX_WAIT_FOREVER) == TX_SUCCESS)
        {
            if (send_topic != NULL && send_payload != NULL
                && send_topic_len > 0 && send_payload_len > 0)
            {
                nxd_mqtt_client_publish(&MqttClient,
                                        send_topic, send_topic_len,
                                        send_payload, send_payload_len,
                                        RETAIN_ENABLE, QOS1, NX_WAIT_FOREVER);
            }
            tx_event_flags_set(&event_flag_pub_mqtt, PUB_MQTT_FLAG_DONE, TX_OR);
        }
    }
}

TiejunMS commented 2 years ago

The packet is being set up incorrectly somehow. To figure out what happened: do you know whether thread X gets stuck first, or whether the communication error occurs first?

TiejunMS commented 2 years ago

Closing as no response.