jiejieTop / mqttclient

A high-performance, high-stability, cross-platform MQTT client, developed based on the socket API, can be used on embedded devices (FreeRTOS / LiteOS / RT-Thread / TencentOS tiny), Linux, Windows, Mac, with a very concise The API interface realizes the quality of service of QOS2 with very few resources, and seamlessly connects the mbedtls encryption library.
Apache License 2.0
705 stars 257 forks source link

FreeRTOS Platform的几个问题向作者反馈 #16

Open lchnu opened 4 years ago

lchnu commented 4 years ago

1、platform_thread_init函数中,xTaskCreate的最后一个参数应该是&thread->thread,而不是V1.1.0版本中的thread->thread。 platform_thread_init函数的自测代码修改为:

err =  xTaskCreate(entry, name, stack_size, param, priority, &thread->thread);

2、platform_thread_destroy函数中,先将任务删除,然后再释放内存的逻辑无法顺利运行。当任务删除后,无法运行到vTaskDelete的后一条释放语句。 platform_thread_destroy函数的自测代码修改为:

TaskHandle_t threadToBeDeleted;

threadToBeDeleted = thread->thread;
platform_memory_free(thread);

if (NULL != threadToBeDeleted)
    vTaskDelete(threadToBeDeleted);

3、我现在使用DTU在建立TCP连接、断开TCP连接都会给用户任务发送事件,用户任务超时5秒等到事件发生,当收到“离线到在线”事件连接 MQTT Server,收到"在线到离线事件" Disconnect, 什么都没有收到的情况下就超时了,即每5秒向MQTT Server发送一次Qos2消息。

在Qos2服务质量下,假设Socket无需用户维护,远程Socket首先断开连接,用户任务检测到掉线,在用户任务中调用mqtt_disconnect函数,会使得内部state变成CLIENT_STATE_CLEAN_SESSION,mqtt_yield返回MQTT_CLEAN_SESSION_ERROR,从而调用mqtt_clean_session,state变成CLIENT_STATE_INVALID状态,最终跳转到platform_thread_destroy函数。platform_thread_destroy函数删除了yield任务,但是c->mqtt_thread没有清零,当Socket再次连接成功后,调用的yield进程没有被重新创建,无法正常运行。

mqtt_yield_thread函数的exit部分自测代码修改为:

threadToBeDestoried = c->mqtt_thread;
c->mqtt_thread = (platform_thread_t *)0;
platform_thread_destroy(threadToBeDestoried);

目前运行较好,向大神请教,不知道我上述理解和修改可对。感谢指点, ^_^

lchnu commented 4 years ago

另外,请问作者,在调用 mqtt_clean_session 函数的最后一行,将状态设置为CLIENT_STATE_INVALID之后,是否应该将c->mqtt_ack_handler_number = 0; 清零? 因为此时的 ack链表都已经全部清除了,没有release的情况下,多次连接会导致 mqtt_ack_handler_number越来越大,最终超过最大值。

感谢您的关注和解答,多谢!

jiejieTop commented 4 years ago

1、platform_thread_init函数中,xTaskCreate的最后一个参数应该是&thread->thread,而不是V1.1.0版本中的thread->thread。 platform_thread_init函数的自测代码修改为:

err =  xTaskCreate(entry, name, stack_size, param, priority, &thread->thread);

2、platform_thread_destroy函数中,先将任务删除,然后再释放内存的逻辑无法顺利运行。当任务删除后,无法运行到vTaskDelete的后一条释放语句。 platform_thread_destroy函数的自测代码修改为:

TaskHandle_t threadToBeDeleted;

threadToBeDeleted = thread->thread;
platform_memory_free(thread);

if (NULL != threadToBeDeleted)
    vTaskDelete(threadToBeDeleted);

3、我现在使用DTU在建立TCP连接、断开TCP连接都会给用户任务发送事件,用户任务超时5秒等到事件发生,当收到“离线到在线”事件连接 MQTT Server,收到"在线到离线事件" Disconnect, 什么都没有收到的情况下就超时了,即每5秒向MQTT Server发送一次Qos2消息。

在Qos2服务质量下,假设Socket无需用户维护,远程Socket首先断开连接,用户任务检测到掉线,在用户任务中调用mqtt_disconnect函数,会使得内部state变成CLIENT_STATE_CLEAN_SESSION,mqtt_yield返回MQTT_CLEAN_SESSION_ERROR,从而调用mqtt_clean_session,state变成CLIENT_STATE_INVALID状态,最终跳转到platform_thread_destroy函数。platform_thread_destroy函数删除了yield任务,但是c->mqtt_thread没有清零,当Socket再次连接成功后,调用的yield进程没有被重新创建,无法正常运行。

mqtt_yield_thread函数的exit部分自测代码修改为:

threadToBeDestoried = c->mqtt_thread;
c->mqtt_thread = (platform_thread_t *)0;
platform_thread_destroy(threadToBeDestoried);

目前运行较好,向大神请教,不知道我上述理解和修改可对。感谢指点, ^_^

另外,请问作者,在调用 mqtt_clean_session 函数的最后一行,将状态设置为CLIENT_STATE_INVALID之后,是否应该将c->mqtt_ack_handler_number = 0; 清零? 因为此时的 ack链表都已经全部清除了,没有release的情况下,多次连接会导致 mqtt_ack_handler_number越来越大,最终超过最大值。

感谢您的关注和解答,多谢!

是的,你说的很对,目前两个bug均已修复,感谢你的issue.

lchnu commented 4 years ago

是的,你说的很对,目前两个bug均已修复,感谢你的issue.

谢谢大神的回复,感谢您的代码!

  1. 我仔细阅读过您的代码,在mqtt_yield_thread函数的exit处,您是destroy thread再清空mqtt_thread,我认为这样不妥当(也在电路上实测过)。destroy之后,任务代码再也无法运行,我认为应该清空再destroy。

您修改的代码:

static void mqtt_yield_thread(void *arg)

exit:
    platform_thread_destroy(c->mqtt_thread);
    c->mqtt_thread = NULL;
}

我修改的代码如下:

static void mqtt_yield_thread(void *arg)

exit:
    threadToBeDestoried = c->mqtt_thread;
    c->mqtt_thread = (platform_thread_t *)0;
    platform_thread_destroy(threadToBeDestoried );
}
  1. 同理,在platform_thread.c中,您在platform_thread_destroy函数实现中,也是先删除任务再free内存。我认为这部分也应该修改。昨晚在我的32F407板子上修改运行通过了。
lchnu commented 4 years ago

@jiejieTop 关于mqtt_ack_handler_destroy函数的问题:

  1. client在socket成功建立连接后,假设Heap区有8000字节空闲。调用mqtt_connect、mqtt_subscribe,并在mqtt_subscribe中注册了一个topic handler;
  2. 调试过程中,broker只向client发送CONNACK,而不发送SUBACK;
  3. 经过预设timeout,mqtt_yield调用mqtt_ack_list_scan(c, 1),转到mqtt_ack_handler_destroy(ack_handler)
  4. 主动关闭socket,mqtt client进行clean session,删除yield任务,清空mqtt_ack_handler_list和mqtt_msg_handler_list
  5. 观察到,heap区的剩余只有7968字节。
  6. 反复进行该步骤,Heap区每次剩余字节都逐渐减少32字节。

感谢您提供的代码,工作很稳定。我的测试过程,在broker正常的情况下不会发生,但我认为毕竟是一个潜在的bug,向大神请教,谢谢您!

lchnu commented 4 years ago

经过仔细思考,我最终修改了mqtt_ack_list_scan函数,添加了删除msg handler的代码。

在Suback、unsuback等命令超时的情况下,会删除对应的msg handler,避免heap区的空间越来越少的bug。不知道 @jiejieTop 您如何看?修改的思路可对?

        /* if it is not a qos1 or qos2 message, it will be destroyed in every processing */
        if( (NULL != ack_handler->handler) && (flag == 1)) /*@lchnu, 2020-10-08, destory handler memory, if suback/unsuback is overdue!*/
        {
          mqtt_msg_handler_destory(ack_handler->handler);
        }
        mqtt_ack_handler_destroy(ack_handler);
        mqtt_subtract_ack_handler_num(c); /*@lchnu, 2020-10-08 */
lchnu commented 4 years ago

补充说明,在mqtt_clean_session函数处,我在 release all ack部分,也添加了msg handler的部分:

避免了socket连接,收到connack,但socket立即就断开时,heap区不断减小的问题。这种情况在使用流量卡的时候,可能会发生。

    /* release all ack_handler_list memory */
    if (!(mqtt_list_is_empty(&c->mqtt_ack_handler_list))) {
        LIST_FOR_EACH_SAFE(curr, next, &c->mqtt_ack_handler_list) {
            ack_handler = LIST_ENTRY(curr, ack_handlers_t, list);
            mqtt_list_del(&ack_handler->list);
            if(NULL != ack_handler->handler) //@lchnu, 2020-10-08, avoid socket disconnet when waiting for suback/unsuback....
            {
              mqtt_msg_handler_destory(ack_handler->handler);
            }
            platform_memory_free(ack_handler);
        }
        mqtt_list_del_init(&c->mqtt_ack_handler_list);
    }
jiejieTop commented 4 years ago

经过仔细思考,我最终修改了mqtt_ack_list_scan函数,添加了删除msg handler的代码。

在Suback、unsuback等命令超时的情况下,会删除对应的msg handler,避免heap区的空间越来越少的bug。不知道 @jiejieTop 您如何看?修改的思路可对?

        /* if it is not a qos1 or qos2 message, it will be destroyed in every processing */
        if( (NULL != ack_handler->handler) && (flag == 1)) /*@lchnu, 2020-10-08, destory handler memory, if suback/unsuback is overdue!*/
        {
          mqtt_msg_handler_destory(ack_handler->handler);
        }
        mqtt_ack_handler_destroy(ack_handler);
        mqtt_subtract_ack_handler_num(c); /*@lchnu, 2020-10-08 */

我明白你的意思,这种情况确实会可能出现,可以这样子改动,但是我做多了一次判断,避免误删:

ps:线程退出那部分我也修改完毕了,详见:https://github.com/jiejieTop/mqttclient/commit/1b81bc2bac566ed61af0334a500bbe782bdccff0

static void mqtt_ack_list_scan(mqtt_client_t* c, uint8_t flag)
{
    mqtt_list_t *curr, *next;
    ack_handlers_t *ack_handler;

    if ((mqtt_list_is_empty(&c->mqtt_ack_handler_list)) || (CLIENT_STATE_CONNECTED != mqtt_get_client_state(c)))
        return;

    LIST_FOR_EACH_SAFE(curr, next, &c->mqtt_ack_handler_list) {
        ack_handler = LIST_ENTRY(curr, ack_handlers_t, list);

        if ((!platform_timer_is_expired(&ack_handler->timer)) && (flag == 1))
            continue;

        if ((ack_handler->type ==  PUBACK) || (ack_handler->type ==  PUBREC) || (ack_handler->type ==  PUBREL) || (ack_handler->type ==  PUBCOMP)) {

            /* timeout has occurred. for qos1 and qos2 packets, you need to resend them. */
            mqtt_ack_handler_resend(c, ack_handler);
            continue;
        } else if ((ack_handler->type == SUBACK) || (ack_handler->type == UNSUBACK)) {

            /*@lchnu, 2020-10-08, destory handler memory, if suback/unsuback is overdue!*/
            if (NULL != ack_handler->handler) {
                mqtt_msg_handler_destory(ack_handler->handler);
            }
        }
        /* if it is not a qos1 or qos2 message, it will be destroyed in every processing */
        mqtt_ack_handler_destroy(ack_handler);
        mqtt_subtract_ack_handler_num(c); /*@lchnu, 2020-10-08 */
    }
}
jiejieTop commented 4 years ago
/* release all ack_handler_list memory */
if (!(mqtt_list_is_empty(&c->mqtt_ack_handler_list))) {
    LIST_FOR_EACH_SAFE(curr, next, &c->mqtt_ack_handler_list) {
        ack_handler = LIST_ENTRY(curr, ack_handlers_t, list);
        mqtt_list_del(&ack_handler->list);
        if(NULL != ack_handler->handler) //@lchnu, 2020-10-08, avoid socket disconnet when waiting for suback/unsuback....
        {
          mqtt_msg_handler_destory(ack_handler->handler);
        }
        platform_memory_free(ack_handler);
    }
    mqtt_list_del_init(&c->mqtt_ack_handler_list);
}

这是OK的~

lchnu commented 4 years ago

Jie神,你好!

我在调试过程中,直接将FreeRTOS的xTickCount初始值,在vTaskStartScheduler函数中设置为了0xFFFF0000UL。

salof使用%u,显示不正常。

SALOF_PRINT_LOG("[TS: %u] [TAR: %s] ",salof_get_tick(), salof_get_task_name())

为了处理这个问题,我修改了_salof_format_int函数中的临时变量n,将其从int变成long long了。

因为你的代码中是判断n的正负,对于unsigned int而言,0xffff0000/10, 一次处理就不满足while循环条件了:

static void _salof_format_int( ......

long long n = num; //@lchnu, int to long long, 2020-10-15
.......

    do {
        nbuf[i++] = digits[n % base];
        n = n / base;
    } while (n > 0);
lchnu commented 4 years ago

另,为了避免timer在溢出的时候,实际expire时间小于timeout,我用freertos的几个接口函数改写了timer部分。 再次感谢Jie神的工作,太赞了!!!!

//platform_timer.h
#include "task.h"

typedef struct platform_timer {
    TickType_t xTicksToWait;
    TimeOut_t xTimeOut;  
} platform_timer_t;

//platform_timer.c
void platform_timer_init(platform_timer_t* timer)
{
    timer->xTicksToWait = 0;
    timer->xTimeOut.xOverflowCount = 0;
        timer->xTimeOut.xTimeOnEntering = 0;
}

void platform_timer_cutdown(platform_timer_t* timer, unsigned int timeout)
{
    timer->xTicksToWait = timeout / portTICK_PERIOD_MS; /* convert milliseconds to ticks */
    vTaskSetTimeOutState(&timer->xTimeOut); /* Record the time at which this function was entered. */  
}

char platform_timer_is_expired(platform_timer_t* timer)
{
        return xTaskCheckForTimeOut(&timer->xTimeOut, &timer->xTicksToWait) == pdTRUE;
}

int platform_timer_remain(platform_timer_t* timer)
{
    xTaskCheckForTimeOut(&timer->xTimeOut, &timer->xTicksToWait); /* updates xTicksToWait to the number left */
    return (timer->xTicksToWait <= 0) ? 0 : (timer->xTicksToWait * portTICK_PERIOD_MS);  
}