tuanpmt / esp_mqtt

MQTT client library for ESP8266
http://tuanpm.net/post/esp_mqtt/
MIT License
1.15k stars 401 forks

Race condition between publish and keep alive packet #128

Closed asadislam94 closed 11 months ago

asadislam94 commented 7 years ago

Hi. I found that if you publish messages at an interval of half the keep-alive time, you can hit a race condition between the publish and keep-alive packets (it can take some time before it occurs). If the keep-alive packet is sent first, there is no problem. If the publish message is sent first, however, the keep-alive function tries to send while espconn is still sending the publish packet. The send fails, and the keep-alive function disconnects and reconnects.

The part of the keep-alive function that causes it is:

```c
#ifdef MQTT_SSL_ENABLE
    result = espconn_secure_send(client->pCon, client->mqtt_state.outbound_message->data, client->mqtt_state.outbound_message->length);
#else
    MQTT_INFO("TCP: Do not support SSL\r\n");
#endif
} else {
    result = espconn_send(client->pCon, client->mqtt_state.outbound_message->data, client->mqtt_state.outbound_message->length);
}
client->mqtt_state.outbound_message = NULL;
if (ESPCONN_OK == result) {
    client->keepAliveTick = 0;
    client->connState = MQTT_DATA;
    system_os_post(MQTT_TASK_PRIO, 0, (os_param_t)client);
} else {
    client->connState = TCP_RECONNECT_DISCONNECTING;
    system_os_post(MQTT_TASK_PRIO, 0, (os_param_t)client);
}
```
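The failure path in the fragment above can be reproduced off-device with a small mock. This is only a sketch, not the library's code: `mock_conn`, `mock_send`, `mock_sent_cb` and `keepalive_step` are hypothetical stand-ins, and it assumes that a send is refused while an earlier packet is still in flight, which is the behaviour described in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the client states used in the fragment. */
typedef enum { STATE_DATA, STATE_RECONNECT } conn_state;

typedef struct {
    bool in_flight;    /* true from a send until its "sent" callback */
    conn_state state;
} mock_conn;

/* Assumed behaviour: a send is refused while another one is in flight. */
static int mock_send(mock_conn *c)
{
    assert(c != NULL);
    if (c->in_flight)
        return -1;     /* stands in for a result other than ESPCONN_OK */
    c->in_flight = true;
    return 0;
}

/* Stands in for the "sent" callback that ends a transmission. */
static void mock_sent_cb(mock_conn *c)
{
    assert(c != NULL);
    c->in_flight = false;
}

/* Same result handling as the fragment: any failure forces a reconnect. */
static conn_state keepalive_step(mock_conn *c)
{
    c->state = (mock_send(c) == 0) ? STATE_DATA : STATE_RECONNECT;
    return c->state;
}

/* Publish first, keep-alive before the publish has completed. */
static conn_state publish_then_keepalive(void)
{
    mock_conn c = { false, STATE_DATA };
    mock_send(&c);               /* publish goes out, still in flight */
    return keepalive_step(&c);   /* keep-alive collides with it */
}

/* Same order, but the publish completes before the keep-alive. */
static conn_state publish_done_then_keepalive(void)
{
    mock_conn c = { false, STATE_DATA };
    mock_send(&c);
    mock_sent_cb(&c);
    return keepalive_step(&c);
}
```

In this model `publish_then_keepalive()` ends in `STATE_RECONNECT`, while `publish_done_then_keepalive()` stays in `STATE_DATA`, so the reconnect depends only on whether the keep-alive lands inside the publish's send window.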

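I have found a workaround: add the following at the start of the function: `if (client->pCon->state == 4) { system_os_post(MQTT_TASK_PRIO, 0, (os_param_t)client); return; }` (state 4 is `ESPCONN_WRITE` in the SDK's `espconn_state` enum). Instead of sending into a busy connection, the task re-posts itself and keeps retrying until the send can go through.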
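To see what the guard does over time, here is a host-side sketch under stated assumptions. `fake_conn`, `keepalive_guarded` and `run_until_sent` are hypothetical names, the `while` loop stands in for the task being re-posted, and the publish is assumed to finish (its sent callback fires) after a fixed number of retries.

```c
#include <assert.h>
#include <stddef.h>

#define STATE_IDLE  0
#define STATE_WRITE 4   /* the value tested by the workaround */

typedef struct {
    int state;        /* mirrors pCon->state in the workaround */
    int keepalives;   /* keep-alive packets actually sent */
} fake_conn;

/* Returns 1 if the keep-alive was sent, 0 if the caller should retry. */
static int keepalive_guarded(fake_conn *c)
{
    assert(c != NULL);
    if (c->state == STATE_WRITE)
        return 0;               /* still writing: defer instead of sending */
    c->state = STATE_WRITE;     /* the keep-alive now occupies the connection */
    c->keepalives++;
    return 1;
}

/* Simulates the re-posted task; the publish completes after busy_ticks retries. */
static int run_until_sent(fake_conn *c, int busy_ticks)
{
    int retries = 0;
    assert(c != NULL);
    c->state = STATE_WRITE;     /* a publish is in flight */
    while (!keepalive_guarded(c)) {
        retries++;
        if (retries >= busy_ticks)
            c->state = STATE_IDLE;   /* sent callback: the write has finished */
    }
    return retries;
}
```

With `busy_ticks` set to 3, the guard defers three times and then sends exactly one keep-alive, with no reconnect in between. One thing the model makes visible is that the retry is a busy loop: the task keeps re-posting itself for as long as the write is pending.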

someburner commented 7 years ago

What values are you using for the keepalive & publish frequency? I'd like to see if I can reproduce this issue.

Thanks

asadislam94 commented 7 years ago

Initially I was using 120 s as the keep-alive time and 60 s as the publish timer. In that scenario it might take some time before an error occurs, so I started using 30 s as the keep-alive time and 15 s as the publish timer.

Note: The error only occurs when the keep-alive packet is sent while the publish transmission is still in progress.

ghost commented 7 years ago

I experienced the same issue and added your patch. Thanks for the good work.