espressif / esp-mqtt

ESP32 mqtt component
Apache License 2.0

MQTT client does not auto-reconnect when server goes down and comes back again (IDFGH-1941) #133

Closed: 9lash closed this issue 4 years ago

9lash commented 5 years ago

Hardware: ESP32
Library: esp-mqtt

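Configuration of the MQTT client:

```c
static void mqtt_app_start(void)
{
    esp_mqtt_client_config_t mqtt_cfg = {};
    // Placeholder host from the report; a real URI normally includes a scheme such as mqtt://
    mqtt_cfg.uri = "websitethatworks.com";
    mqtt_cfg.event_handle = mqtt_event_handler;
    mqtt_cfg.username = "sample_username";
    mqtt_cfg.password = "sample_pass";
    mqtt_cfg.client_id = id;  // id is defined elsewhere in the application
    mqtt_cfg.disable_auto_reconnect = false;  // leave auto-reconnect enabled

    // esp_get_free_heap_size() returns uint32_t, so print it as unsigned
    ESP_LOGI(TAG, "[APP] Free memory: %u bytes", (unsigned) esp_get_free_heap_size());

    esp_mqtt_client_handle_t client = esp_mqtt_client_init(&mqtt_cfg);
    esp_mqtt_client_start(client);
}
```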

Test:

  1. Configured the WiFi and let the MQTT client connect to the server. The client posts data to the server consistently, and the connection is stable for days.
  2. When I turn the WiFi off and turn it back on after a few minutes, the ESP32 waits until WiFi is available and rejoins the network immediately, but it fails to reconnect to the MQTT server.

Output on serial: when the WiFi disconnects, I see MQTT_EVENT_DISCONNECTED. After that, every time the ESP32 tries to publish a message in this state, I see the message "MQTT has not connected".

To debug this, I added a print statement inside static void esp_mqtt_task(void *pv) in the mqtt_client.c file:

    case MQTT_STATE_CONNECTED:
        // receive and process data
        if (mqtt_process_receive(client) == ESP_FAIL) {
            ESP_LOGE(TAG, "Calling Abort from MQTT_STATE_CONNECTED State - Could not receive/process data"); // added this AI to debug this issue
            esp_mqtt_abort_connection(client);
            break;
        }

Right after the message "Calling Abort from MQTT_STATE_CONNECTED State - Could not receive/process data", I see MQTT_EVENT_DISCONNECTED. This confirms that when the server goes down, esp_mqtt_abort_connection(client) is called from the code section above. esp_mqtt_abort_connection(client) closes the transport layer, sets client->state to MQTT_STATE_WAIT_TIMEOUT, and sets the reconnect delay to 10 seconds.

Following this, I see "Reconnect after 10000 ms" but don't see any reconnection activity.

Meanwhile, I have a publishing task that is still running. It assumes the MQTT connection is established and tries to publish every 2 seconds. Each call to esp_mqtt_client_publish in this state results in the "MQTT has not connected" message.

Is the reconnect feature not implemented in this library? Does anybody else have this issue?

david-cermak commented 4 years ago

Hi @9lash

Yes, reconnection should work the same way as when WiFi is turned off and on again (your test no. 2). For this library there is actually no difference between the broker going down and WiFi disconnecting: it calls esp_mqtt_abort_connection() in both cases and keeps retrying the connection to the broker (every 10 s by default).

Could you please provide the versions of esp-mqtt and esp-idf, along with debug logs from a run where reconnection is not working?
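For readers following along, the retry behaviour described above can be modelled on a host machine roughly like this. This is a simplified sketch, not the esp-mqtt source: every sketch_* name, the struct fields, and the simulated broker_up input are made up for illustration. The only details taken from this thread are that an abort moves the client into a wait state and that a new attempt follows after about 10 s.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of the retry loop. NOT the esp-mqtt implementation;
 * all names here are hypothetical and exist only for this sketch. */
typedef enum {
    SKETCH_STATE_INIT,         /* about to attempt a connection */
    SKETCH_STATE_CONNECTED,    /* normal operation */
    SKETCH_STATE_WAIT_TIMEOUT  /* connection aborted, waiting before a retry */
} sketch_state_t;

typedef struct {
    sketch_state_t state;
    uint32_t abort_tick_ms;    /* time at which the connection was aborted */
    uint32_t wait_timeout_ms;  /* delay before the next attempt (10 s in the thread) */
    bool auto_reconnect;       /* mirrors the idea of disable_auto_reconnect = false */
    int connect_attempts;      /* counts connection attempts, for inspection */
} sketch_client_t;

/* Models the effect of an abort: enter the wait state and remember when. */
static void sketch_abort_connection(sketch_client_t *c, uint32_t now_ms)
{
    c->state = SKETCH_STATE_WAIT_TIMEOUT;
    c->abort_tick_ms = now_ms;
}

/* One iteration of the client loop. broker_up is a simulated input that
 * stands in for "the transport can reach the broker right now". */
static void sketch_step(sketch_client_t *c, uint32_t now_ms, bool broker_up)
{
    switch (c->state) {
    case SKETCH_STATE_INIT:
        c->connect_attempts++;
        if (broker_up) {
            c->state = SKETCH_STATE_CONNECTED;
        } else {
            sketch_abort_connection(c, now_ms);
        }
        break;
    case SKETCH_STATE_CONNECTED:
        if (!broker_up) {
            sketch_abort_connection(c, now_ms);
        }
        break;
    case SKETCH_STATE_WAIT_TIMEOUT:
        /* Only retry once the configured delay has elapsed. */
        if (c->auto_reconnect &&
            (now_ms - c->abort_tick_ms) >= c->wait_timeout_ms) {
            c->state = SKETCH_STATE_INIT;
        }
        break;
    }
}
```

The point of the model is that the retry only happens if the loop keeps being stepped: if the task running it never gets scheduled, the wait state is never left, whatever the timeout is set to.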

9lash commented 4 years ago

Hi @david-cermak

Thanks for your reply. I found the issue. I was running a keep_publishing_task() that gets started when MQTT_EVENT_CONNECTED occurs, and this task was never deleted when MQTT_EVENT_DISCONNECTED occurred. Since the task was never deleted, it kept trying to publish even when MQTT was not connected, hence the "MQTT client has not connected" message on the serial output.

Once I deleted this task on MQTT_EVENT_DISCONNECTED, the esp_mqtt_client task got a chance to go to the MQTT_STATE_WAIT_TIMEOUT state and attempt a reconnect.
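In case it helps anyone hitting the same thing, the fix above amounts to tying the publisher's lifetime to the connection events. Below is a minimal host-side sketch of that pattern, not my actual firmware: the sketch_* names and the start_publisher()/stop_publisher() helpers are hypothetical. On the ESP32 the helpers would presumably wrap xTaskCreate() and vTaskDelete() for keep_publishing_task(), and the dispatcher would live in the MQTT event handler's MQTT_EVENT_CONNECTED and MQTT_EVENT_DISCONNECTED cases.

```c
#include <stdbool.h>

/* Stand-ins for the two MQTT events discussed in this thread. */
typedef enum {
    SKETCH_EVENT_CONNECTED,
    SKETCH_EVENT_DISCONNECTED
} sketch_event_t;

/* Tracks whether the publisher currently exists, so that duplicate
 * events neither start a second publisher nor stop one twice. */
static bool s_publisher_running = false;
static int s_starts = 0;  /* number of times the publisher was started */
static int s_stops = 0;   /* number of times the publisher was stopped */

/* Hypothetical helper: on target this is where the task would be created. */
static void start_publisher(void)
{
    if (s_publisher_running) {
        return;  /* already running, nothing to do */
    }
    s_publisher_running = true;
    s_starts++;
}

/* Hypothetical helper: on target this is where the task would be deleted. */
static void stop_publisher(void)
{
    if (!s_publisher_running) {
        return;  /* already stopped, nothing to do */
    }
    s_publisher_running = false;
    s_stops++;
}

/* Event dispatcher: start publishing on connect, stop on disconnect. */
static void sketch_on_event(sketch_event_t event)
{
    switch (event) {
    case SKETCH_EVENT_CONNECTED:
        start_publisher();
        break;
    case SKETCH_EVENT_DISCONNECTED:
        stop_publisher();
        break;
    }
}
```

Making both helpers idempotent is a design choice for the sketch: it keeps the handler safe if the same event is delivered twice in a row.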