espressif / esp-aws-iot

AWS IoT SDK for ESP32 based chipsets
Apache License 2.0

smartDeviceConnectionDisconnectionEvent : "disconnectReason": "DUPLICATE_CLIENTID", (CA-337) #222

Open PaulAnurag opened 1 month ago

PaulAnurag commented 1 month ago

Hello Folks,

I am using the https://github.com/espressif/esp-aws-iot/tree/release/202012.04-LTS branch for my project. Whenever the internet connection drops for even a couple of seconds, my device goes offline and disconnects from AWS. Checking the smartDeviceConnectionDisconnectionEvent in AWS IoT Core, I see the following disconnection event:

  {
    "clientId": "xxxxxxxxxxxxxxxx",
    "timestamp": 1716871954622,
    "eventType": "disconnected",
    "clientInitiatedDisconnect": false,
    "sessionIdentifier": "xxxxxxxxxxxxxxxx",
    "principalIdentifier": "xxxxxxxxxxxxxxxx",
    "disconnectReason": "DUPLICATE_CLIENTID",
    "versionNumber": 6
  }

I have been trying to tune the following macros.

sdkconfig coreMQTT macros:

  1. CONFIG_MQTT_STATE_ARRAY_MAX_COUNT=10
  2. CONFIG_MQTT_MAX_CONNACK_RECEIVE_RETRY_COUNT=6
  3. CONFIG_MQTT_PINGRESP_TIMEOUT_MS=10000
  4. CONFIG_MQTT_RECV_POLLING_TIMEOUT_MS=15
  5. CONFIG_MQTT_SEND_TIMEOUT_MS=20000

core_mqtt_config_defaults:

  1. PACKET_TX_TIMEOUT_MS = 30000
  2. PACKET_RX_TIMEOUT_MS = 30000

Please let me know if any of these values should be changed.

avsheth commented 1 month ago

Hi @PaulAnurag, ideally the IoT cloud would have set disconnectReason to CONNECTION_LOST in the lifecycle event if the device had been disconnected due to network fluctuations. I suggest checking on the cloud side why DUPLICATE_CLIENTID is being reported.

As for the network resiliency of the device, the device can actively disconnect in one of two conditions:

  1. The device is idle and keep-alive fails. The MQTT_KEEP_ALIVE_INTERVAL_SECONDS option in the application determines this; the default in the example is 60 s. With this value the device checks connectivity every 60 seconds, and if it doesn't hear back from the cloud within CONFIG_MQTT_PINGRESP_TIMEOUT_MS, it disconnects.

  2. The network fluctuates during an ongoing transaction. One option to look into here is CONFIG_MQTT_RECV_POLLING_TIMEOUT_MS. If the device is expecting a packet and the network is disconnected, it can detect that and exit the loop. You may try increasing this to 2000 ms.
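To make the idle (keep-alive) case concrete, here is a minimal sketch in plain C of how the two values combine. Both function names are hypothetical and are not part of coreMQTT or esp-aws-iot; the sketch only assumes that a PINGREQ goes out after one idle keep-alive interval and that the device gives up once the PINGRESP wait exceeds the configured timeout.

```c
#include <stdint.h>

/* Hypothetical helper (not a coreMQTT API): rough upper bound, in
 * milliseconds, on how long an idle device needs to notice a dead link.
 * It waits one keep-alive interval before sending PINGREQ, then waits
 * up to the PINGRESP timeout for the reply. */
uint32_t idle_detect_window_ms( uint32_t keep_alive_s,
                                uint32_t pingresp_timeout_ms )
{
    return ( keep_alive_s * 1000U ) + pingresp_timeout_ms;
}

/* Hypothetical helper (not a coreMQTT API): returns 1 when the wait for
 * PINGRESP has exceeded the timeout, 0 otherwise. Unsigned subtraction
 * keeps the elapsed time correct across a 32-bit millisecond wrap. */
int keep_alive_failed( uint32_t now_ms,
                       uint32_t pingreq_sent_ms,
                       uint32_t pingresp_timeout_ms )
{
    uint32_t elapsed_ms = now_ms - pingreq_sent_ms;

    return ( elapsed_ms > pingresp_timeout_ms ) ? 1 : 0;
}
```

With the values from this thread (60 s keep-alive, 10000 ms PINGRESP timeout), idle_detect_window_ms( 60U, 10000U ) evaluates to 70000, so under these assumptions an idle device could take roughly 70 seconds to notice the loss.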

However, like I said, the disconnect reason in the lifecycle event should still be different.

PaulAnurag commented 4 weeks ago

Hello @avsheth, thanks for the prompt reply. I will test with MQTT_KEEP_ALIVE_INTERVAL_SECONDS kept at 60 s and CONFIG_MQTT_RECV_POLLING_TIMEOUT_MS increased to 2000 ms.

Is there any documentation for all the MQTT macro parameters?

avsheth commented 3 weeks ago

You can read about the Kconfig parameters in the corresponding Kconfig files or in the menuconfig window. Application-defined macros are documented in the comment just above where they are defined. For example, here.
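As an illustration of the two kinds of macro, the sketch below shows the general layout to look for. It is not a verbatim copy of the example source: the comment wording and the #ifndef guards are assumptions added so the fragment compiles on its own, and only the values (60 s and 10000 ms) come from this thread.

```c
/**
 * @brief Application-defined macro: the documentation sits in the comment
 * directly above the definition. This is the keep-alive interval, in
 * seconds, after which an idle client sends a PINGREQ.
 */
#ifndef MQTT_KEEP_ALIVE_INTERVAL_SECONDS
    #define MQTT_KEEP_ALIVE_INTERVAL_SECONDS    ( 60U )
#endif

/* Kconfig options reach C code as CONFIG_-prefixed macros generated into
 * sdkconfig.h; their help text lives in the Kconfig file and is shown in
 * menuconfig. The fallback below exists only so this sketch builds
 * outside an ESP-IDF project. */
#ifndef CONFIG_MQTT_PINGRESP_TIMEOUT_MS
    #define CONFIG_MQTT_PINGRESP_TIMEOUT_MS     ( 10000 )
#endif
```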

PaulAnurag commented 3 weeks ago

AWS MQTT disconnection and reconnection events have decreased a lot after changing the macros below:

  1. CONFIG_MQTT_PINGRESP_TIMEOUT_MS = 10000
  2. CONFIG_MQTT_RECV_POLLING_TIMEOUT_MS = 5000