espressif / esp-mqtt

ESP32 mqtt component
Apache License 2.0

Mqtt getting disconnected frequently and not able to send data for 15 minutes before throwing error. (IDFGH-4370) #181

Closed kannannair084 closed 3 years ago

kannannair084 commented 3 years ago

@david-cermak

Environment

- Development Kit: Internal Prototype
- Module or chip used: ESP32-WROOM-32D
- IDF version (run git describe --tags to find it): v4.2-beta1-227-gf0e87c933-dirty
- Build System: idf.py
- Compiler version: xtensa-esp32-elf-gcc (crosstool-NG esp-2020r3) 8.4.0
- Operating System: Windows
- (Windows only) environment type: ESP Command Prompt
- Using an IDE?: No
- Power Supply: external 3.3V

I am using a BG96 modem over PPP for the internet connection and MQTT to connect to Azure IoT Hub.

Problem description:

1) The client frequently disconnects from the server with the following error (more details in the log below):

TRANS_SSL: Poll timeout or error, errno=Success, fd=54, timeout_ms=10000

2) I am sending data with QoS 0, so I am not checking for acknowledgements. When a disconnect happens, I am still able to send data for the next 15 minutes. That data obviously doesn't reach IoT Hub, and I am not notified of the lost connection until 15 minutes later, when the write in the 15th minute fails with the error mentioned above.

I (530106048) NewEsp: Sending IoT 4G Mesage. Size: 956
inside uplink sending task
I (530106058) NewEsp: Sending IoT message completed values = 1, counter = 1
I (530106068) NewEsp: back in main core
I (530106068) NewEsp: exiting Lte
W (530116068) TRANS_SSL: Poll timeout or error, errno=Success, fd=54, timeout_ms=10000
I (530116068) HTTP_CLIENT: MQTT_EVENT_ERROR
E (530116068) MQTT_CLIENT: Error write data or timeout, written len = 0, errno=0
I (530116078) HTTP_CLIENT: MQTT_EVENT_DISCONNECTED
W (530116088) MQTT_CLIENT: Publish: Losing qos0 data when client not connected
task result = -1
I (530116088) NewEsp: Failed to send Telemetry this minute going to end task

kannannair084 commented 3 years ago

@david-cermak could you please take a look at this issue? It has become a blocker for our project.

david-cermak commented 3 years ago

@kannannair084 Sorry for the late reply.

> it should notify about a disconnect correct?

Yes, it should! Looking at the code and the posted logs:

TRANS_SSL: Poll timeout or error, errno=Success, fd=54, timeout_ms=10000

After this message the client should immediately post a disconnection event (preceded by an error event), exactly as it appears in your log. But you're saying the event arrives 15 minutes after the initial trouble? The only thing I think might play a role here is the event handler. Could you please share the relevant parts of your event handling code? Based on the log tags, it looks like you might share some of the handling code with HTTP_CLIENT...?
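To illustrate the expected sequence (an error event, then a disconnection event), here is a minimal host-buildable sketch of handler-side bookkeeping. It does not use the esp-mqtt API: the `link_evt_t` enum is a hypothetical stand-in for MQTT_EVENT_ERROR, MQTT_EVENT_CONNECTED and MQTT_EVENT_DISCONNECTED, and every `link_state_*` name is made up for illustration. On target, the same switch would live inside your real event handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the esp-mqtt event IDs, so this builds on a host. */
typedef enum {
    LINK_EVT_ERROR,        /* stands in for MQTT_EVENT_ERROR */
    LINK_EVT_CONNECTED,    /* stands in for MQTT_EVENT_CONNECTED */
    LINK_EVT_DISCONNECTED  /* stands in for MQTT_EVENT_DISCONNECTED */
} link_evt_t;

/* Illustrative bookkeeping kept by the application, not by the MQTT client. */
typedef struct {
    bool connected;
    unsigned error_count;
    unsigned disconnect_count;
} link_state_t;

void link_state_init(link_state_t *s)
{
    assert(s != NULL);
    s->connected = false;
    s->error_count = 0;
    s->disconnect_count = 0;
}

/* Feed every event the handler receives into the tracker. */
void link_state_on_event(link_state_t *s, link_evt_t evt)
{
    assert(s != NULL);
    switch (evt) {
    case LINK_EVT_CONNECTED:
        s->connected = true;
        break;
    case LINK_EVT_ERROR:
        /* An error alone does not change the state here; the
         * disconnection event is expected to follow it. */
        s->error_count++;
        break;
    case LINK_EVT_DISCONNECTED:
        s->connected = false;
        s->disconnect_count++;
        break;
    }
}

/* The uplink task can check this before attempting a publish. */
bool link_state_can_publish(const link_state_t *s)
{
    assert(s != NULL);
    return s->connected;
}
```

If a handler like this sees the disconnection only 15 minutes late, the delay is upstream of the handler, which is why the debug logs would help.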

Also, could you please share debug logs (set the log level to debug in menuconfig), mainly around the time when the first issue appears?

Are there any problems related to a lost PPP connection or modem issues in the log (ring buffer overflow or similar)?

> I am sending data up with Qos 0 ... when a disconnect happens, i am still able to send the data for next 15 minutes

So the network gets disconnected and the MQTT_CLIENT still thinks it's connected? This is probably how sockets work: you might still be able to write to or read from it, while getting the "Poll timeout or error" messages you mentioned earlier. You're apparently getting those, but for some reason they don't cause the client to disconnect.
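Since a QoS 0 publish gets no acknowledgement, one possible application-level workaround is to track when the broker last confirmed anything and treat the link as stale after a timeout. The sketch below is only an illustration under that assumption: the `liveness_*` names are hypothetical, the timestamps are plain millisecond counters supplied by the caller, and the "confirmation" could be, for example, an MQTT_EVENT_PUBLISHED event for an occasional QoS 1 message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application-side liveness tracker (not part of esp-mqtt). */
typedef struct {
    uint32_t last_confirmed_ms; /* time of the last broker confirmation */
    uint32_t timeout_ms;        /* how long silence is tolerated */
} liveness_t;

void liveness_init(liveness_t *l, uint32_t now_ms, uint32_t timeout_ms)
{
    assert(l != NULL);
    l->last_confirmed_ms = now_ms;
    l->timeout_ms = timeout_ms;
}

/* Call whenever the broker confirms something, e.g. an acknowledged
 * QoS 1 publish (assumption: the application sends one periodically). */
void liveness_confirm(liveness_t *l, uint32_t now_ms)
{
    assert(l != NULL);
    l->last_confirmed_ms = now_ms;
}

/* True when no confirmation has arrived for longer than timeout_ms.
 * Unsigned subtraction keeps this correct across counter wraparound. */
bool liveness_is_stale(const liveness_t *l, uint32_t now_ms)
{
    assert(l != NULL);
    return (uint32_t)(now_ms - l->last_confirmed_ms) > l->timeout_ms;
}
```

With a one-minute timeout, for instance, the uplink task could stop queuing QoS 0 data (or force a reconnect) well before the 15 minutes reported above.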

david-cermak commented 3 years ago

@kannannair084 Any update on this issue?

Closing now. Please feel free to reopen with more details (answers to the questions above, or steps we could use to reproduce the issue).