vshymanskyy / TinyGSM

A small Arduino library for GSM modules, that just works
GNU Lesser General Public License v3.0

MQTT disconnects around 24hrs and doesn't reconnect #723

Closed droidblastnz closed 1 year ago

droidblastnz commented 1 year ago

[x] I have read the Troubleshooting section of the ReadMe

What type of issues is this?

[ ] Request to support a new module

[ ] Bug or problem compiling the library

[x] Bug or issue with library functionality (ie, sending data over TCP/IP)

[x] Question or request for help

What are you working with?

Board: LILYGO® TTGO T-SIM7600 ESP32 LTE Cat4/1 4G Development Board, SIM7600G-H R2

Modem: SIM7600

Main processor board: ESP32-WROVER

TinyGSM version: 0.11.5

Code: MQTTClient example

Broker: hivemq.com

PubSubClient version: 2.8.0

Scenario, steps to reproduce

Power on; the board gets a network connection and stays connected to the MQTT broker broker.hivemq.com for close to 24 hrs. After a restart on === MQTT NOT CONNECTED === it runs for another 24 hrs.

Expected result

To still be connected to broker.hivemq.com

Actual result

Running the base code (MQTTClient example) with the SIM works for close to 24 hrs, then === MQTT NOT CONNECTED ===

Debug and AT command log

Enabled debugging

My code is working as expected, except for an MQTT disconnect at around 24 hrs.

Using the example MQTT code I can connect to broker.hivemq.com and blink the LED without issues, but after close to 24 hrs it disconnects and will not reconnect.

Netlight is flashing at 200ms ON/OFF registered on 4G network (+CIPCLOSE: 1,0,0,0,0,0,0,0,0,0)

Up to the disconnect the following is true:

Netlight is still flashing at 200ms ON/OFF registered on 4G network (+CIPCLOSE: 0,0,0,0,0,0,0,0,0,0)

Serial Modem Logs

    10:29:56.921 -> AT+CIPRXGET=4,0
    10:29:56.921 ->
    10:29:56.921 -> +CIPRXGET: 4,0,0
    10:29:56.974 ->
    10:29:56.974 -> OK
    10:29:56.974 -> AT+CIPCLOSE?
    10:29:56.974 ->
    10:29:56.974 -> +CIPCLOSE: 0,0,0,0,0,0,0,0,0,0
    10:29:57.021 ->
    10:29:57.021 -> OK
    10:29:57.021 -> === MQTT NOT CONNECTED ===
    10:29:57.122 -> AT+CGREG?
    10:29:57.122 ->
    10:29:57.122 -> +CGREG: 0,1
    10:29:57.157 ->
    10:29:57.157 -> OK
    10:29:57.157 -> === MQTT NOT CONNECTED ===
    10:29:57.275 -> AT+CGREG?
    10:29:57.275 ->
    10:29:57.275 -> +CGREG: 0,1
    10:29:57.321 ->
    10:29:57.321 -> OK
    10:29:57.321 -> === MQTT NOT CONNECTED ===
    10:29:57.423 -> AT+CGREG?
    10:29:57.423 ->
    10:29:57.423 -> +CGREG: 0,1
    10:29:57.475 ->
    10:29:57.475 -> OK

Using the TinyGSM library, MQTT doesn't reconnect.

Tried adding a delay of anywhere from 100 to 10000 ms and/or setting TINY_GSM_YIELD_MS to 2; neither resolved it.

    if (!mqtt.connected()) {
        SerialMon.println("MQTT NOT CONNECTED!");
        // Reconnect every 10 seconds
        uint32_t t = millis();
        if (t - lastReconnectAttempt > 10000L) {
            lastReconnectAttempt = t;
            if (mqttConnect()) {
                lastReconnectAttempt = 0;
            }
        }
        //delay(10000);              // delay for 10 seconds
        //delay(1000);
        delay(100);                  // delay for 100 milliseconds (0.1 of a second)
        //delay(TINY_GSM_YIELD_MS);  // Add this line to yield to the TinyGSM library (2 ms)
        return;
    }
    mqtt.loop();  // Calls mqtt.loop in the PubSubClient library
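For anyone worried about the millis() rollover (roughly every 49.7 days with a 32-bit counter): the `t - lastReconnectAttempt > 10000L` check above should not be affected, because the subtraction is done on unsigned values. A minimal standalone sketch of that check, where the function name `reconnectDue` is my own and not part of TinyGSM or PubSubClient:

```cpp
#include <cstdint>

// Hypothetical helper (not a TinyGSM/PubSubClient function).
// Returns true when more than intervalMs milliseconds have passed since lastMs.
// Unsigned 32-bit subtraction wraps modulo 2^32, so the result stays correct
// even when the millis() counter has rolled over between lastMs and nowMs.
bool reconnectDue(uint32_t nowMs, uint32_t lastMs, uint32_t intervalMs) {
    return static_cast<uint32_t>(nowMs - lastMs) > intervalMs;
}
```

So the 24 hr disconnect is unlikely to come from this timer; the counter has not even wrapped by then.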

T-SIM7600G-H works as expected via the uploaded code until MQTT disconnects around 24hrs. Hard to confirm the exact time and logs.

Baud rate for the modem and the ESP is set to 115200; I cannot seem to set any other speed and still have comms to the modem.

Power comes from a powered hub rated at 2.4 A. The external pigtail antenna is placed away from the SIM/ESP board.

alex9087 commented 1 year ago

I HAVE THE SAME PROBLEM

droidblastnz commented 1 year ago

I HAVE THE SAME PROBLEM

Reading this link https://github.com/knolleary/pubsubclient/issues/239

The actual value of the Keep Alive is application specific; typically this is a few minutes. The maximum value is 18 hours 12 minutes and 15 seconds.
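The odd-looking maximum comes from the Keep Alive field being a 16-bit count of seconds in the CONNECT packet, so the largest value is 65535 s. A quick sketch of the arithmetic (the names `Duration`, `splitSeconds` and `kMaxKeepAliveSeconds` are made up for illustration):

```cpp
#include <cstdint>

// Hypothetical helper types, for illustration only.
struct Duration {
    uint32_t hours;
    uint32_t minutes;
    uint32_t seconds;
};

// Largest value a 16-bit Keep Alive field can hold, in seconds.
constexpr uint32_t kMaxKeepAliveSeconds = 0xFFFF;  // 65535

// Splits a number of seconds into hours, minutes and seconds.
Duration splitSeconds(uint32_t total) {
    return Duration{total / 3600, (total % 3600) / 60, total % 60};
}
```

`splitSeconds(kMaxKeepAliveSeconds)` gives 18 hours, 12 minutes and 15 seconds, which matches the quoted limit and is well short of the 24 hrs seen here.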

Also added a print to see how long it takes before the code reaches the loop where the PubSubClient is called:

    SerialMon.print("Current millis value is: ");
    SerialMon.println(t);
    mqtt.loop();

So my results show a loop cycle of under one second.

The delay below is 100 ms (0.1 s).

https://github.com/knolleary/pubsubclient/issues/795

The keepalive handling in the client should keep the connection going regardless of the qos level being published at. We have seen some issues with client.loop being called too frequently. Putting it in a tight loop like you have means the hardware spends all of its time checking the connection which gets in the way of proper handling of the data. Have you tried adding a small delay in your loop function - delay(100)? (100 ms)

Or use #define TINY_GSM_YIELD_MS 2 (2 milliseconds) in place of the delay:

  //delay(10000);                                       //delay for 10 seconds
  //delay(1000);
  delay(100);                                         //delay for 100 milliseconds 0.1 of a second
  //delay(TINY_GSM_YIELD_MS);                           // Add this line to yield to the TinyGSM library 2ms
  return;
}
mqtt.loop();  

https://community.home-assistant.io/t/mqtt-noise-when-starting-and-issues-sending-commands-over-3-4g/554708/9

What are you using device wise and can you confirm its still network connected?

https://pubsubclient.knolleary.net/api#setKeepAlive

Lots of chat about Keepalive for example. Keepalive in the standard documentation is shown as binary 10s, so even though the test went well, can the person above stress test it if they did not, how long did it take for it to actually drop the connection after 15s, was it 16s or 15sx1.5=22.5s?

Revised mine to 60 s, so the broker timeout should be 60 s x 1.5 = 90 s; need to test again.

See here https://github.com/Amit-Agrawal0177/PubSubClient rather than a direct edit to the file, but in saying that https://github.com/knolleary/pubsubclient/issues/726#issuecomment-608136927

If you are using the Arduino IDE then you must edit PubSubClient.h as it says in the docs. This is because the Arduino IDE completely reorganises your #include statements and they get moved to the top of the script that eventually gets compiled. So any #define you add will happen after the library has been included and built.

Added mqtt.setKeepAlive(60); again, need to test.

    boolean mqttConnect() {
        SerialMon.print("Connecting to ");
        SerialMon.print(broker);

        // Set the keep alive interval (in seconds) before connecting,
        // so that the value is the one sent in the CONNECT packet
        mqtt.setKeepAlive(60);

        // Connect to MQTT Broker
        boolean status = mqtt.connect("GsmClientTest");

        // Or, if you want to authenticate MQTT:
        // boolean status = mqtt.connect("GsmClientName", "mqtt_user", "mqtt_pass");

        if (status == false) {
            SerialMon.println(" fail");
            return false;
        }
        SerialMon.println(" success");
        mqtt.publish(topicInit, "GsmClientTest started");
        mqtt.subscribe(topicLed

https://github.com/knolleary/pubsubclient/issues/795#issuecomment-743787288

PubSubClient.cpp Line 255:

    lastInActivity = lastOutActivity = millis();
    pingOutstanding = false; //added, need to test

Now after all that is said.... https://github.com/knolleary/pubsubclient/issues/689#issuecomment-564303344

If the Keep Alive value is non-zero and the Server does not receive a Control Packet from the Client within one and a half times the Keep Alive time period, it MUST disconnect the Network Connection to the Client as if the network had failed [MQTT-3.1.2-24].

http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/csprd02/mqtt-v3.1.1-csprd02.html#_Ref363645900
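To put numbers on the "one and a half times" rule quoted above, here is a small sketch of the arithmetic. The function name `brokerTimeoutMs` is hypothetical, and treating a Keep Alive of 0 as "no timeout" follows the spec text, where 0 turns the mechanism off:

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only.
// Returns how long (in milliseconds) a broker waits without receiving any
// control packet before it must drop the client: 1.5 x the Keep Alive value.
// Returns 0 when keepAliveSeconds is 0, meaning the keep alive is disabled.
uint32_t brokerTimeoutMs(uint16_t keepAliveSeconds) {
    return static_cast<uint32_t>(keepAliveSeconds) * 1500u;
}
```

With the PubSubClient default of 15 s this gives 22.5 s, and with 60 s it gives 90 s. Either way the window is seconds to minutes, so it does not by itself explain a drop at around 24 hrs.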

To me this is where the issue may lie.

So I will test, but maybe adding this to the MQTT reconnect will work:

mqtt.disconnect(); // Disconnect from MQTT

Also added the AT+CTZU=1 command so that the modem is synchronized with the network https://github.com/vshymanskyy/TinyGSM/issues/574

    // Force modem to sync time with network //added 09/04/2023 TinyGSM/issues/574
    modem.sendAT("+CTZU=1"); // CTZU=1 forcing time synchronization with network, automatic time and time zone update via NITZ
    modem.waitResponse(1000);
    Serial.println("CTZU Sync'ed time with network");

droidblastnz commented 1 year ago

Still reading, but here is another mention of mqtt.disconnect(), there called after a publish; in my case I am trying the disconnect inside the reconnect function. https://github.com/knolleary/pubsubclient/issues/435#issuecomment-450768516

    while (processed_size < total_recv_size) {
        ...
        // try to connect to the topic
        while (!mqtt.connect(mqtthelper.topic_str)) {
            SerialMon.println(" === MQTT NOT CONNECTED === ");
            delay(100);
        }
        mqtt.publish(mqtthelper.topic_str, payload, payload_size + TIMESTAMP_BYTES);
        mqtt.disconnect(); // disconnect explicitly
        ...
    }

droidblastnz commented 1 year ago

So the final configuration for Test 1:

TinyGsmClient.h //version 0.11.3

PubSubClient.h //version 2.8.0

PubSubClient.h keep alive as per https://pubsubclient.knolleary.net/api#setKeepAlive: 15 s

Change the value in the PubSubClient.h library file itself rather than with a #define in the sketch, because of the way it is compiled.

// MQTT_KEEPALIVE : keepAlive interval in Seconds. Override with setKeepAlive()
#ifndef MQTT_KEEPALIVE
#define MQTT_KEEPALIVE 15
#endif
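The guard above only supplies a default; the comment in the header says it can be overridden at run time with setKeepAlive(). A minimal standalone sketch of that pattern, using a made-up class `KeepAliveConfig` rather than the real PubSubClient:

```cpp
#include <cstdint>

// Same guard pattern as the header: only define the default if the build
// has not already provided a value.
#ifndef MQTT_KEEPALIVE
#define MQTT_KEEPALIVE 15
#endif

// Hypothetical stand-in for the client class, for illustration only.
class KeepAliveConfig {
public:
    // Run-time override of the compile-time default, in seconds.
    void setKeepAlive(uint16_t seconds) { keepAlive_ = seconds; }

    // Value currently in effect, in seconds.
    uint16_t keepAlive() const { return keepAlive_; }

private:
    uint16_t keepAlive_ = MQTT_KEEPALIVE;  // starts at the compile-time default
};
```

The run-time setter avoids editing the library file at all, as long as it is called before connecting so the new value is the one the broker is told about.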

PubSubClient.cpp _client->stop(); still enabled

boolean PubSubClient::loop() {
    if (connected()) {
        unsigned long t = millis();
        if ((t - lastInActivity > this->keepAlive*1000UL) || (t - lastOutActivity > this->keepAlive*1000UL)) {
            if (pingOutstanding) {
                this->_state = MQTT_CONNECTION_TIMEOUT;
                _client->stop();
                return false;
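To reason about the excerpt above without the hardware, here is a simplified model of the keep alive check as I understand it: when the link has been idle longer than the keep alive, a ping is sent, and if the previous ping is still unanswered the connection is treated as timed out. The names `KeepAliveModel`, `LoopResult`, `tick` and `onPingResponse` are my own; this is a sketch, not the library code:

```cpp
#include <cstdint>

// Hypothetical model of the keep alive logic, for illustration only.
enum class LoopResult { Idle, PingSent, TimedOut };

struct KeepAliveModel {
    uint32_t keepAliveMs;
    uint32_t lastInActivity;
    uint32_t lastOutActivity;
    bool pingOutstanding;

    explicit KeepAliveModel(uint32_t ms)
        : keepAliveMs(ms), lastInActivity(0), lastOutActivity(0), pingOutstanding(false) {}

    // One pass of the loop at time `now` (milliseconds).
    LoopResult tick(uint32_t now) {
        bool idleTooLong = (now - lastInActivity > keepAliveMs) ||
                           (now - lastOutActivity > keepAliveMs);
        if (!idleTooLong) return LoopResult::Idle;
        if (pingOutstanding) return LoopResult::TimedOut;  // earlier ping never answered
        lastInActivity = lastOutActivity = now;            // ping goes out now
        pingOutstanding = true;
        return LoopResult::PingSent;
    }

    // Called when the broker answers the ping.
    void onPingResponse(uint32_t now) {
        lastInActivity = now;
        pingOutstanding = false;
    }
};
```

In this model a dead link is noticed roughly two keep alive periods after the last traffic: one period until the ping is sent, one more until the missing answer is detected.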

In the TinyGSM sketch, added mqtt.disconnect() to the !mqtt.connected() branch

http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/csprd02/mqtt-v3.1.1-csprd02.html#_Ref363645900

[MQTT-3.1.2-24]. If the Keep Alive value is non-zero and the Server does not receive a Control Packet from the Client within one and a half times the Keep Alive time period, it MUST disconnect the Network Connection to the Client as if the network had failed

    if (!mqtt.connected()) {
        SerialMon.println("MQTT NOT CONNECTED! ");
        mqtt.disconnect(); // Disconnect from MQTT  //added 08/04/2023
        SerialMon.print("Network State is: ");
        SerialMon.println(modem.isGprsConnected()); //added 08/04/2023
        // Reconnect every 10 seconds
        uint32_t t = millis();
        if (t - lastReconnectAttempt > 10000L) {
            lastReconnectAttempt = t;
            if (mqttConnect()) {
                lastReconnectAttempt = 0;
            }
        }
        delay(100);
        return;
    }

Loop timing is under 1 s; checked that there were no other delays after the mqtt.loop call.

millis() value, delta to next sample (ms), delta (s):

53659927 750 0.750 s
53660677 754 0.754 s
53661431 748 0.748 s
53662179 750 0.750 s
53662929 752 0.752 s
53663681 754 0.754 s
53664435 748 0.748 s
53665183 758 0.758 s
53665941 752 0.752 s
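The loop period can be recomputed from the printed millis() values by taking the difference between consecutive samples. A small sketch, with the hypothetical function name `loopPeriodsMs`:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, for illustration only.
// Given consecutive millis() readings, returns the time between each pair
// of neighbouring readings, in milliseconds.
std::vector<uint32_t> loopPeriodsMs(const std::vector<uint32_t>& samples) {
    std::vector<uint32_t> periods;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        periods.push_back(samples[i] - samples[i - 1]);
    }
    return periods;
}
```

Since millis() counts milliseconds, differences of about 750 mean each pass through the loop takes about 0.75 s.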

Of note: added time sync between modem and network.

   modem.sendAT("+CTZU=1"); //CTZU=1 forcing time synchronization with network, automatic time and time zone update via NITZ
   modem.waitResponse(1000);
   Serial.println("CTZU Sync'ed time with network");

Will now try for another 24hrs

alex9087 commented 1 year ago

What was your result? I think that this is not the answer; I just added the mqtt.disconnect() before connecting.

droidblastnz commented 1 year ago

To me it's not MQTT disconnecting, it's the network.

AT+CGREG?

+CGREG: 0,1

OK

AT+CIPRXGET=4,0

+CIPRXGET: 4,0,0

OK

AT+CIPCLOSE?

+CIPCLOSE: 0,0,0,0,0,0,0,0,0,0

OK

MQTT NOT CONNECTED!

AT+CIPSEND=0,2

+CIPERROR: 2

ERROR

AT+CIPCLOSE=0

+CIPCLOSE: 0,4

ERROR

Network State is: AT+NETOPEN?

+NETOPEN: 0
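Reading the log: the modem is still registered (+CGREG: 0,1), but every socket is reported closed and +NETOPEN: 0 suggests the data connection itself is down. A small sketch for checking the AT+CIPCLOSE? reply programmatically, assuming (my reading of the SIM7600 behaviour, not confirmed in this thread) that each field is the state of one link, with 1 meaning open and 0 meaning closed. The function names are hypothetical:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, for illustration only.
// Parses a reply line such as "+CIPCLOSE: 0,0,0,0,0,0,0,0,0,0" into one
// integer per link. Returns an empty vector if the prefix does not match.
std::vector<int> parseLinkStates(const std::string& line) {
    std::vector<int> states;
    const std::string prefix = "+CIPCLOSE:";
    if (line.compare(0, prefix.size(), prefix) != 0) return states;
    std::istringstream fields(line.substr(prefix.size()));
    std::string field;
    while (std::getline(fields, field, ',')) {
        states.push_back(std::stoi(field));  // stoi skips the leading space
    }
    return states;
}

// True when at least one link is reported open (assumed to be state 1).
bool anyLinkOpen(const std::vector<int>& states) {
    for (int s : states) {
        if (s == 1) return true;
    }
    return false;
}
```

For the reply in the log above, `anyLinkOpen` is false, which fits the picture of the network side dropping rather than MQTT itself.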

droidblastnz commented 1 year ago

What was your result? I think that this is not the answer; I just added the mqtt.disconnect() before connecting.

Did this resolve your issue?

droidblastnz commented 1 year ago

The following reconnects if the network is disconnected due to the operator closing the connection.

if (!mqtt.connected()) {
    SerialMon.println("MQTT NOT CONNECTED! ");
    SerialMon.print("Disconnecting from: ");
    SerialMon.println(broker);
    mqtt.disconnect(); // Disconnect from MQTT  //added 08/04/2023
    delay(500);
    SerialMon.print("Network State is: ");
    SerialMon.println(modem.isGprsConnected()); //added 08/04/2023

    // Reconnect every 10 seconds
    uint32_t t = millis();
    if (t - lastReconnectAttempt > 10000L) {
        lastReconnectAttempt = t;
        if (!modem.isGprsConnected()) {
            // Reconnect to GPRS network if not connected
            SerialMon.println("GPRS not connected, reconnecting...");
            if (!modem.gprsConnect(apn, gprsUser, gprsPass)) {
                SerialMon.println("GPRS reconnect failed");
                delay(10000);
                return;
            }
        }

        if (mqttConnect()) {
            lastReconnectAttempt = 0;
        }
    }
    delay(100);
    return;
}
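The order of the checks above is the important part: bring the data connection back first, and only then retry MQTT. Stripped of the modem calls, the decision can be sketched as a pure function. The names `NextStep` and `nextStep` are hypothetical, and unlike the code above this model handles one step per call rather than falling through to the MQTT connect in the same pass:

```cpp
// Hypothetical model of the reconnect order, for illustration only.
enum class NextStep { RunMqttLoop, ReconnectData, ReconnectMqtt };

// mqttConnected: what mqtt.connected() would report.
// dataConnected: what modem.isGprsConnected() would report.
NextStep nextStep(bool mqttConnected, bool dataConnected) {
    if (mqttConnected) return NextStep::RunMqttLoop;      // normal operation
    if (!dataConnected) return NextStep::ReconnectData;   // fix the network layer first
    return NextStep::ReconnectMqtt;                       // network is up, retry the broker
}
```

Retrying only the broker while the data connection is down is what kept the original example stuck on MQTT NOT CONNECTED.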

@alex9087 add the modified lines and check. It will still disconnect, but it reconnects straight away and stops the MQTT function from going into an endless loop.