vshymanskyy / TinyGSM

A small Arduino library for GSM modules, that just works
GNU Lesser General Public License v3.0
1.91k stars, 709 forks

SIM7000 loses (MQTT) connection every ~14 hours #574

Open I-Connect opened 2 years ago

I-Connect commented 2 years ago

Hi,

We use a SIM7000E with an ESP32 and pubsubclient for MQTT, and we do an NTP time sync every 24 hours.

We run a task that handles the whole process every 200 ms (see below).

After approx. 14 hours we always lose the MQTT connection. Reconnecting GPRS succeeds, but the MQTT reconnect then fails. Eventually we reboot the SIM7000 (by toggling the power pin), which fixes the issue.

As it happens only once every 14 hours, it is a bit difficult to troubleshoot.

Any ideas on what could be going wrong, or how best to troubleshoot? Which is more likely: a SIM7000 issue or an MQTT/pubsubclient issue?

thx!

void Sim7000Sensor::handelGsmTask() {

  //overall guard, power cycle timer
  if ( (millis() - lastGprsConn) > (MAX_NO_CONN_TIME_SEC * ONESECOND) ) {
    gsmConnectionState = GsmConnectionState::POWER_CYCLING;
    lastGprsConn = millis();
    char buff[20];
    sprintf(buff, "%s", GSM_WARNINGS[0]);
    sendWarning(buff, getId());
    log_e("Max time no connection, powercycling GSM");
  }

  switch (gsmConnectionState)
  {
  case GsmConnectionState::UNKNOWN :
    #ifdef DEBUG_GSM
      log_d("####### GSM status UNKNOWN #######");
    #endif
    gsmConnectionState = GsmConnectionState::POWER_ON;
    break;
  case GsmConnectionState::POWER_ON :
    #ifdef DEBUG_GSM
      log_d("####### GSM status POWER_ON #######");
    #endif
    if (powerOn()) {
      gsmConnectionState = GsmConnectionState::INITIALIZING; 
      char buff[20];
      sprintf(buff, "%s", GSM_SUCCESS[0]);
      sendWarning(buff, getId()); 
    }
    break;
  case GsmConnectionState::POWER_CYCLING :
    #ifdef DEBUG_GSM
      log_d("####### GSM status POWER_CYCLING #######");
    #endif
    if (powerCycle()) {
      gsmConnectionState = GsmConnectionState::INITIALIZING;
      char buff[20];
      sprintf(buff, "%s", GSM_WARNINGS[6]);
      sendWarning(buff, getId());
    } 
    break;
  case GsmConnectionState::INITIALIZING :
    #ifdef DEBUG_GSM
      log_d("####### GSM status INITIALIZING #######");
    #endif
    if (initGsm()) {
      gsmConnectionState = GsmConnectionState::SETTING_NETWORK_MODE;
      char buff[20];
      sprintf(buff, "%s", GSM_SUCCESS[1]);
      sendWarning(buff, getId());
    } 
    break;
  case GsmConnectionState::SETTING_NETWORK_MODE :
    #ifdef DEBUG_GSM
      log_d("####### GSM status SETTING_NETWORK_MODE #######");
    #endif
    if (setNetworkMode(NetworkMode::AUTOMATIC)) {
      gsmConnectionState = GsmConnectionState::CHECK_NETWORK_AVAILABLE;
    }
    break;
  case GsmConnectionState::CHECK_NETWORK_AVAILABLE :
    #ifdef DEBUG_GSM
      log_d("####### GSM status CHECK_NETWORK_AVAILABLE #######");
    #endif
    if (networkAvailable()) {
      gsmConnectionState = GsmConnectionState::CONNECTING_GPRS;
      char buff[20];
      sprintf(buff, "%s", GSM_SUCCESS[6]);
      sendWarning(buff, getId());
    } else {
      gsmConnectionState = GsmConnectionState::WAITING_FOR_NETWORK_AVAILABLE;
      startWaitTimeCheckNetworkMs = millis();
      char buff[20];
      sprintf(buff, "%s", GSM_WARNINGS[5]);
      sendWarning(buff, getId());
    }
    break;
  case GsmConnectionState::WAITING_FOR_NETWORK_AVAILABLE :
    if ( (millis() - startWaitTimeCheckNetworkMs) > (WAIT_TIME_CHECK_NETWORK_SEC * ONESECOND)) {
      #ifdef DEBUG_GSM
        log_d("####### GSM status WAITING_FOR_NETWORK_AVAILABLE #######");
      #endif
      gsmConnectionState = GsmConnectionState::CHECK_NETWORK_AVAILABLE;
    }
    break;
  case GsmConnectionState::CONNECTING_GPRS :
    #ifdef DEBUG_GSM
      log_d("####### GSM status CONNECTING_GPRS #######");
    #endif
    if (connectGPRS()) { 
      char buff[20];
      sprintf(buff, "%s", GSM_SUCCESS[5]);
      sendWarning(buff, getId());
      //sync time once when first connected
      syncNtpUtcTime(true);
      if (mqttEnabled) {
        gsmConnectionState = GsmConnectionState::CONNECTING_MQTT;
      } else {
        gsmConnectionState = GsmConnectionState::CONNECTED;
        #ifdef DEBUG_GSM
          log_d("####### GSM status CONNECTED #######");
        #endif
      }
    } else {
      gsmConnectionState = GsmConnectionState::WAITING_FOR_GPRS_AVAILABLE;
      startWaitTimeCheckGprsMs = millis();
      char buff[20];
      sprintf(buff, "%s", GSM_WARNINGS[4]);
      sendWarning(buff, getId());
    }
    break;
  case GsmConnectionState::WAITING_FOR_GPRS_AVAILABLE :
    if ( (millis() - startWaitTimeCheckGprsMs) > (WAIT_TIME_CHECK_GPRS_SEC * ONESECOND) ) {
      #ifdef DEBUG_GSM
        log_d("####### GSM status WAITING_FOR_GPRS_AVAILABLE #######");
      #endif
      gsmConnectionState = GsmConnectionState::CONNECTING_GPRS;
    }
    break;
  case GsmConnectionState::CONNECTING_MQTT :
    #ifdef DEBUG_GSM
      log_d("####### GSM status CONNECTING_MQTT #######");
    #endif
    if (!connectMQTT()) {
      startWaitTimeCheckMqttMs = millis();
      mqttCallFailed = true;
      char buff[20];
      sprintf(buff, "%s", GSM_WARNINGS[3]);
      sendWarning(buff, getId());
    } else {
      char buff[20];
      sprintf(buff, "%s", GSM_SUCCESS[4]);
      sendWarning(buff, getId());
      mqttCallFailed = false;
    }
    gsmConnectionState = GsmConnectionState::CONNECTED;
    #ifdef DEBUG_GSM
      log_d("####### GSM status CONNECTED #######");
    #endif
    break;
  case GsmConnectionState::CONNECTED : {
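    // Handle NTP/UTC time sync.
    // Use a signed type: with an unsigned byte, a -1 return value arrives as 255
    // and the time-out check below would never match.
    int8_t syncResult = static_cast<int8_t>(syncNtpUtcTime());
    if (syncResult == -1) {  //sync failed time-out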
      #ifdef DEBUG_GSM
        log_w("NTP_SYNC time-out");
      #endif
      char buff[20];
      sprintf(buff, "%s", GSM_WARNINGS[1]);
      sendWarning(buff, getId());
      gsmConnectionState = GsmConnectionState::CHECK_NETWORK_AVAILABLE;
      return;
    }else if (syncResult > 0) { //sync success
      #ifdef DEBUG_GSM
        log_d("NTP_SYNC success");
      #endif
      char buff[20];
      sprintf(buff, "%s", GSM_SUCCESS[2]);
      sendWarning(buff, getId());
      lastGprsConn = millis();
    }

    //handle MQTT
    if (mqttEnabled) {
      if (handleMqttActions()) {
        mqttCallFailed = false;
        lastGprsConn = millis();
      } else {
        if (!mqttCallFailed){
          //set failed timer on 1st failure
          startWaitTimeCheckMqttMs = millis();
        }
        mqttCallFailed = true;
      }

      if ( (millis() - startWaitTimeCheckMqttMs) > (WAIT_TIME_CHECK_MQTT_SEC * ONESECOND) && mqttCallFailed) {
        mqttCallFailed = false;
        mqttFailures++;
        #ifdef DEBUG_GSM
          log_d("MQTT send failed");
        #endif
        mqttClient.disconnect();
        if(!networkAvailable()) {
          gsmConnectionState = GsmConnectionState::CHECK_NETWORK_AVAILABLE;
        } else if(!sim7000.isGprsConnected()) {
          gsmConnectionState = GsmConnectionState::CONNECTING_GPRS;
        } else {
          char buff[20];
          sprintf(buff, "%s", GSM_WARNINGS[2]);
          sendWarning(buff, getId());
          if(mqttFailures >= 3) {
            mqttFailures = 0;
            gsmConnectionState = GsmConnectionState::CONNECTING_GPRS;
          } else {
            gsmConnectionState = GsmConnectionState::CONNECTING_MQTT;
          }
        }  
      }
    }

    //Handle other (future) GPRS actions
    //...
    break;
  }
  default:
    log_e("####### Unknown GSM state #######");
    gsmConnectionState = GsmConnectionState::UNKNOWN;
    break;
  }
}

This is the method that calls the pubsubclient loop (receiving and ping requests) and sends the MQTT messages that were queued:

bool Sim7000Sensor::handleMqttActions() {
  bool result = false;
  if (takeGsmSerialSemaphore("mqttLoop")) {
    result = mqttClient.loop();
    giveGsmSerialSemaphore();
  } 
  if (result) {
    sendQueuedMessage();
  }
  return result;
}

I-Connect commented 2 years ago

I enabled AT logging to a file to get more data:

This is where I suspect it starts to fail:

AT+CARECV?
+CARECV: 0,0
+CARECV: 1,0
OK
+CASTATE: 0,0

The +CASTATE: 0,0 is returned unsolicited, without my querying the state. According to the AT command manual (the screenshot is not preserved here), it seems the connection is "closed by remote server or internal error". Unfortunately, the SIM7000 does not allow me to set or query the CACFG autoclose value.
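To react to this notification as soon as it appears, the modem output can be checked line by line. The following is a minimal sketch, assuming the line has already been read from the serial stream and stripped of its leading and trailing CR/LF; `parseCaState` is a hypothetical helper and not part of TinyGSM.

```cpp
#include <cstdio>

// Hypothetical helper: parses an unsolicited "+CASTATE: <cid>,<state>" line
// as it appears in the AT log above. Returns true and fills cid/state only
// when the line has exactly that shape; otherwise the outputs are untouched.
bool parseCaState(const char* line, int& cid, int& state) {
  int parsedCid = 0;
  int parsedState = 0;
  if (std::sscanf(line, "+CASTATE: %d,%d", &parsedCid, &parsedState) != 2) {
    return false;
  }
  cid = parsedCid;
  state = parsedState;
  return true;
}
```

A state of 0 for the cid used by MQTT could then move the state machine straight to its reconnect path instead of waiting for the MQTT time-out.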

Obviously this can happen; there are many components that could disturb the connection. The only odd thing is that it happens every time after approx. 14 hours.

As you can see in the code above, I do catch this and eventually reboot the SIM7000, after which everything works again for 14 hours.

But before rebooting I first try to reconnect MQTT, and if that fails 3 times I reconnect GPRS. The MQTT reconnect always fails, even after a successful GPRS reconnection (I know it is successful because the NTP sync goes OK).

MQTT reconnecting:

AT+CASSLCFG=0,ssl,0
OK
AT+CASSLCFG=0,protocol,0
OK
AT+CSSLCFG="sni",0,"server_url"
OK
AT+CAOPEN=0,"server_url",1883
+CAOPEN: 0,4
OK

The result code 4 means "Parameter invalid", but this is the same set of parameters I send each time (including when it succeeds).

The only thing I can think of is that something is wrong with cid 0 (the first parameter).
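To log this case separately from other failures, the +CAOPEN response can be classified explicitly. This is a rough sketch: `classifyCaOpen` and the enum are hypothetical names, the meaning of code 4 is taken from the manual as quoted above, and treating 0 as success is an assumption.

```cpp
#include <cstdio>

// Hypothetical classification of a "+CAOPEN: <cid>,<result>" response line.
enum class CaOpenOutcome { Success, ParameterInvalid, OtherError, NotACaOpenLine };

CaOpenOutcome classifyCaOpen(const char* line, int expectedCid) {
  int cid = 0;
  int result = 0;
  // Reject anything that is not a +CAOPEN line for the cid we opened.
  if (std::sscanf(line, "+CAOPEN: %d,%d", &cid, &result) != 2 || cid != expectedCid) {
    return CaOpenOutcome::NotACaOpenLine;
  }
  if (result == 0) return CaOpenOutcome::Success;           // assumption: 0 means success
  if (result == 4) return CaOpenOutcome::ParameterInvalid;  // "Parameter invalid", as seen in the log
  return CaOpenOutcome::OtherError;                         // all other codes are not distinguished here
}
```

Counting how often `ParameterInvalid` follows a +CASTATE: 0,0 would show whether the two are always linked.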

Any thoughts on how I can resolve this without rebooting the module?

thx!

romoloman commented 2 years ago

I had a similar problem on a SIM7600E until I forced time synchronization with the network by issuing an AT+CTZU=1 command. Since my modem has been synchronized with the network, I haven't experienced the issue again.

I-Connect commented 2 years ago

thx

The SIM7000 documentation does not list CTZU as an AT command.

I do already update the time:

sim7000.NTPServerSync("ntp.time.nl", 0);

byte NTPServerSync(String server = "pool.ntp.org", byte TimeZone = 3) {
    // Set GPRS bearer profile to associate with NTP sync
    sendAT(GF("+CNTPCID=1"));
    if (waitResponse(10000L) != 1) { return -1; }

    // Set NTP server and timezone
    sendAT(GF("+CNTP="), server, ',', String(TimeZone));
    if (waitResponse(10000L) != 1) { return -1; }

    // Request network synchronization
    sendAT(GF("+CNTP"));
    if (waitResponse(10000L, GF(GSM_NL "+CNTP:"))) {
      String result = stream.readStringUntil('\n');
      Serial.print("NTP server sync result");
      Serial.println(result);
      result.trim();
      if (isValidNumber(result)) { return result.toInt(); }
    } else {
      return -1;
    }
    return -1;
  }

But I realize I sync using timezone 0 (UTC) and only later in the code correct that to the local timezone for use in the application.

Could the time difference between the GSM module (UTC) and the MQTT server (UTC+2) cause the connection to fail over time?

I will give UTC+2 a try and post the results.
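A side note on the return type: NTPServerSync above is declared to return byte but returns -1 on failure. A byte is unsigned, so the caller receives 255, and a byte variable compared against -1 never matches. The sketch below shows one way to recover the signed value; `toSignedResult` is a hypothetical helper, not part of TinyGSM.

```cpp
#include <cstdint>

// Hypothetical helper: reinterprets a result that travelled through an
// unsigned 8-bit `byte` as a signed value, so a failure written as
// `return -1` (which arrives as 255) compares equal to -1 again.
int toSignedResult(std::uint8_t raw) {
  return raw >= 128 ? static_cast<int>(raw) - 256 : static_cast<int>(raw);
}
```

With this, a check such as `toSignedResult(sim7000.NTPServerSync("ntp.time.nl", 0)) == -1` would detect the failure path, while small positive result codes pass through unchanged.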

I-Connect commented 2 years ago

Unfortunately, setting the correct timezone did not resolve the issue.

Again, after approx. 14 hours it loses the connection, and after a reboot of the SIM7000 the connection is up again.

SRGDamia1 commented 2 years ago

Were you able to resolve this?

I-Connect commented 2 years ago

No. I suspect the connection is reset somewhere by a provider. I am planning to move the backend with the MQTT broker to a cloud platform; hopefully that will resolve it.

roysG commented 2 years ago

I have the same issue, but instead of every 14 hours I hit the problem every 60 minutes. I live in Israel, and there everything works properly. A few days ago I travelled to Czechia, and there the connection is reset after 60 minutes.

I really don't understand why this happens or how to fix it. If someone has ideas to try, let me know.

I am using a SIM7070G. I have not yet tried synchronizing the network time; can someone help with that?

If you have other suggestions, I am open to hearing them. Thanks.

roysG commented 2 years ago

The problem is solved. If someone has this issue in the future, please read about the AT+CRATSRCH command in the SIM7070G/SIM7080G documentation.

roysG commented 2 years ago

Hi, the problem is not fully solved. I just restart the network connection before the maximum default timer is reached.

It sounds so stupid, but on the SIM7070G, when the network is GSM, AT+CRATSRCH is enabled by default.

I still have not found a way to turn it off.
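The workaround above (restarting the network connection shortly before the timer expires) could be sketched roughly like this. `shouldRefreshConnection` and its parameters are hypothetical; the 60-minute interval comes from my observation, and the safety margin is something to tune.

```cpp
#include <cstdint>

// Hypothetical helper: returns true once the connection has been up long
// enough that it is within safetyMarginMs of the observed drop interval.
// Unsigned subtraction keeps the elapsed-time calculation correct even
// when a millis()-style counter wraps around.
bool shouldRefreshConnection(std::uint32_t nowMs,
                             std::uint32_t connectedAtMs,
                             std::uint32_t dropIntervalMs,
                             std::uint32_t safetyMarginMs) {
  if (safetyMarginMs >= dropIntervalMs) {
    return true;  // margin covers the whole interval: always refresh
  }
  const std::uint32_t elapsedMs = nowMs - connectedAtMs;
  return elapsedMs >= (dropIntervalMs - safetyMarginMs);
}
```

For example, with a drop interval of 3600000 ms and a margin of 300000 ms, the network connection would be restarted once it has been up for 55 minutes.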