arduino-libraries / MKRGSM

GNU Lesser General Public License v2.1

MKRGSM library not reliable #66

Open alanoatwork opened 5 years ago

alanoatwork commented 5 years ago

I've posted this before, but I'll try again. My code is pretty simple. I check for SMS messages and then reply back using Hologram's network. I'm using the latest MKRGSM library, 1.3.1. I also have a 1400 mAh lithium battery connected, so I'm confident that I don't have a hardware issue related to modem current.

Here's my code:

#include <MKRGSM.h>
const char PINNUMBER[] = " ";
const char GPRS_APN[] = "hologram";
const char GPRS_LOGIN[] = " ";
const char GPRS_PASSWORD[] = " ";

String HOLOGRAM_DEVICE_KEY = "********";
String HOLOGRAM_TOPIC = "_SOCKETAPI_";

GSMClient client;
GPRS gprs;
GSM gsm(1);                                             // Enable debug
GSM_SMS sms;
GSMScanner scan;

char server[] = "cloudsocket.hologram.io";
int port = 9999;
boolean isSMSAvailable = false;
char sms_message[145];

void setup() {
  Serial.begin(115200);
  //while(!Serial);

  scan.begin();
  connectGSM();
}

void connectGSM() {
  boolean connected = false;

  while (!connected) {
    Serial.println("Begin GSM Access");

    if ((gsm.begin() == GSM_READY) &&
        (gprs.attachGPRS(GPRS_APN, GPRS_LOGIN, GPRS_PASSWORD) == GPRS_READY)) {
      connected = true;
      Serial.println("GSM Access Success");
      Serial.println(scan.getCurrentCarrier());
    } 
    else {
      Serial.println("Not connected");
      delay(1000);
    }
  }
}

void loop() {
  if(Serial.available()) {
    char c = Serial.read();
    if(c == 'e')
      MODEM.debug();
    if(c == 'd')
      MODEM.noDebug();
  }

  // Get any new incoming txt messages
  int c;
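  if (sms.available()) {
    size_t i = 0;
    // Drop bytes beyond the buffer so the terminator always fits
    while ((c = sms.read()) != -1) {
      if (i < sizeof(sms_message) - 1) {
        sms_message[i++] = (char)c;
      }
    }
    sms_message[i] = '\0';        // Terminate message
    isSMSAvailable = true;
    sms.flush();
  }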

  if(gsm.isAccessAlive()) {
    if(gprs.status() != GPRS_READY) {
      if(gprs.attachGPRS(GPRS_APN, GPRS_LOGIN, GPRS_PASSWORD) == GPRS_READY)
        Serial.println("GPRS ready!");
      else
        Serial.println("GPRS not ready!");
    }
  }
  else {
    Serial.println("Reconnect to GSM...");
    connectGSM();
  }

  // Send message back through hologram
  if(isSMSAvailable) {
    isSMSAvailable = false;

    if (client.connect(server, port)) {
      client.print("{\"k\":\"" + HOLOGRAM_DEVICE_KEY + "\",\"d\":\"");
      client.print(sms_message);
      client.println("\",\"t\":\""+HOLOGRAM_TOPIC+"\"}");
      client.stop();
    }
    else {
      MODEM.send("AT+USOER");
    }
  }

  delay(1000);
}

It takes anywhere from a few days to a week or more to exhibit the problem. My logs typically look like this, where the incoming SMS message, "Jdjd", gets received and then repeated back to me via Hologram's network.

OK
AT+CMGL="REC UNREAD"

+CMGL: 19,"REC UNREAD","+19495472010",,"18/11/14,03:53:24+00"
Jdjd

OK
AT+CMGD=19

OK
AT+CREG?

+CREG: 0,5

OK
AT+USOCR=6

+USOCR: 0

OK
AT+USOCO=0,"cloudsocket.hologram.io",9999

OK
AT+USOWR=0,21,"7B226B223A22433E383375242B57222C2264223A22"

+USOWR: 0,21

OK
AT+USOWR=0,4,"4A646A64"

+USOWR: 0,4

OK
AT+USOWR=0,20,"222C2274223A225F534F434B45544150495F227D"

+USOWR: 0,20

OK
AT+USOWR=0,2,"0D0A"

+USOWR: 0,2

OK
AT+USOCL=0

OK
AT+CMGL="REC UNREAD"

OK
AT+CREG?

+CREG: 0,5
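
The hex strings in the AT+USOWR lines above are the bytes the sketch wrote to the socket. As an illustration only, a small standalone C++ helper can turn them back into text, which makes it easier to check what was actually sent. The name decodeHexPayload is hypothetical and not part of the MKRGSM library.

```cpp
#include <cstddef>
#include <string>

// Decode the hex-encoded payload of an AT+USOWR command back into raw bytes.
// Returns an empty string if the input has odd length or a non-hex character.
std::string decodeHexPayload(const std::string& hex) {
  auto nibble = [](char ch) -> int {
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return -1;
  };
  if (hex.size() % 2 != 0) return "";
  std::string out;
  for (std::size_t i = 0; i < hex.size(); i += 2) {
    int hi = nibble(hex[i]);
    int lo = nibble(hex[i + 1]);
    if (hi < 0 || lo < 0) return "";
    out.push_back(static_cast<char>(hi * 16 + lo));
  }
  return out;
}
```

Applied to the log, "4A646A64" decodes to Jdjd (the SMS body) and "0D0A" to the CR LF that println appends.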

However, after a week or so this happens:

OK
AT+CMGL="REC UNREAD"

+CMGL: 19,"REC UNREAD","+19495472010",,"18/11/20,19:51:00+00"
JDJDJ

OK
AT+CMGD=19

OK
AT+CREG?

+CREG: 0,5

OK
AT+USOCR=6

+USOCR: 0

OK
AT+USOCO=0,"cloudsocket.hologram.io",9999

ERROR

+UUSOCL: 0
AT+USOCL=0

ERROR
AT+CMGL="REC UNREAD"

OK
AT+CREG?

+CREG: 0,5

OK
AT+CMGL="REC UNREAD"

OK
AT+CREG?

+CREG: 0,5

From then on, the library is never able to recover.
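
One possible mitigation at the sketch level is to count consecutive failed connection attempts and power-cycle the modem once a limit is reached. The following is a minimal standalone sketch of that bookkeeping only. The ConnectWatch name and the limit of 3 are hypothetical, and what a "restart" consists of is left to the calling sketch. It would not help if the board itself locks up.

```cpp
// Counts consecutive failed connection attempts and signals when a
// modem restart should be tried. All names here are hypothetical.
struct ConnectWatch {
  int failures = 0;
  int limit;

  explicit ConnectWatch(int maxFailures) : limit(maxFailures) {}

  // Record the outcome of one connect attempt.
  // Returns true when the failure limit has just been reached.
  bool record(bool connected) {
    if (connected) {
      failures = 0;
      return false;
    }
    ++failures;
    if (failures >= limit) {
      failures = 0;  // start counting again after the restart
      return true;
    }
    return false;
  }
};
```

In the sketch above, record() could be fed the result of client.connect(server, port), with the caller restarting the modem whenever it returns true.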

Any help would be greatly appreciated. The code I submitted is a stripped down version of my application. I've been wanting to release my application using the MKRGSM board but I've been haunted by this issue. I see there's some asynchronous options that aren't well documented and I'm reluctant to use this approach - besides, shouldn't this library be rock solid as documented!

Nels52 commented 5 years ago

@eiriksels It appears that the issue you are working now has nothing to do with MKRGSM library hangs and is a hardware issue similar to those referenced earlier by @B-Clever. Issue #81 also appears to be similar.

Just ignore my earlier debug recommendation since you and Arduino tech support appear to be on the right track by bypassing the PTC (I am guessing that the PTC being bypassed is the one for the battery connection).

As an interesting aside I don't see this issue for one or more of the following reasons:

  1. My device is not mobile so I don't have handoffs between networks.
  2. When I look at my device on the Hologram web site I see it always connects with a 3G ATT Mobility network. The poster for Issue #81 only sees issues with 2G networks.
  3. I do not use a LiPo battery to power my MKR GSM 1400. Because it is stationary I use a 5V adapter which connects via the VIN pin. If I am reading the MKR GSM 1400 circuit diagram correctly it appears that only the battery connection and the 5V output pin have a PTC to regulate the current.

Good luck! Hopefully this hardware fix will work for you.

StefanOelsner commented 5 years ago

@Nels52 The same is true for issue #81. My MKR 1400 is powered through VIN from a well-filtered 2 A source. At least for me it doesn't seem to be an isolated hand-off problem, as it won't connect to 2G at all, even if I force it to GSM only (with good 2G coverage).

I wonder if hangs especially in moving conditions might be due to locally non available 3G network and the unit unsuccessfully trying to fall back to 2G...

Has anybody already established a working 2G connection with the MKR 1400?

alanoatwork commented 5 years ago

The PTC is a resettable fuse with a finite resistance, which I'm guessing the designers have concluded results in a voltage drop in certain conditions that exceeds the design minimum. When I performed my mobile trials my LiPo battery was fully charged and I was powering the Arduino via the battery. However, these batteries are probably not all the same, with some exhibiting more or less ESR. I intend to power my project directly from Vin and avoid this issue altogether. @eiriksels - thanks for the update!

B-Clever commented 5 years ago

@StefanFiorese Yes, we have a working 2G connection on 900 MHz on T-Mobile. When stationary it seems to work fairly well, but after moving around a little (less than 500 meters is enough) the MKR hangs again. Reception is about 27 of 31 units. But sometimes it is impossible to connect to 2G at all because the MKR hangs right on startup (I can hear the cellular network connection attempt in my computer speakers, then nothing). There were days on which it was impossible to get a connection at all. The Base Transceiver Station can instruct the Mobile Station to transmit at one of 15 power levels, from a maximum of 2 W down to 3.2 mW. On the 1800 MHz band the range is from a maximum of 1 W down to 1.0 mW. But I think the first transmit burst is always at maximum power.
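
The quoted power figures fit a ladder of levels spaced evenly on a dB scale. As a worked check, assuming a 33 dBm maximum and 2 dB steps on 900 MHz (these two values are assumptions, chosen because they reproduce the 2 W, 3.2 mW and 15-level figures), the arithmetic looks like this:

```cpp
#include <cmath>

// Convert a power level in dBm to milliwatts.
double dbmToMilliwatts(double dbm) { return std::pow(10.0, dbm / 10.0); }

// Number of power levels between maxDbm and minDbm, inclusive,
// assuming a fixed step size in dB.
int powerLevelCount(int maxDbm, int minDbm, int stepDb) {
  return (maxDbm - minDbm) / stepDb + 1;
}
```

With these assumptions, 33 dBm is just under 2000 mW, 5 dBm is about 3.16 mW, and stepping from 33 dBm down to 5 dBm in 2 dB steps gives 15 levels. For the 1800 MHz figures, 30 dBm is 1000 mW and 0 dBm is 1 mW.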

I'm not sure whether bypassing the fuse will solve the problems. I powered the device with a >2000 mAh 18650 LiPo and with a 5 V / 5 A power source through the VIN pin. I assume the fuse will not make a difference, because power should be delivered through the VIN pin if the battery is too weak or the fuse has tripped.

Working in 3G mode doesn't seem to be a problem. The transmit power is much lower. Tested here with Vodafone.

I am not an RF engineer, but I suspect something like a PCB trace whose cross-section is too thin to deliver the peak current, or a PCB trace with an unfortunate length relative to the 2G wavelength (e.g. λ/4 or λ/8) that resonates and picks up the transmitted RF power, causing strange effects in the MKR circuits.

Anyway, this makes it very complicated to narrow down the problem: various reception scenarios, different antennas, different power sources, etc. I hope the MKR NB 1500 will do a better job; mine arrived a few weeks ago.

eiriksels commented 5 years ago

@Nels52 Yes, I think your cause number 1 is the primary. I have never had any hangs with the unit being stationary with the current library setup.

@StefanFiorese What kind of sketch are you running now? I believe that @alanoatwork, @Nels52 and I are now running the asynchronous connection based on @Nels52's HomeMonitor V5 sketch. As far as I understand, @Nels52 has had zero hangs with this. @alanoatwork observed a hang when driving around, as have I. I have not seen hangs since shorting the PTC, but I have only run it for 3 days so far with transmissions every 2 minutes, so it is still too soon to conclude. I also need to drive longer distances and into areas with poor coverage.

If you are not running this sketch but the regular approach, then a connection timeout of e.g. 20 seconds worked to some extent for me: gprs.setTimeout(20000); gsm.setTimeout(20000);. I do not know whether you have these implemented.
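
The setTimeout calls bound how long the library waits. If a sketch adds its own bounded waits on top, the elapsed-time check should survive wraparound of the 32-bit millisecond counter (millis() on Arduino). A minimal standalone sketch of such a check follows; the function name timedOut is hypothetical.

```cpp
#include <cstdint>

// True once at least timeoutMs milliseconds have passed since startMs.
// Unsigned subtraction keeps the result correct when the millisecond
// counter wraps around from 0xFFFFFFFF to 0.
bool timedOut(uint32_t nowMs, uint32_t startMs, uint32_t timeoutMs) {
  return static_cast<uint32_t>(nowMs - startMs) >= timeoutMs;
}
```

Comparing the difference rather than an absolute deadline is what makes this safe across the wrap, which happens roughly every 49.7 days of uptime.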

StefanOelsner commented 5 years ago

@eiriksels I'm using the "ReceiveSMS" example sketch from the library in debug mode so I can eliminate any code-related problems I might have caused. It runs in synchronous mode. I did use gsm.setTimeout(30000). It helped to avoid a complete hang, but it did not help to get connected to 2G at all. It just called gsm.begin() every 30 seconds without ever actually connecting. Occasionally this is the case with 3G as well: it hangs after a couple of AT+CREG? commands that report +CREG: 0,0 back, and there is a final AT+CREG? that doesn't get answered. When successfully connecting to 3G I see many more AT+CREG? entries that eventually give +CREG: 0,1 back, followed by AT+UCALLSTAT=1.

I will use the asynchronous approach and lower the baud rate in the library. Maybe calling AT+CREG? in very short succession doesn't help... Will report back on the findings. I also will have a look at the current draw when trying to connect to 2G.

@Nels52 I couldn't find the unmodified version of your HomeMonitor V5 code. I'm new to GitHub; is it still available? Thank you!

Nels52 commented 5 years ago

@StefanFiorese My complete sketch is shown in Issue #27 in a posting I made on Jan 17. The sketch implements the GSM, GPRS, and GSMClient state machines asynchronously. I did this in order to have more control in the progression of these state machines as my sketch established GSM and GPRS network connections and GSMClient TCP connections to the Hologram server. The key routines are startWebClient() which drives the GSM and GPRS state machines asynchronously and connectClient() which drives the GSMClient() state machine asynchronously.

This implementation avoided any possible library hangs. That being said it looks like you are experiencing hardware issues that may be related to power draw as documented by other posters in this thread.

Anyway, feel free to try it and Good Luck!

StefanOelsner commented 5 years ago

@Nels52 Thank you very much.

eiriksels commented 5 years ago

Hi all.

So, I managed to get a hang on the board again. It is now much more stable with the bypassed PTC, but it was still possible to hang it after driving all day yesterday. It is still the same kind of hang that I have seen earlier: the board only fires AT+CREG? requests for about 4-5 seconds, and then either the modem hangs or the Arduino stops firing the AT+CREG? requests. I think it is a bit odd, because it does look like all requests are answered by the modem. The last I see is:

0:42:38.388 -> AT+CREG?
10:42:38.388 -> +CREG: 0,0
10:42:38.388 ->
10:42:38.388 -> OK

I do have the following question with my recent troubleshooting:

If I increase the time between each AT command to 1 sec by:

...
do {
  gsmReadyStatus = gsmAccess.ready();
  startWebClientInitializationCount++;
  delay(1000);  // Changed this from the 100 ms that was in the HomeMonitor sketch originally
} while (gsmReadyStatus == 0);
...

In the startWebClient().

Then I almost always get GSM registered on the first AT command. If this is somehow a power-related issue, could increasing the time between commands give the modem room to "breathe" a bit more and maybe prevent this hang?

What do you think guys?

StefanOelsner commented 5 years ago

@eiriksels I see the same behavior in my setup. I seem to get an answer from the modem, but there isn't any communication between the Arduino and the modem after that.

As far as I understand, AT+CREG? doesn't register the modem with the network; it just asks the modem whether it is already registered. Increasing the interval to 1000 ms just gives the modem more time in the background to actually connect to the network. So it may well be that you get a registration confirmation on your first AT+CREG? request.
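
Since AT+CREG? only reports status, code that wants to know when registration has succeeded has to interpret the reply. A minimal standalone sketch of that parsing follows, with hypothetical helper names. The meaning of status values 1 and 5 follows the standard +CREG definition.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Extract <stat> from a "+CREG: <n>,<stat>" line; returns -1 if not found.
int parseCregStat(const std::string& line) {
  int n = 0;
  int stat = 0;
  std::size_t pos = line.find("+CREG:");
  if (pos == std::string::npos) return -1;
  if (std::sscanf(line.c_str() + pos, "+CREG: %d,%d", &n, &stat) != 2) return -1;
  return stat;
}

// 1 = registered on the home network, 5 = registered while roaming.
bool isRegistered(int stat) { return stat == 1 || stat == 5; }
```

Under this reading, the +CREG: 0,5 lines in the logs at the top of the thread mean "registered, roaming", while +CREG: 0,0 means the modem is not registered.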

You could also simulate the registration sequence by using the AT-Send example and put in the commands by hand. I did that and managed to get the same result. At some point I couldn't get anything through to the modem anymore. The serial connection seemed to be interrupted somehow...

eiriksels commented 5 years ago

Ok, I understand. I do have some questions though:

Do you not think that the AT requests put extra load on the modem?

If it is the modem that fails, shouldn't the last line of communication be the AT+CREG? command? Or are you saying that it is actually the serial communication that is broken, which means that the command is not even transferred to the modem?

When I tried the watchdog reset, it still seemed to fail at more or less exactly the same point, at AT+CREG?, although it responds very well to all the commands before that. I am considering starting the setup routine with a modem shutdown --> wait --> startup. This is because I am guessing that even with the watchdog reset of the controller, the modem might be in some sort of "unhealthy" state.

Nels52 commented 5 years ago

@eiriksels said

I think it is a bit odd because it does look like all requests are answered from the modem. The last I see is a:

0:42:38.388 -> AT+CREG?
10:42:38.388 -> +CREG: 0,0
10:42:38.388 ->
10:42:38.388 -> OK

Looking at the Modem.cpp logic, the ModemClass::poll() method traces the command echoed by the modem, not the command sent. Therefore, the AT+CREG? command could have been issued to the modem with no response received.

If you want to verify that the command was sent, insert Serial.println("Sending AT+CREG?") in the GSM::ready() method of GSM.cpp:

    case READY_STATE_CHECK_REGISTRATION: {
      MODEM.setResponseDataStorage(&_response);
      Serial.println("Sending AT+CREG?");
      MODEM.send("AT+CREG?");
      _readyState = READY_STATE_WAIT_CHECK_REGISTRATION_RESPONSE;
      ready = 0;
      break;
    }

This will allow you to match AT+CREG? command sends with AT+CREG? command responses.

eiriksels commented 5 years ago

Today I was out driving for 4 hours. I experienced a hang in an area with low signal reception. The hang was so bad that even the charge light did not come on when I attached the 5 V micro USB connection, and the computer did not recognize that anything had been connected. My blinking LED (which I have implemented in my code) had also stopped, so there was no program running. The only recovery was to manually press the reset button, and everything came back to life.

alanoatwork commented 5 years ago

@eiriksels, were you battery powered during this trip? If so, then even though you shorted out the PTC (which gave you a bit more headroom), perhaps your battery pack exhibits too much resistance to keep the voltage high enough for reliable operation after a few hours. Can you make a mobile trip while powering the board via the VIN pin with a 5 V supply rated for 2.5 A or more, or can you simply procure another LiPo battery, perhaps from a different source?

eiriksels commented 5 years ago

@alanoatwork Yes, it was powered only by my LiPo at that point. I check the voltage regularly and it was showing OK values compared to what I have seen before (analogRead(ADC_BATTERY) was at 915 out of 1023, which should correspond to approx. 3.85 V). But I might need to keep it almost fully charged to avoid hangs. I do have other, larger batteries that I could test.
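
The conversion behind that estimate can be written out. The figures in the post (915 counts, about 3.85 V) are consistent with a full-scale value of roughly 4.3 V at 1023 counts. That 4.3 V figure is inferred from those numbers rather than taken from the board documentation, so treat it as an assumption.

```cpp
// Full-scale voltage of the battery ADC channel. ASSUMPTION: 4.3 V,
// inferred from "915 of 1023 is approx. 3.85 V" in the post above.
constexpr double kFullScaleVolts = 4.3;
constexpr int kAdcMax = 1023;  // 10-bit ADC reading range is 0..1023

// Convert a raw ADC_BATTERY reading to volts.
double batteryVolts(int adcReading) {
  return adcReading * kFullScaleVolts / kAdcMax;
}
```

With this scale, a reading of 915 works out to about 3.846 V.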

Rocketct commented 5 years ago

@eiriksels A larger battery may not help. The suggested battery for the device is 1500 mAh. Consider that the SARA module works at 3.8 V, which is in line with your results: in bad service conditions the board's consumption increases, the battery discharges, and reaching this value could cause the hang. I have another question: does your sketch use an HTTP request to the Blynk server to communicate with the smartphone app? If so, Blynk library support for the MKRGSM has been released, which could replace the request and handle it transparently through the Blynk API.

@alanoatwork : could you share your complete restart procedure?(only the code section could be ok, i would like to understand which are the steps that you do to restart the module)

@Nels52: could you also patch the board and test it with both your power supply and the battery? This should increase stability. I say this because on my side a board with USB plus battery and the patch runs longer than a board supplied by USB alone. I have started some tests to monitor the UART communication in order to investigate the scenario described before.

About the 115200 question from a while ago: the command to change the baud rate is sent to the module only if the value differs from 115200, because 115200 is the module's native baud rate, so at that setting you will never see the AT command that changes it. Moreover, when you restart the module you have to set its baud rate back to match the SAMD's, because the module's baud rate resets to 115200. (I mention this because you are using an asynchronous solution and I don't know which steps you perform in the reconnection phase; skipping the baud rate change there could result in a hang.)

eiriksels commented 5 years ago

@Rocketct So does that mean I should implement some logic that shuts down the modem whenever my battery voltage is below e.g. 3.9 V?
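
Such a cut-off usually benefits from hysteresis, so the modem is not switched on and off repeatedly when the voltage hovers around the threshold. A minimal standalone sketch of the decision logic follows. The ModemPowerGate name is hypothetical, 3.9 V comes from the question above, and the 4.0 V re-enable level in the usage note is an arbitrary illustrative choice.

```cpp
// Decides whether the modem may be on, with hysteresis so the decision
// does not flap when the voltage hovers around the threshold.
// All names are hypothetical.
struct ModemPowerGate {
  double offBelow;      // switch off when the voltage drops below this
  double onAbove;       // switch back on only above this
  bool modemOn = true;  // assume the modem starts powered

  ModemPowerGate(double offBelowVolts, double onAboveVolts)
      : offBelow(offBelowVolts), onAbove(onAboveVolts) {}

  // Feed one voltage sample; returns whether the modem should be on.
  bool update(double volts) {
    if (modemOn && volts < offBelow) {
      modemOn = false;
    } else if (!modemOn && volts > onAbove) {
      modemOn = true;
    }
    return modemOn;
  }
};
```

For example, ModemPowerGate gate(3.9, 4.0) turns the modem off at 3.85 V and keeps it off at 3.95 V until the reading rises above 4.0 V.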

Yes I am using a simple GET request to upload my data and thereby getting it to my smartphone app. I have tried all kinds of different variants (Including the blynk library variants and TinyGSM library), but this has turned out to be the simplest and most robust way of getting my data through.

Rocketct commented 5 years ago

If continuous transmission of the data to the Blynk server is not required, then yes, that could be a good idea: something like `transmit -> modem shutdown and low power mode -> wake up and transmit`. Clearly it depends on your application.

Have you also tried this example? https://github.com/blynkkk/blynk-library/blob/master/examples/Boards_GSM/Arduino_MKRGSM/Arduino_MKRGSM.ino It was released before you opened the other issue here on GitHub.

eiriksels commented 5 years ago

@Rocketct I have not tried that one. I tried the TinyGSM version. As I only want one-way data transmission every X minutes, I think the HTTP request approach was more robust and used less data for my application. It also lets me control the attachment to GSM, which I need to do because of the hardware issues that must be in the board itself and that cause these hangs while moving the board around.

I do have one question regarding the hangs in the modem. Has anyone tried limiting the modem to 1 type of connectivity? You can read on page 99 in the AT command manual:

https://www.u-blox.com/sites/default/files/u-blox-CEL_ATCommands_%28UBX-13002752%29.pdf

If you ask for AT+URAT? you get that the modem is in:

1: GSM / UMTS (dual mode) by default

My question is: if I change this to either 0: GSM / GPRS / eGPRS (single mode) or 2: UMTS (single mode), is the modem then less likely to crash during the 3G/2G handovers that @B-Clever suggested as the cause, because they will never happen?

Nels52 commented 5 years ago

@Rocketct wrote: "Could you also patch the board and test it with both your power supply and the battery? This should increase stability. I say this because on my side a board with USB plus battery and the patch runs longer than a board supplied by USB alone."

I am wintering away so I don't have physical access to my MKRGSM 1400. Also, my only power source to the unit is a 5V power adapter connected to a wall socket on one end and to the VIN pin at the other end. This adapter is able to deliver up to 2.4 amps of current when necessary. I do not have a LiPo battery connected and from the MKRGSM 1400 circuit diagram it looks like the PTC only applies to the battery connection.

As a general observation this thread seems to have become a hodgepodge of the following issues:

  1. Power issues which appear to be aggravated by 2G/3G switching.
  2. 2G connectivity issues. 3G seems to work fine.
  3. UART interface hangs which appear to be related to issues 1. and 2.

As @B-Clever has observed it would be difficult to debug these issues. Here is a proposal for what it's worth:

  1. Use the AT+URAT command to restrict modem operation to 3G (UMTS) or at least provide a GSM::begin() option to do this. This may not be a big deal since 2G service seems to be going away.
  2. Upgrade your Arduino SAMD boards to 1.6.20. I see that the Uart::IrqHandler() method has a change to handle UART frame errors. This may or may not have anything to do with the UART interface hangs but it is best to have this change in case it does.
  3. If you are using a LiPo battery make sure it can provide adequate current and use the PTC bypass.
  4. See if you experience any hangs after these changes.
alanoatwork commented 5 years ago

@Nels52, can't operation be restricted to 3G by using the GSMBand class?

GSMBand band;

void setup() {
  band.setBand(GSM_MODE_UMTS);
}
Nels52 commented 5 years ago

@alanoatwork I am using the MKRGSM library version 1.3.1, and the GSMBand::setBand(String band) code in that version uses the AT+UBANDSEL command to set the band based on one of the following String values:

GSM_MODE_EGSM - 900 MHz
GSM_MODE_DCS - 1800 MHz
GSM_MODE_PCS - 1900 MHz
GSM_MODE_EGSM_DCS - 900, 1800 MHz
GSM_MODE_GSM850_PCS - 850, 1900 MHz
GSM_MODE_GSM850_EGSM_DCS_PCS - 850, 900, 1800, 1900 MHz

All other values are not valid and the GSMBand::setBand() method doesn't send the AT+UBANDSEL command and returns false to the caller.

I don't see any method that uses the AT+URAT command to set the Radio Access Technology. Are you using a different version of the MKRGSM library?

alanoatwork commented 5 years ago

@Nels52. I'm using 1.3.3 and they've added GSM_MODE_UMTS. There's also a setRAT method, but it's currently declared private. It looks like they haven't released these changes yet.

Nels52 commented 5 years ago

@alanoatwork Thanks for the info. I will download 1.3.3 when I get home.

alanoatwork commented 5 years ago

@alanoatwork : could you share your complete restart procedure?(only the code section could be ok, i would like to understand which are the steps that you do to restart the module)

@Rocketct, my sketch is based on @Nels52's HomeMonitor V5 that he's made reference to. I haven't added any code to restart the board. I haven't even added a WDT yet because I find solutions like these tend to cover-up other issues.

Nels52 commented 5 years ago

@Rocketct My complete sketch can be found in a Jan 17 post in Issue #27.

The main components are startWebClient() and connectClient(). startWebClient() drives the GSM and GPRS state machines asynchronously using the GSM::ready() and GPRS::ready() methods. connectClient() drives the GSMClient state machine asynchronously using the GSMClient::ready() method.

Rocketct commented 5 years ago

@alanoatwork @Nels52 ok, i will check thank you!

B-Clever commented 5 years ago

I have a serious problem with my MKR now. I bypassed the PTC and charged the LiPo to 4.15 V, powered the board in parallel through VIN, and am using SAMD 1.6.20 and MKRGSM 1.3.3.

I uploaded the BandManagement example code and selected 1 for GSM_MODE_EGSM to explicitly force it onto 900 MHz. That seemed to work. But now I am not able to select anything else or change it back to the previous state. (For me, it is clearly a problem to have the module operating on 2G.) If the board starts with the BandManagement code, the module seems to register itself on the network (I can hear it in my speakers) and hangs after a few seconds. After that, the Serial Monitor hangs and you can't select anything anymore. I tried to use the SerialGSMPassthrough example code to communicate with the SARA directly, but again the board hangs because the SARA attempts to register on the 2G network. The Serial Monitor hangs as well after a few seconds (when I hear the sound of the SARA transmitting in my speakers).

Now I can't get out of this state.

I read the u-blox AT Commands Manual, and there is a note for the SARA-U2 on page 119 (AT+UBANDSEL) that says in general: "To make the setting effective, the module must be deregistered and registered again (see Notes for the procedure to enter the detach state)." And for the SARA-U2: "Issue the AT+COPS=2 AT command to detach the module from the network."

I don't see where this is done in the GSMBand code.

B-Clever commented 5 years ago

I brought the MKR back to life:

  1. I removed the SIM card to prevent the MKR from registering on the 2G network.
  2. I uploaded the SerialGSMPassthrough example code (no broken serial monitor anymore, because there is no 2G hang).
  3. I set AT+URAT=2 to allow only 3G single mode operation.
  4. I set AT+UBANDSEL=900,1900,2100 to allow connecting on the respective frequencies.

The first startup was successful. Over the next weeks I will test in 3G-only mode, both stationary and while driving in a car. I think there is a serious problem using the MKR in URAT=1,0 mode or anything other than 3G only. As soon as it falls back from 3G to 2G, or starts up in 2G because there is no 3G reception, it hangs right away in my case.

alanoatwork commented 5 years ago

@B-Clever I wonder if you have tried or can try AT+URAT=1 to restrict the modem to GSM / UMTS (dual mode) rather than perhaps falling back to GSM / GPRS / eGPRS (single mode). Unreleased code in the repository can configure URAT to 1,2 rather than 1,0 using GSM_MODE_UMTS. For the moment there isn't a GSM_MODE_UMTS_ONLY option.

B-Clever commented 5 years ago

@alanoatwork I tried URAT=1,2 and URAT=1,0. With both I had problems and hangs. Now I have three MKRs running with URAT=2 and UBANDSEL=2100, and they are working fine right now. They are connected to three different operators (Vodafone, T-Mobile, o2). I went down to the basement and lost cellular reception, but the MKR recovered from that situation: it just reconnected upstairs, with no hang. The T-Mobile network is really weak at my location. On a scale from 0 to 31 it is between 0 and 3. But there were no hangs during the last few hours (and no reconnects either). Two MKRs will stay here, and the unit connected to the Vodafone network will be in my car during the next week. It is powered with a LiPo and through VIN (>4 A), with the PTC bypassed. By the way: only one of my three MKRs has a bypassed PTC.

I haven't tried URAT=1 yet. I will try that, but first the MKRs will stay in URAT=2 to see whether all of the hangs, even in extreme situations, are gone. If URAT=1 (2G/3G dual mode) leads to hangs again (with the same code), then for me it is clearly a problem with 2G, probably hardware related.

I could live with that situation until the 3G network is shut down at the end of 2020. After that, only 2G and LTE/5G will be available. Interestingly, the 2G network will stay much longer than 3G, because many M2M contracts with various network operators run for a very long time, I think mostly until 2025.
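
To put the 0 to 31 scale in context: assuming it is the standard AT+CSQ signal quality value (an assumption; the post does not say how it was read, and the exact mapping can differ in 3G mode), it maps to approximate received signal strength in dBm as follows. The function name csqToDbm is hypothetical.

```cpp
// Map a 0..31 signal-quality value to approximate dBm, using the
// conventional AT+CSQ mapping (0 = -113 dBm or less, 31 = -51 dBm or more).
// Returns 0 for out-of-range input such as 99 ("not known").
int csqToDbm(int csq) {
  if (csq < 0 || csq > 31) return 0;
  return -113 + 2 * csq;
}
```

Under that assumption, a value between 0 and 3 corresponds to roughly -113 to -107 dBm, while the 27 of 31 reported earlier in the thread corresponds to about -59 dBm.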

eiriksels commented 5 years ago

@B-Clever I have been hang free since I changed my URAT to UMTS single mode. Unfortunately, the 3G at my location will start shutting down already now in a few months, so it is not a long term solution for me and this board. I have been around a lot with many connections/disconnections to the network, but always been able to recover the signal by itself.

Just FYI, I have been testing out the NB1500 board, and as far as I have tested, the stability is worse than on the MKR1400. I will not go into details, but to me it seems that the serial comms between the controller and the modem are "fragile" and can be lost completely if the modem is in certain states. I guess that the designs of the MKR1400 and the NB1500 are pretty similar and that the issues are related.

eiriksels commented 5 years ago

I am now running 1 MKR GSM in AT+URAT=0 (2G single mode) in addition to having 1 board running in AT+URAT=2 (UMTS single mode)

The UMTS single mode seems stable and has been running for a week now. I just started the 2G single mode unit, and it remains to be seen how that one behaves in low signal areas and when moving.

eiriksels commented 5 years ago

After longer periods of testing I am pretty sure that it is the 2G 900 MHz connection that causes hangs in the board. All other frequencies seem fine. You can also see in the product manual for the u-blox modem that this frequency has a very high TX peak current draw. I am only able to run the board in this mode with a 10000 mAh power bank attached, and even then I see several hangs on the board. With smaller batteries, the board hangs almost instantly on AT+CREG? when I lock the connection to this band. In my location this makes the board useless over a longer timespan. The UMTS bands are to be closed down this year by the network providers, and with 2G being so unstable I can't use the board.

pnndra commented 5 years ago

Hi. Has any of you tried shorting F2 (you can find it on the schematic: file:///C:/Users/Dario.IPTRONIX/Downloads/MKRGSM_V2.2_sch.pdf)? That component is a bit critical, as it tends to limit the available current from the battery, and that may cause voltage drops that hang the modem.

ubidefeo commented 5 years ago

hey @pnndra the schematic PDF has not been uploaded, I'll link to the official one but you'll have to remove your local path :D MKR GSM 1400 schematic PDF

eiriksels commented 5 years ago

I have tried shorting the PTC as advised by the Arduino technical team. I think it improves the stability, but it still does not help much when the board is trying to connect to the 900 MHz 2G network. So the only way I've managed to get the board stable is by locking it to the UMTS bands using AT+URAT commands (you can read about this on page 99 of the AT command manual: https://www.u-blox.com/sites/default/files/u-blox-CEL_ATCommands_%28UBX-13002752%29.pdf).

The downside for me is that the UMTS bands are closing down this year, which makes the board useless for me, as it does not behave stably in 2G mode.

B-Clever commented 5 years ago

@eiriksels I experience exactly the same behaviour. When I lock the board to 3G, it works fine: no hangs for two months now. If I use 2G, it hangs almost immediately (about 10 seconds after supplying power to it). I don't have a 10,000 mAh battery connected, only 2200 mAh, but VIN is connected as well. Shorting the PTC did not help at all for me. As @eiriksels said, UMTS will be switched off soon, and then this board is useless. I switched to a SIM800 and ESP32.

alanoatwork commented 5 years ago

@B-Clever and @eiriksels, what region(s) are you guys referring to? AT&T says they will keep 3G going at least until 2021, and there has been no announcement from T-Mobile.

eiriksels commented 5 years ago

@alanoatwork Norway for my part. They have already started shutting down in some regions, and as I understand it most of 3G will be down within a year here.

eiriksels commented 5 years ago

FYI I got this feedback from Arduino developers on my following question:

Question: When this board was developed, did you try to run it in 2G mode, and how did the board behave then? My experience is that it is nearly impossible to get a stable board in GSM 900 MHz 2G mode.

Answer: Our Arduino developers have checked your issue and we confirm that board is unstable when goes in the 2G mode.

I have asked if there are any ways of fixing this. But with the current state of the board, my clear suggestion is to lock the board to 3G to get it stable and reliable. If 3G is not available in your location for a longer timespan my recommendation is to not spend money and time on the MKRGSM.

Nels52 commented 5 years ago

@eiriksels Thanks for all of your hard work on this issue.

alanoatwork commented 5 years ago

Yikes. I'm still a bit confused. UMTS is supposed to describe 3G according to Wikipedia. In GSMBand.cpp, UMTS is given as #define UMTS_BANDS "1,2". This will send "AT+URAT=1,2". Is this sufficient to prevent attempts to connect to 2G networks?

eiriksels commented 5 years ago

@alanoatwork As far as I see, AT+URAT=1,2 will only set it to the same as default mode which is 3G(UMTS)/2G dual mode. What you want is to set it to AT+URAT=2 (UMTS single mode). You can probably do this with the band management sketch. The way I have done it is by:

  1. Use the serial passthrough example in the MKRGSM library
  2. Give command AT+COPS=2 to make it ready for RAT change
  3. Set RAT by AT+URAT=2
  4. Set AT+COPS=0 to get it back
  5. Check that RAT has been set by asking AT+URAT? This should respond with 2 if it is in UMTS single mode.

Then you can go back to uploading the sketch and the RAT should be stored as long as your sketch does not change it. You can find info on this in the AT manual page 99 https://www.u-blox.com/sites/default/files/u-blox-CEL_ATCommands_%28UBX-13002752%29.pdf
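
The steps above can be sketched in code. This is only an illustration in plain C++, not part of the MKRGSM library: `umtsLockCommands` and `isUmtsOnly` are hypothetical helper names, and the reply format `+URAT: <value>` is assumed from the AT manual. On the board, each string would still have to be sent to the modem yourself, for example through the serial passthrough sketch.

```cpp
#include <string>
#include <vector>

// AT command sequence from the steps above, in the order they are sent.
std::vector<std::string> umtsLockCommands() {
    return {
        "AT+COPS=2",  // deregister from the network so the RAT can be changed
        "AT+URAT=2",  // select UMTS single mode
        "AT+COPS=0",  // back to automatic operator selection
        "AT+URAT?"    // read the setting back
    };
}

// Hypothetical helper: true if a reply such as "+URAT: 2" reports UMTS
// single mode; false for "+URAT: 1,2" or for a reply without "+URAT:".
bool isUmtsOnly(const std::string& reply) {
    const std::string prefix = "+URAT:";
    std::size_t pos = reply.find(prefix);
    if (pos == std::string::npos) return false;
    pos += prefix.size();
    // Skip spaces between the prefix and the value.
    while (pos < reply.size() && reply[pos] == ' ') ++pos;
    // The value runs to the end of the line.
    std::size_t end = pos;
    while (end < reply.size() && reply[end] != '\r' && reply[end] != '\n') ++end;
    // Trim trailing spaces.
    while (end > pos && reply[end - 1] == ' ') --end;
    return reply.substr(pos, end - pos) == "2";
}
```

Checking the reply in code rather than by eye makes it easy to refuse to continue if the modem is still in dual mode.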

rigoy22k commented 5 years ago

Our Arduino developers have checked your issue and we confirm that board is unstable when goes in the 2G mode.

I really hope someone can fix this in software. Or is it a hardware issue? There are so many places with only 2G, especially in remote regions.

B-Clever commented 5 years ago

I think the 2G hangs cannot be fixed by software. Unfortunately the U201 has an LGA form factor, otherwise I would try to find the U201 VCC pin and solder a thick wire with 3.3V to that pin. Pages 26 and 176 of the U201 Hardware Integration Manual are interesting: "The VCC line should be wide and short" (https://www.u-blox.com/de/docs/UBX-13000995)

Rocketct commented 5 years ago

Hi, we have done extensive testing on the MKRGSM1400 and found the following:

  1. Even if you supply the board with a powerful power supply, the current spikes you get in GSM 800 or 900 MHz mode can be as high as 2A, and these cause a significant voltage drop, even with thick cables. It is recommended to add capacitors of several hundred microfarads close to the input pins. Also, if your supply is 5V current will be higher and drop deeper so it's recommended to raise vin supply to at least 7V if possible.
  2. On the product page we recommend adding an external battery with at least 1500mAh capacity. What really matters is not so much the capacity as the current the battery can provide. As stated above, the overall peak current required by the board may exceed 2A, so it's recommended to use a larger battery, in particular one able to supply peaks of at least 2.5A. A 1500mAh battery rated 2C or more is likely OK, but in any case the important point is the peak discharge current, which must be more than 2.5A to be safe.
  3. The battery charger has an undervoltage check that disables it when the input voltage drops below a given value. The default setting is a bit high and can safely be lowered to avoid shutting down its output when the input voltage drops due to current spikes. Of course you should NOT have drops in input voltage, and this should be achieved by adding external capacitors as described in point 1. However, you likely won't be able to remove the drop completely, so we're updating the variant for the GSM1400 in the core so that the default value is the lowest allowed voltage.
  4. R28 is set to 330 ohm. This resistor sets the current limit from vin to 1.6A. Even if software sets a higher value, the chip limits to the lower of the two settings. In general this was a safe setting, but if you want to use GSM 800 or 900 this current limit may be too low, so it may be useful to lower this resistor or even short it. If R28 is shorted, the input current limit is set exclusively by the battery charger register.

In general, our recommendation is to add capacitance to the vin pin and possibly raise the input voltage to 7V, while also using a battery with a discharge current of at least 2.5A. Upgrading to the latest core (to be released within a couple of days at most) is also recommended.

alanoatwork commented 5 years ago

@Rocketct thanks for the info. I think many people will find this information useful and help them get their systems working reliably on 2G. However, there's one point that doesn't make sense to me: " if your supply is 5V current will be higher and drop deeper so it's recommended to raise vin supply to at least 7V if possible"

The board's internal 3V8 supply is generated from the USB input or the Vin input using a DC/DC converter. Therefore as Vin is raised, the current will fall (property of a switching regulator) and the voltage drops will actually be smaller. When you increase Vin you are essentially creating more headroom for the regulator. In either case, if increasing Vin yields better results when using 2G that's good to know.

pnndra commented 5 years ago

@alanoatwork the issue is primarily due to the fact that a steep change in consumed power leads to an impulsive request for more current. The problem we observed is that even with short, thick power supply cables connected directly to the board pins, we see a drop in input voltage at the pins, and this causes every other derived voltage to drop accordingly. If you raise the vin voltage, the current step will be smaller, which makes it easier for the power supply and cable to provide that current. A smaller current step also helps because the parasitic inductance of the cables "tries to prevent" current change, which means that part of the voltage drop at vin comes from the slope of the current rise; hence a lower slope means a lower drop. Bottom line: it's not just about giving more headroom, but rather about reducing the slope of the current change to reduce the overall voltage drop. Let me also add that, while vin is raised and the input current is lowered, the internal 3.8V rail will still have to sustain the same current step. However, the DC-DC converter will have an easier time regulating the voltage if its input is stable, as its duty cycle then only has to compensate for the change in output current rather than also for the drop in input voltage.
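
As a rough sketch of the argument above (an idealized model, not a measurement): treat the supply cable as a lumped resistance $R$ in series with a parasitic inductance $L$, both hypothetical values. A step in modem power $\Delta P$ that develops over a time $\Delta t$, drawn through a converter of efficiency $\eta$ (assumed constant), then gives approximately

```latex
\Delta I \approx \frac{\Delta P}{\eta \, V_{\text{in}}},
\qquad
\Delta V_{\text{in}} \approx R \, \Delta I + L \, \frac{\Delta I}{\Delta t}
```

For the same $\Delta P$ and $\Delta t$, both terms scale as $1/V_{\text{in}}$, so under these assumptions going from 5V to 7V would shrink the drop at the pins to roughly $5/7 \approx 0.71$ of its value at 5V.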

alanoatwork commented 5 years ago

@pnndra, thanks for your explanation. I noticed there was a reference to an MKR GSM 1400b revision on the Arduino MKR forums. I've tried, unsuccessfully, to find documentation describing any board revisions. Is there a place where developers can find errata or a list of changes? If this isn't available, could you please summarize the board revisions?