carrascoacd / ArduinoSIM800L

Arduino HTTP & FTP client for SIM800L/SIM800 boards to perform GET and POST requests to a JSON API as well as FTP uploads.

Potential loop bug while transferring data #47

Closed: jaka87 closed this 3 years ago

jaka87 commented 3 years ago

Hi.

It's a long shot, but I have to ask anyway. My question to everybody: what is the longest uptime you have had using this library? I run two weather stations and we just set up a third. The oldest is about a year and a half old. At first it was a struggle to fix Arduino and hardware bugs (bootloader), but we managed to sort them out. One of our stations suddenly stopped responding a few days ago. Before that it had been working without interruption for about half a year.

The battery should be fine: the last few readings show around 94%. Visually the station looks fine. The last time we experienced something like this was in February. The station went down for a few days, until the battery dropped low enough to trigger an Arduino reset. Then it came back (once charged) and worked normally... I use a lot of sleep in my script to prolong battery life, so in normal operation the battery should last at least 10-14 days even if charging were cut off. I believe the battery drained so fast because my Arduino was in some sort of loop while connected to GPRS. I have reason to believe that the SIM800L has not only signal but also a GPRS connection, while the Arduino can't pass data forward. Let me explain.

My network provider lets me view logs and run some remote commands. For example, today I put a new SIM card in a new weather station without any program on it. Judging by the light on the SIM800L, the SIM card connected to the network but not to GPRS. In my network provider's console I can then see that the SIM card is attached, but I get no log for GPRS. That makes sense, since there is no APN data to connect with. However, on the weather station that is stuck I believe there is still an established GPRS connection, since I get this in the log:

New location received from VLR for IMSI='x', now attached to VLR='x'. New location received from SGSN for IMSI='x', now attached to SGSN='x', IP='x'.

I can even reset the connection from my network provider, and after a few hours I get the same log message as shown above. I researched this log message, and it means that the GPRS connection is successful. This led me to believe that the Arduino got stuck while sending data while the SIM800L is still working. My program is fairly simple, so I don't believe the error is there. My theory is that the crash might happen when the signal is just good enough to establish a GPRS connection but not good enough to send data. The signal I get at this location is in the range 13-18, which is not great but works most of the time.

Does anybody have any idea how to debug this further or how to fix it?

carrascoacd commented 3 years ago

Hi @jaka87

Arduino has a kind of memory that persists across restarts, the EEPROM. You can write there which action you are performing each time; a single byte can store one of 256 values (0-255). Have a look at https://www.arduino.cc/en/Tutorial/LibraryExamples/EEPROMWrite

Then you can read the byte when your station gets frozen.

For example:
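
A minimal sketch of the breadcrumb idea, not the snippet originally posted here. The `Checkpoint` names, the `Breadcrumb` class and the `FakeEeprom` stand-in are all hypothetical; on a real board the storage wrapper would presumably forward to `EEPROM.update()` and `EEPROM.read()` from the Arduino EEPROM library.

```cpp
#include <stdint.h>

// Hypothetical step codes; use whatever stages your own sketch has.
enum Checkpoint : uint8_t {
  CP_BOOT = 0,
  CP_READ_SENSORS = 1,
  CP_GPRS_CONNECT = 2,
  CP_HTTP_POST = 3,
  CP_HTTP_DISCONNECT = 4,
  CP_MODEM_SLEEP = 5
};

// Stand-in for the EEPROM so the logic can be tried on a PC.
// On an Arduino, write() would call EEPROM.update(address, value)
// and read() would call EEPROM.read(address).
struct FakeEeprom {
  uint8_t cells[16] = {0};
  void write(int address, uint8_t value) { cells[address] = value; }
  uint8_t read(int address) const { return cells[address]; }
};

// Stores the step that is about to run, so that after a freeze or reset
// the saved byte names the last step that was started.
template <typename Storage>
class Breadcrumb {
 public:
  Breadcrumb(Storage &storage, int address)
      : storage_(storage), address_(address) {}

  void mark(Checkpoint cp) {
    storage_.write(address_, static_cast<uint8_t>(cp));
  }

  Checkpoint last() const {
    return static_cast<Checkpoint>(storage_.read(address_));
  }

 private:
  Storage &storage_;
  int address_;
};
```

Call `mark()` just before each stage and read `last()` early in `setup()` after a reset. Since EEPROM cells tolerate a limited number of writes, marking only the few stages you suspect is probably wiser than marking on every loop iteration.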


But it is true that the library always keeps trying to connect until it succeeds.

There are some while loops in the library:

It is possible that your battery is being drained because of that. Feel free to increase the delays: https://github.com/carrascoacd/ArduinoSIM800L/blob/master/src/GPRS.cpp#L62

Since the increment is linear, you can make it exponential with the pow function: https://www.arduino.cc/reference/en/language/functions/math/pow/

Example:

delay(pow(2, attempts)); // increment attempts on every retry so each wait is longer than the last

Try both approaches and let me know if it happens again. If the exponential delay helps, I can add it to the library, so we would have two kinds of delay, linear and exponential (for cases like yours), selectable by configuration.
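
One way the exponential delay could look, as a sketch rather than anything taken from the library: `backoffDelayMs`, `baseMs` and `maxMs` are made-up names. It doubles with integer arithmetic instead of calling `pow()`, and it caps the wait so that a long outage does not turn into hour-long delays.

```cpp
#include <stdint.h>

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt,
// capped at maxMs. Integer doubling avoids floating point and overflow.
uint32_t backoffDelayMs(uint8_t attempt, uint32_t baseMs, uint32_t maxMs) {
  uint32_t delayMs = baseMs;
  for (uint8_t i = 0; i < attempt; ++i) {
    if (delayMs > maxMs / 2) {  // doubling would exceed the cap
      return maxMs;
    }
    delayMs *= 2;
  }
  return delayMs < maxMs ? delayMs : maxMs;
}
```

In a sketch this would be used as something like `delay(backoffDelayMs(attempts++, 1000, 60000));`, which waits 1 s, 2 s, 4 s and so on, up to one minute.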

evrokas commented 3 years ago

Hello, perhaps you should send a command to close the GPRS connection if openGPRSContext fails. I have used the SIM800L for a beehive monitoring project before, and one thing I learned was that one should not assume anything with these little devices. You always have to verify the status. I also use a power MOSFET to switch the SIM800L off when it is not in use, so it was always reset.

Regards, Vangelis

PS: I also found this in the SIM900 application note (the SIM900 is similar to the SIM800L):

Note from SIM900_AN_TCPIP_V100, error handling:

If an error occurs during a TCP/UDP connection, it is suggested to close the connection with AT+CIPCLOSE and restart the connection with AT+CIPSTART. If the error still occurs, it is recommended to shut off the PDP with AT+CIPSHUT and restart the connection. If the error still persists, it is recommended to restart the module.
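
The escalation in that note can be sketched as a small decision function. The AT command names come from the note itself; `RecoveryAction`, `nextRecovery` and the retry thresholds are hypothetical and only illustrate the order of the steps.

```cpp
#include <stdint.h>

enum RecoveryAction : uint8_t {
  RECOVER_REOPEN_SOCKET,  // AT+CIPCLOSE, then AT+CIPSTART again
  RECOVER_RESET_PDP,      // AT+CIPSHUT, then reconnect
  RECOVER_RESTART_MODULE  // restart the modem
};

// Picks the next recovery step from the number of consecutive failures.
// The thresholds are illustrative defaults, not values from the note.
RecoveryAction nextRecovery(uint8_t consecutiveFailures,
                            uint8_t socketRetries = 2,
                            uint8_t pdpRetries = 2) {
  if (consecutiveFailures <= socketRetries) {
    return RECOVER_REOPEN_SOCKET;
  }
  if (consecutiveFailures <= socketRetries + pdpRetries) {
    return RECOVER_RESET_PDP;
  }
  return RECOVER_RESTART_MODULE;
}
```

The caller would count failures, reset the counter on the first success, and run whichever action is returned.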

jaka87 commented 3 years ago

Thank you both for the suggestions. What I did so far is add a watchdog timer to "my" part of the script, just to make sure that some sensor is not messing with me. I love the EEPROM hint, @carrascoacd; I made some changes that will be deployed on the testing station soon. It might take some time to find the issue, since we had no trouble for more than half a year. I also like @evrokas's idea of checking whether commands were executed. At first glance I'm not sure that having the library run in an infinite loop is a good idea. I would probably prefer an Arduino reset after X failed attempts. Then again, I'm no expert in this field, so I could be wrong...

carrascoacd commented 3 years ago

That is a good point, @jaka87. We can let the client decide what to do on errors instead of retrying forever by default without sleeping or powering off. Let me think about it.

Thanks @evrokas, that is a good idea. I use a transistor to switch off one relay; otherwise my battery lasts one day. What MOSFET are you using? I ask because the SIM800L suffers a lot from voltage drops.
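
A rough sketch of what "let the client decide" could mean in practice, assuming nothing about how the library would actually expose it: `retryUpTo`, `RetryOutcome` and both callables are hypothetical. The operation is tried a bounded number of times and the caller chooses what happens after each failure and after giving up.

```cpp
#include <stdint.h>

enum RetryOutcome : uint8_t { RETRY_OK, RETRY_GAVE_UP };

// Calls `operation` until it returns true or `maxAttempts` tries are used up.
// `onFailure(n)` runs after the n-th failed try (1-based); that is where a
// sketch could wait, power the modem down, or log. Both are caller-supplied.
template <typename Operation, typename OnFailure>
RetryOutcome retryUpTo(uint8_t maxAttempts, Operation operation,
                       OnFailure onFailure) {
  for (uint8_t attempt = 0; attempt < maxAttempts; ++attempt) {
    if (operation()) {
      return RETRY_OK;
    }
    onFailure(static_cast<uint8_t>(attempt + 1));
  }
  return RETRY_GAVE_UP;
}
```

On `RETRY_GAVE_UP` the sketch can then reset the board, go back to sleep until the next measurement, or whatever suits the station.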

evrokas commented 3 years ago

It's a normal p-channel MOSFET (see attached photo for info). VCC_2 is the regulated power supply, PERIPHERAL_POWER is the Arduino signal that turns the power supply on and off, and VCC40_SW is the power to the SIM800L module.

The whole setup was based on a custom PCB with a bare ATmega328P and minimal components, to reduce power consumption.

Using 4x AA 1.5 V batteries, I measured a voltage drop of 0.25 V in 26 days, which gives a rough estimate of 146 days of operation without changing batteries.

Regards, Evangelos
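
For reference, the 146-day figure is consistent with a straight-line extrapolation over a usable drop of roughly 1.4 V. That budget is not stated in the thread and is inferred here from the numbers, and `estimateRuntimeDays` is a made-up helper name.

```cpp
// Linear extrapolation: days until the pack has dropped by usableDropVolts,
// given that it dropped observedDropVolts over observedDays.
double estimateRuntimeDays(double observedDropVolts, double observedDays,
                           double usableDropVolts) {
  double dropPerDay = observedDropVolts / observedDays;
  return usableDropVolts / dropPerDay;
}
```

With a 0.25 V drop over 26 days and an assumed 1.4 V budget this gives about 145.6 days. Battery discharge is not really linear, so this is only a ballpark.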

(attached photo: 58AD0B28-F813-48B1-B1E2-BE01E9BB4937)

carrascoacd commented 3 years ago

Nice. So you have 4x 1.5 V for the ATmega328P and another power supply for the SIM800L, right? Maybe a LiPo battery?

jaka87 commented 3 years ago

For my design we used 3.7 V Li-ion batteries that power both the Arduino and the SIM800L. We used wide enough traces on the PCB that 2 A should easily be available, yet from time to time we still got an MCUSR reset flag that indicated a voltage issue. I considered several possible causes:

  1. Hardware design: we checked the SIM800 datasheet and added a few capacitors.
  2. Batteries: we tried different kinds of batteries from many manufacturers.
  3. Signal: I noticed that this issue is connected to signal quality. Locations that were more prone to it had lower signal quality, and resets were also more frequent in bad weather.

I was beginning to lose hope and started focusing on a newer SIMCom module that supports 4G and NB-IoT. This issue was hard to pin down, since it almost never happened at my home but always at remote locations. It so happened that while I was following @carrascoacd's advice to write data to EEPROM, I also ended up near one of these locations for a few days. I noticed that I was getting constant resets inside the house, while just a few meters away it worked fine. I also noticed that the resets were happening at the same point in the code in 90% of cases. To my surprise, it didn't happen while transferring data, where I thought power consumption would be highest.

It happened after http.disconnect(); and before http.sleep();

I found that strange, to say the least. My thought was that the SIM module was going to sleep too soon, so I added a few seconds of delay, and that solved the issue. Another point where resets happened is after http.post, where I also added a few seconds of delay. So far there have been no more issues. This is just a quick fix, so in the long run it would probably be best to check that commands executed successfully before continuing, as @evrokas suggested. I'm not sure whether this issue is connected to the first one I posted, but I'm happy we are getting closer to a maintenance-free project.

Back to the first issue. We did maintenance on the station that got stuck. At first glance everything looked normal; there was no visible damage. I reset the Arduino and things were back to normal. While I don't have any more data to prove it, I believe the most likely point in my script to get stuck is during the SIM800 function calls.

jaka87 commented 3 years ago

I also looked a bit at your code. It seems to me that after 10 attempts at openGPRSContext(), the preInit function is run to restart the SIM module. I didn't know that before. I read your answers again and I'm not sure we understood each other. I don't care too much about battery consumption; I mostly care about preventing my station from getting stuck in a loop and not sending data. In that case openGPRSContext should be OK, I guess, but maybe some other functions would also need a while loop that checks whether the command executed successfully rather than just assuming it. In my mind this would, theoretically, fix both of my issues.

carrascoacd commented 3 years ago

@jaka87 would you mind uploading the code with the sleep and the modifications you've made, so we can look at it in a branch?

The library was designed to allow some commands to fail and then retry the whole flow on the client side (except when connecting to the bearer), for simplicity, instead of retrying every single command, because I had problems. That is why you have to check for SUCCESS, as you are already doing in other places. But if you noticed that the infinite loop is not in the connection but after the disconnect, and that it goes back to normal after a restart...

Can you reset the Arduino and/or the SIM800L automatically when it gets stuck?

jaka87 commented 3 years ago

The most important thing I did was add this delay: https://github.com/jaka87/vetercek_WS/blob/development/vetercek_2G/vetercek_2G.ino#L436 I believe the sleep command is in some cases executed before GPRS is fully disconnected. This fixed most of my reset issues. I still get some resets from time to time after https://github.com/jaka87/vetercek_WS/blob/development/vetercek_2G/vetercek_2G.ino#L417 For this one I can't confirm 100% that it's a software issue and not hardware. The battery is in fact brand new, but I haven't tested it on a dummy load to confirm the specs. The first one is definitely caused by software: I had it on many PCBs and with different batteries. Since the change I haven't had a single one.

Thank you for the suggested links. I agree I have some work to do on the client side. I will make some changes and then try to debug further. If I discover something I will post it here...

jaka87 commented 3 years ago

I did some testing, and running Timer1 in parallel with my script did in fact help with some issues. It's a neat solution that can get you out of trouble in rare cases. I guess there is no bug in this library per se, but there are some things users should keep in mind.

  1. Theoretically, if the module is connected to the network but for some reason can't connect to GPRS, it will be stuck if you don't write logic for such a case.
  2. Make sure the module is disconnected before putting it to sleep. Otherwise the disconnect may in some cases not be fast enough, which in my case caused my whole setup to restart because of a voltage drop and the 2.9 V BOD limit on my Arduino. I used a function like this to prevent it...

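    // wdt_enable() needs #include <avr/wdt.h>
    timeoutGPRS = millis();
    result = http.disconnect();                  // first disconnect attempt
    while (result != SUCCESS) {
      if (millis() - timeoutGPRS > 7000) {
        wdt_enable(WDTO_250MS);                  // still not disconnected after 7 s: arm the watchdog
        while (true) {}                          // wait here for the watchdog reset
      }
      result = http.disconnect();                // retry so the loop can exit once the disconnect succeeds
    }
    timeoutGPRS = 0;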
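
The Timer1 approach mentioned above could be structured roughly like this. It is a sketch under assumptions: `SoftWatchdog` is a made-up class, and the timer setup that would call `tick()` periodically is left out. The counter is kept to 8 bits so that a single read or write is not split across an interrupt on an 8-bit AVR.

```cpp
#include <stdint.h>

// Software watchdog: a periodic timer interrupt calls tick(), the main loop
// calls kick() whenever it makes progress. If kick() stops arriving,
// expired() becomes true and the caller can force a reset.
class SoftWatchdog {
 public:
  explicit SoftWatchdog(uint8_t limitTicks)
      : limit_(limitTicks), count_(0) {}

  // Call from the timer interrupt. The count saturates at the limit.
  void tick() {
    if (count_ < limit_) {
      count_ = count_ + 1;
    }
  }

  // Call from the main loop after each completed step.
  void kick() { count_ = 0; }

  bool expired() const { return count_ >= limit_; }

 private:
  uint8_t limit_;
  volatile uint8_t count_;
};
```

When `expired()` turns true, the interrupt handler could call `wdt_enable(WDTO_250MS)` as in the snippet above to reset the board.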

BTW @carrascoacd, I'm currently developing a new station using the SIM7000E chip. It supports 4G, NB-IoT and also legacy 2G. Its advantages are much lower power consumption, faster connection... It also uses some new methods to save power while maintaining the connection to the network. My question is: would you include support for such a chip in your library, or would you rather a fork be made for that specific chip?

carrascoacd commented 3 years ago

Nice @jaka87

Thanks for sharing your solution, I'll include it as well.

Regarding the library, I think I can start a new repo for that module. I read about it and it looks nice; the code would be very similar to the current code.

Let me buy one and I'll tell you more!