ricaun / LoRaNow

LoRaNow Library is a simple LoRa Node <> Gateway communication protocol.
http://loranow.com/
MIT License

ESP32 Gateway crashes on onDio0Rise #6

Open girtgirt opened 5 years ago

girtgirt commented 5 years ago

I'm receiving messages over LoRa and sending them to an HTTPS server over WiFi. Everything works, except that once every few hours (quite randomly) the board crashes. I tried with ESP32 v1.0.2 and 1.0.3.

I suspect that the issue is the interrupt handler not having ICACHE_RAM_ATTR, which is needed on ESP32. Simply adding it won't help, as per the ESP doc: "Not only that, but the entire function tree called from the ISR must also have the ICACHE_RAM_ATTR declared." So should all functionality be moved from the interrupt to the main loop?

As Gateway I'm using Heltec ESP32 - Heltec WiFi Lora 32(v2). Top of stack almost always is:

Decoding stack results
0x401624e6: spiGetClockDiv at /Users/aaa/Library/Arduino15/packages/esp32/hardware/esp32/1.0.3-rc1/cores/esp32/esp32-hal-spi.c line 291
0x400d543f: SPIClass::beginTransaction(SPISettings) at /Users/aaa/Library/Arduino15/packages/esp32/hardware/esp32/1.0.3-rc1/libraries/SPI/src/SPI.cpp line 130
0x400d4b30: LoRaClass::singleTransfer(unsigned char, unsigned char) at /Users/aaa/Documents/ArduinoESP32/libraries/LoRaNow/src/utility/LoRa.cpp line 755
0x400d4b64: LoRaClass::readRegister(unsigned char) at /Users/aaa/Documents/ArduinoESP32/libraries/LoRaNow/src/utility/LoRa.cpp line 741
0x400d4ba5: LoRaClass::available() at /Users/aaa/Documents/ArduinoESP32/libraries/LoRaNow/src/utility/LoRa.cpp line 356
0x400d4a02: LoRaNowClass::onReceive(int) at /Users/aaa/Documents/ArduinoESP32/libraries/LoRaNow/src/LoRaNow.cpp line 585
0x400d5282: LoRaClass::handleDio0Rise() at /Users/aaa/Documents/ArduinoESP32/libraries/LoRaNow/src/utility/LoRa.cpp line 724
0x400d52a6: LoRaClass::onDio0Rise() at /Users/aaa/Documents/ArduinoESP32/libraries/LoRaNow/src/utility/LoRa.cpp line 767
0x40080f81: __onPinInterrupt at /Users/aaa/Library/Arduino15/packages/esp32/hardware/esp32/1.0.3-rc1/cores/esp32/esp32-hal-gpio.c line 220
...

ricaun commented 5 years ago

The first version of LoRaNow has this problem: the ESP32 doesn't like the interrupt and crashes a lot if you put too much code in it. I probably need to remove both interrupts and move the state machine to the loop function. I'm working on this...

ricaun commented 5 years ago

Hello, I created this branch to try to fix the problem: https://github.com/ricaun/LoRaNow/tree/update I added ICACHE_RAM_ATTR to the main interrupt functions and I never get the fatal error anymore. See ya!

girtgirt commented 5 years ago

I tried this as well, but I still get the error after a longer run. I even added this attribute to all LoRaNow and LoRa functions, but I still get it, as they call other I2C functions that don't have it.

ricaun commented 5 years ago

I still got the same error =( I moved all the decode stuff to the loop; I believe the SPI read in the interrupt makes the board crash. https://github.com/ricaun/LoRaNow/tree/update Try this branch and give me feedback, thanks!
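The idea described in this comment, keeping SPI reads out of the interrupt and doing the decode work in the loop, is usually done with a flag. Here is a minimal sketch of that pattern. The names `DeferredRx`, `onDio0RiseIsr` and `poll` are hypothetical and are not taken from LoRaNow or the update branch.

```cpp
// Hypothetical illustration, not code from the LoRaNow update branch.
// The interrupt side only records that DIO0 fired; the SPI reads and
// packet decoding happen later, from loop().
struct DeferredRx {
    volatile bool pending = false;  // written in the ISR, read in loop()

    // On a real board, a free function carrying the platform's ISR
    // attribute would call this. It touches no SPI and calls nothing.
    void onDio0RiseIsr() { pending = true; }

    // Call from loop(). Runs handlePacket at most once per call and
    // reports whether it ran.
    template <typename Handler>
    bool poll(Handler handlePacket) {
        if (!pending) return false;
        pending = false;   // clear first, so an interrupt during handling is kept
        handlePacket();    // safe place for register reads and decoding
        return true;
    }
};
```

With this shape the handler is short enough that only one small function needs the IRAM attribute. The trade-off is that a packet is only handled as quickly as loop() comes around again.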

cristian-fernandes commented 5 years ago

I’ve tested the update branch. Couldn’t do a long running test, but tested for half-hour and it stayed working fine.

Did you guys have the chance to test it?

Is it working on the long term ?

I’ll test a little bit more and come back here with results.

TRudolphi commented 3 years ago

I think I found the cause of this crash. I am using this very nice library to handle 20 LoRa sensors with an ESP32 gateway. I used the LoRaNow examples as a starting point and also fixed the defines so that the ISR is placed in IRAM. But once per hour the ESP32 was still crashing (with the same report as described above). So I loaded a LoRa-only receiver program onto the ESP32 board and checked the raw data. From time to time a very big message of more than 128 bytes was coming in (in the Netherlands a telecom firm is also using LoRa). This was the reason for the crash. The solution is simple:

In LoRaNow class add a range check in the write function:

size_t LoRaNowClass::write(uint8_t c) {
  if (payload_len < sizeof(payload_buf)) {
    payload_buf[payload_len++] = c;
    return 1;
  } else {
    return 0;
  }
}

Now the gateway is running without any crash! I can even set the buffer length to, say, 32 bytes (which is enough for the small LoRaNow messages).
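To check the effect of the range check off-device, here is a small host-side sketch. `BoundedPayload` and `feed` are hypothetical stand-ins, not part of LoRaNow. The field names mirror `payload_buf` and `payload_len` from the fix, and the buffer size is a template parameter because the real size is an assumption here.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the payload buffer; the real LoRaNowClass
// holds more state. N plays the role of the size of payload_buf.
template <std::size_t N>
struct BoundedPayload {
    std::uint8_t payload_buf[N];
    std::size_t payload_len = 0;

    // Same guard as in the fix: store the byte only while there is room.
    std::size_t write(std::uint8_t c) {
        if (payload_len < sizeof(payload_buf)) {
            payload_buf[payload_len++] = c;
            return 1;
        }
        return 0;  // buffer full: drop the byte instead of overrunning
    }
};

// Feed `count` bytes, as an oversized foreign packet would, and return
// how many of them were accepted.
template <std::size_t N>
std::size_t feed(BoundedPayload<N>& p, std::size_t count) {
    std::size_t accepted = 0;
    for (std::size_t i = 0; i < count; ++i) {
        accepted += p.write(static_cast<std::uint8_t>(i));
    }
    return accepted;
}
```

For example, feeding 200 bytes into a `BoundedPayload<128>` accepts the first 128 and rejects the rest, so nothing is written past the end of the array. Without the guard, the extra bytes would overwrite whatever sits after the buffer in memory.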

khachikyannarek commented 3 years ago

@TRudolphi could you please share the fixes that you made for the ISR?

TRudolphi commented 3 years ago

In LoRaNow.cpp and LoRa.cpp:

#if ESP8266
    #define ISR_PREFIX ICACHE_RAM_ATTR
#else
    #if ESP32
        #define ISR_PREFIX IRAM_ATTR
    #else
        #define ISR_PREFIX
    #endif
#endif

But even with this fixed, I still had crashes due to overly long messages, so the final solution was the range check in the write routine (that fix is in my previous message).
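For readers wondering where ISR_PREFIX is applied: it goes in front of the function that runs in interrupt context. Here is a minimal sketch. `dio0Events` and `onDio0RiseHandler` are hypothetical names, and the selection is written with `defined()` and `#elif`, a common variant of the same logic.

```cpp
// Pick the attribute that places the handler in instruction RAM.
// On ESP targets the attribute macros come from the Arduino core
// headers; on any other target the prefix expands to nothing.
#if defined(ESP8266)
#define ISR_PREFIX ICACHE_RAM_ATTR
#elif defined(ESP32)
#define ISR_PREFIX IRAM_ATTR
#else
#define ISR_PREFIX
#endif

// Hypothetical counter and handler, for illustration only.
volatile unsigned long dio0Events = 0;

// The prefix is applied to the function that runs in interrupt
// context. The body stays tiny and calls nothing else.
ISR_PREFIX void onDio0RiseHandler() {
    dio0Events = dio0Events + 1;
}
```

As noted earlier in the thread, the attribute only covers the function it is attached to. Anything the handler calls needs the same treatment, which is why a handler that does almost nothing is easier to get right.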

khachikyannarek commented 3 years ago

@TRudolphi which version of LoRaNow do you use, the update branch or master?

TRudolphi commented 3 years ago

I used the update branch.

khachikyannarek commented 3 years ago

@TRudolphi one more question: did you update LoRa.cpp?

TRudolphi commented 3 years ago

I only changed the IRAM define in Lora.cpp as I mentioned earlier, the rest of the code is working fine.

khachikyannarek commented 1 year ago

@TRudolphi could you please help me with one question? I have created an irrigation system automation application using this protocol, but my nodes are crashing after random periods: the receiver or sender part of the LoRa module stops working. Could it be related to txPower? Any ideas?

TRudolphi commented 1 year ago

I don't think it has anything to do with the tx power. Which processor do you use? With an ESP32 / ESP8266 there can be a problem with the handling of the Dio0 interrupt. So for my ESP32 gateway I made a change in the lib and now I poll the status of this line (state changes on this line are not too frequent). After this change it has been working for several years now. With an AVR controller it should also work fine with the interrupt.
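The exact change made in the lib is not shown in this thread. As a rough sketch of what polling the Dio0 line could look like, the hypothetical helper below turns a sampled pin level into one event per low-to-high transition. On Arduino the level would typically come from `digitalRead` on the Dio0 pin, called from loop().

```cpp
// Hypothetical helper, not the actual change described above.
// It replaces the rising-edge interrupt with edge detection on a
// level that the caller samples from the main loop.
class Dio0Poller {
public:
    // Pass the current pin level (true when the line is high).
    // Returns true only on the call where the line has just gone high.
    bool risingEdge(bool levelHigh) {
        const bool rose = levelHigh && !lastHigh_;
        lastHigh_ = levelHigh;
        return rose;
    }

private:
    bool lastHigh_ = false;  // level seen on the previous call
};
```

Since all radio access then happens in normal task context, no IRAM attributes are needed. The loop must come around often enough that it samples the line while it is still high, before the radio's interrupt flags are cleared.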

khachikyannarek commented 1 year ago

@TRudolphi thanks for your answer. I also made the same change that you suggested in the comments above. I use a Heltec WiFi ESP32 (v2). It seems that the problem is not related to interrupts, because in that case the ESP32 should start working again after a restart. In my case I am facing an issue with the LoRa module: in some cases the LoRa receiver part isn't working (nothing is received even if the sender is close), and in other cases the LoRa sender part