ngraziano / LMICPP-Arduino

Lmic (LoraWAN-in-C) modified to C++
61 stars 13 forks source link

Core panic after a confirmed downlink message #9

Closed marciolm closed 4 years ago

marciolm commented 4 years ago

Hello, The "esp32" example is working fine in the US915 band, even with normal downlink messages, but every time I try to send one "confirmed downlink message" through the TTN console, I got a core panic message end the esp32 is rebooted. The debug messages are the following ( a little before and after the crash):

`251759290 Init Configure 251760430 Status b2 mode : 3, status 1 251760438 TXMODE, freq=904900000, len=14, SF=7, BW=125, CR=4/5 251763363 Not using interupt trigger 0 251763373 Set Radio to sleep 251763380 End TX 251763361 251763386 Rx delay : 972 ms 251821675 Init Configure 251837987 Not using interupt trigger 0 251837995 RX timeout 251838003 Set Radio to sleep 251838009 End RX - Start RX : 220992 us 251838023 Decode Frame RX1 251838175 No downlink data 251838327 Rx delay : 1959 ms 251883302 Init Configure 251895052 Not using interupt trigger 0 251895061 RX timeout 251895068 Set Radio to sleep 251895074 End RX - Start RX : 148000 us 251895090 Decode Frame RX2 251895240 No downlink data 251895392 EV_TXCOMPLETE (includes waiting for RX windows) Guru Meditation Error: Core 1 panic'ed (LoadProhibited). Exception was unhandled. Core 1 register dump: PC : 0x400ec6e7 PS : 0x00060130 A0 : 0x800d29a0 A1 : 0x3ffb1e30
A2 : 0x00000008 A3 : 0x3ffc00f4 A4 : 0x3ffb6610 A5 : 0x3ffb1b20
A6 : 0x3ffb1b20 A7 : 0x00000004 A8 : 0x00000000 A9 : 0x3ffb1e00
A10 : 0x00000000 A11 : 0x3ffb1e48 A12 : 0xf039fa05 A13 : 0xfffd6213 A14 : 0xffffffff A15 : 0xff000000 SAR : 0x0000000c EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000008 LBEG : 0x400014fd LEND : 0x4000150d LCOUNT : 0xffffffff

Backtrace: 0x400ec6e7:0x3ffb1e30 0x400d299d:0x3ffb1e50 0x400d29d9:0x3ffb1e70 0x400d0f05:0x3ffb1e90 0x400d21ae:0x3ffb1eb0 0x400d2361:0x3ffb1ed0 0x400d24c5:0x3ffb1ef0 0x400d24dc:0x3ffb1f10 0x400d252a:0x3ffb1f30 0x400ec404:0x3ffb1f50 0x400d2a12:0x3ffb1f70 0x400d0fbe:0x3ffb1f90 0x400d4c05:0x3ffb1fb0 0x40088215:0x3ffb1fd0

Rebooting... ets Jun 8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT) configsip: 0, SPIWP:0xee clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00 mode:DIO, clock div:2 load:0x3fff0018,len:4 load:0x3fff001c,len:1044 load:0x40078000,len:8896 load:0x40080400,len:5828 entry 0x400806ac 251621498 Radio Init 251621914 Set Radio to sleep 251621927 Set Radio to sleep 251622061 Packet queued 251622076 Ready for uplink 251622107 Time on air : 370 ms 251622121 Init Configure 251623276 Status 32 mode : 3, status 1 251623287 TXMODE, freq=902300000, len=23, SF=10, BW=125, CR=4/5 251646480 Not using interupt trigger 0 251646491 Set Radio to sleep 251646498 End TX 251646479 ` Any idea ? Thank you Márcio

ngraziano commented 4 years ago

Hi,

Sorry I was not able to reproduce. I try with

I don't have ESP32 with SX1262.

Does the crash append just after the first send with ACK asked ? Normaly the library will retry 8 time before giving up and call TXCOMPLETE.

I see the radio did not receive the ACK packet ("RX timeout" each time) does the crash append when radio receive ACK packet ? of maybe normal packet ?

You can try to put some PRINT_DEBUG(1, F("HERE ...")); in onEvent function and in librairie code to detect where it crash.

Best regards, Nicolas

marciolm commented 4 years ago

Hello Nicolas I tried with a new board TTGO Lora Oled V1 (ESP32+SX1276) and the same problem occurred with confirmed downlink messages. What I noticed there is an immediate transmission of a new packet with the next counter number and port "0" after the confirmation and before the crash. The screen of the TTN console is attached, as well as some debug logs.

---- Downlink not confirmed ------

10639125 Ready for uplink 10639137 Build pkt # 4 10639145 Time on air : 288 ms 10639161 Config power to 17 on PA_BOOST 10639180 TXMODE, freq=904700000, len=14, SF=10, BW=125, CR=4/5 10639449 Packet queued 10657241 Not using interupt trigger 0 10657251 End TX 10657240 10657259 Rx delay : 976 ms 10718270 RXMODE_SINGLE, freq=925700000, SF=10, BW=500, CR=4/5, IH=0 10724979 Not using interupt trigger 0 10724996 End RX - Start RX : 107360 us 10725005 Decode Frame RX1 10725019 Received downlink, port=1, ack=0 10725102 EV_TXCOMPLETE (includes waiting for RX windows) 10725416 Received 5 bytes of payload 10725623 Data:41 10725721 Data:61 10725818 Data:33 10725916 Data:44 10726013 Data:55 12601122 Ready for uplink

---- Downlink Confirmed ----------

16818394 RXMODE_SINGLE, freq=923300000, SF=10, BW=500, CR=4/5, IH=0 16822372 Not using interupt trigger 0 16822380 RX timeout 16822391 End RX - Start RX : 63376 us 16822400 Decode Frame RX1 16822405 No downlink data 16822516 Rx delay : 1959 ms 16879833 RXMODE_SINGLE, freq=923300000, SF=12, BW=500, CR=4/5, IH=0 16889058 Not using interupt trigger 0 16889067 RX timeout 16889078 End RX - Start RX : 147760 us 16889087 Decode Frame RX2 16889092 No downlink data 16889208 EV_TXCOMPLETE (includes waiting for RX windows) Guru Meditation Error: Core 1 panic'ed (LoadProhibited). Exception was unhandled. Core 1 register dump: PC : 0x400ec413 PS : 0x00060130 A0 : 0x800d29cc A1 : 0x3ffb1e30 A2 : 0x00000008 A3 : 0x3ffc00f4 A4 : 0x3ffb6610 A5 : 0x3ffb1b20
A6 : 0x3ffb1b20 A7 : 0x00000004 A8 : 0x00000000 A9 : 0x3ffb1e00
A10 : 0x00000000 A11 : 0x3ffb1e48 A12 : 0x101b6b33 A13 : 0xfffd413e A14 : 0xffffffff A15 : 0xff000000 SAR : 0x0000000c EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000008 LBEG : 0x400014fd LEND : 0x4000150d LCOUNT : 0xffffffff

Backtrace: 0x400ec413:0x3ffb1e30 0x400d29c9:0x3ffb1e50 0x400d2a05:0x3ffb1e70 0x400d0ed9:0x3ffb1e90 0x400d21da:0x3ffb1eb0 0x400d238d:0x3ffb1ed0 0x400d24f1:0x3ffb1ef0 0x400d2508:0x3ffb1f10 0x400d2556:0x3ffb1f30 0x400ec11c:0x3ffb1f50 0x400d2a3e:0x3ffb1f70 0x400d0fea:0x3ffb1f90 0x400d494d:0x3ffb1fb0 0x40088215:0x3ffb1fd0

Rebooting... ets Jun 8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT) configsip: 0, SPIWP:0xee clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00 mode:DIO, clock div:2 load:0x3fff0018,len:4 load:0x3fff001c,len:1044 load:0x40078000,len:8896 load:0x40080400,len:5828 entry 0x400806ac 16909401 Chip version : 18 16909475 Packet queued 16909491 Ready for uplink


2020-02-12 21_03_06-The Things Network Console

Your library is very useful even with non confirmed messages, thank you. Márcio

ngraziano commented 4 years ago

Hi,

Can you try to recover me the code line which cause the crash.

To do so, you need to use the following steps : Change platformio.ini file and set build_flags to:

build_flags = -Og -ggdb

Compile, upload to the board and make it crash. Keep the backtrace, for example:

Backtrace: 0x400d0ec8:0x3ffb1e90 0x400d1e1a:0x3ffb1eb0 0x400d1fcd:0x3ffb1ed0 0x400d2131:0x3ffb1ef0 0x400d2148:0x3ffb1f10 0x400d2198:0x3ffb1f30 0x400ec03c:0x3ffb1f50 0x400d25fa:0x3ffb1f70 0x400d0faa:0x3ffb1f90 0x400d4911:0x3ffb1fb0 0x40088215:0x3ffb1fd0

Launch GDB on your code (you will need to adapt the path in following command)

C:\users\nicol\.platformio\packages\toolchain-xtensa32\bin\xtensa-esp32-elf-gdb.exe .\.pio\build\esp32\firmware.elf 

And get the source file and line number with gdb command info line with adress in backtrace (first part before ":")

(gdb) info line *0x400d0ec8
Line 59 of "src\main.cpp" starts at address 0x400d0ec8 <onEvent(EventType)+124> and ends at 0x400d0ed0 <setup()>.

And you can continue 1 or 2 time in the bakctrace.

(gdb) info line *0x400d1e1a
Line 135 of "lib\LMICPP-Arduino\src\lmic\lmic.cpp" starts at address 0x400d1e1a <Lmic::reportEvent(EventType)+10>
   and ends at 0x400d1e1d <Lmic::reportEvent(EventType)+13>.

If the problem is in main.cpp please send me a code extract

It will help me to understand the problem. Thanks

marciolm commented 4 years ago

Hello Nicolas, Thanks for the GDB lesson :)

Here is the debug info:

Backtrace: 0x400ec413:0x3ffb1e30 0x400d29c9:0x3ffb1e50 0x400d2a05:0x3ffb1e70 0x400d0ed9:0x3ffb1e90 0x400d21da:0x3ffb1eb0 0x400d238d:0x3ffb1ed0 0x400d24f1:0x3ffb1ef0 0x400d2508:0x3ffb1f10 0x400d2556:0x3ffb1f30 0x400ec11c:0x3ffb1f50 0x400d2a3e:0x3ffb1f70 0x400d0fea:0x3ffb1f90 0x400d494d:0x3ffb1fb0 0x40088215:0x3ffb1fd0

(gdb) file C:/Users/Marcio/Documents/PlatformIO/Projects/sx1276-lmiccpp-ttgo/.pio/build/ttgo-lora32-v1/firmware.elf Reading symbols from C:/Users/Marcio/Documents/PlatformIO/Projects/sx1276-lmiccpp-ttgo/.pio/build/ttgo-lora32-v1/firmware.elf...done. (gdb) info line 0x400ec413 Line 45 of ".pio\libdeps\ttgo-lora32-v1\LMICPP-Arduino\src\lmic\oslmic.cpp" starts at address 0x400ec413 <OsScheduler::unlinkScheduledJobs(OsJobBase&)+3> and ends at 0x400ec417 <OsScheduler::unlinkScheduledJobs(OsJobBase&)+7>. (gdb) info line 0x400d29c9 Line 63 of ".pio\libdeps\ttgo-lora32-v1\LMICPP-Arduino\src\lmic\oslmic.cpp" starts at address 0x400d29c3 <OsJobBase::setTimed(OsTime)+3> and ends at 0x400d29cc <OsJobBase::setTimed(OsTime)+12>. (gdb) info line 0x400d2a05 Line 57 of ".pio\libdeps\ttgo-lora32-v1\LMICPP-Arduino\src\lmic\oslmic.cpp" starts at address 0x400d2a03 <OsJob::setTimedCallback(OsTime, void ()())+7> and ends at 0x400d2a0a. (gdb) info line 0x400d0ed9 Line 68 of "src\main.cpp" starts at address 0x400d0eca <onEvent(EventType)+98> and ends at 0x400d0ede <onEvent(EventType)+118>. (gdb) info line 0x400d21da Line 135 of ".pio\libdeps\ttgo-lora32-v1\LMICPP-Arduino\src\lmic\lmic.cpp" starts at address 0x400d21da <Lmic::reportEvent(EventType)+10> and ends at 0x400d21dd <Lmic::reportEvent(EventType)+13>. (gdb) info line 0x400d238d Line 910 of ".pio\libdeps\ttgo-lora32-v1\LMICPP-Arduino\src\lmic\lmic.cpp" starts at address 0x400d2387 <Lmic::processDnData()+27> and ends at 0x400d2390 <Lmic::processDnData()+36>. (gdb) info line 0x400d24f1 Line 679 of ".pio\libdeps\ttgo-lora32-v1\LMICPP-Arduino\src\lmic\lmic.cpp" starts at address 0x400d24ee <Lmic::processRx2DnData()+146> and ends at 0x400d24f6. (gdb) info line 0x400d2508 Line 640 of ".pio\libdeps\ttgo-lora32-v1\LMICPP-Arduino\src\lmic\lmic.cpp" starts at address 0x400d2508 <Lmic::processRxDnData()+16> and ends at 0x400d250d. (gdb) info line 0x400d2556 Line 1182 of ".pio\libdeps\ttgo-lora32-v1\LMICPP-Arduino\src\lmic\lmic.cpp" starts at address 0x400d2556 <Lmic::wait_end_rx()+70> and ends at 0x400d255d <Lmic::wait_end_rx()+77>. (gdb) info line 0x400ec11c Line 118 of ".pio\libdeps\ttgo-lora32-v1\LMICPP-Arduino\src\lmic../aes/../lmic/oslmic.h" starts at address 0x400ec107 <OsJobType::call() const+3> and ends at 0x400ec124 <Lmic::setDrJoin(unsigned char)>. (gdb) info line 0x400d2a3e Line 81 of ".pio\libdeps\ttgo-lora32-v1\LMICPP-Arduino\src\lmic\oslmic.cpp" starts at address 0x400d2a3a <OsScheduler::runloopOnce()+46> and ends at 0x400d2a44 <os_init()>. (gdb) info line 0x400d0fea Line 134 of "src\main.cpp" starts at address 0x400d0fe7 <loop()+3> and ends at 0x400d0fed <loop()+9>. (gdb) info line 0x400d494d Line 19 of "C:\Users\Marcio.platformio\packages\framework-arduinoespressif32\cores\esp32\main.cpp" starts at address 0x400d494d <loopTask(void)+21> and ends at 0x400d4950 <loopTask(void)+24>. (gdb) info line *0x40088215 Line 143 of "/home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port.c" starts at address 0x40088213 <vPortTaskWrapper+3> and ends at 0x40088218 <vPortTaskWrapper+8>. (gdb)

Code extract of main.cpp: ... 66 // we have transmit 67 // Schedule next transmission 68 sendjob.setTimedCallback(os_getTime() + TX_INTERVAL, do_send); 69 break; ... 133 void loop() { 134 OsDeltaTime to_wait = OSS.runloopOnce(); 135 if (to_wait > OsDeltaTime(0)) { ...

Márcio

ngraziano commented 4 years ago

The problem should be solve we the last commit (I finaly was able to reproduce).

marciolm commented 4 years ago

Hello Nicolas, I patched the library and no crashes happened anymore. Thanks a lot. One thing, when there is a confirmed downlink message, just after the uplink message transmission, one extra uplink message with next counter number, no payload, and port "0" is transmitted. I attached the picture of the TTN console. For unconfirmed messages, this behavior does not happen. Márcio 2020-02-15 14_07_58-The Things Network Console

ngraziano commented 4 years ago

The Uplink with no data is normal.
The library is sending a soon as possible (with respect to duty cycle) a frame with no data and the ack flag set. The ack and the frame Number 10 are part of the same uplink frame.

If you don't want this frame you can modify the library but the ack will be send only next time you send data. To do so, edit lmic.cpp at line 419 and change code to the following

  // DN frame requested confirmation - provide ACK once with next UP frame
  dnConf = ftype == mhdr::ftype_data_conf_down ? FCT_ACK : 0;
  if (fct & FCT_MORE)
    opmode.set(OpState::POLL);

the library will not generate frame if you send nothing. Nicolas

marciolm commented 4 years ago

Thanks for the explanation and for this great library. Marcio