sudomesh / disaster-radio

A (paused) work-in-progress long-range, low-bandwidth wireless disaster recovery mesh network powered by the sun.
https://disaster.radio

device restart when input any command from serial port #77

Closed yuchangyuan closed 4 years ago

yuchangyuan commented 4 years ago

The device restarts after any command is entered.

Firmware revision: 44cf2dfc75a68d6309bbc8a20345a3e2ef2dee47
Hardware: TTGOv2

Below is the log from platformio device monitor when entering the /help command:

--- Available filters and text transformations: colorize, debug, default, direct, esp32_exception_decoder, hexlify, log2file, nocontrol, printable, send_on_enter, time
--- More details at http://bit.ly/pio-monitor-filters
--- Miniterm on /dev/ttyUSB0  115200,8,N,1 ---
--- Quit: Ctrl+C | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---

     ___              __                            ___
 ___/ (_)__ ___ ____ / /____ ____      _______ ____/ (_)__
/ _  / (_-</ _ `(_-</ __/ -_) __/ _   / __/ _ `/ _  / / _ \
\_,_/_/___/\_,_/___/\__/\__/_/   (_) /_/  \_,_/\_,_/_/\___/
v1.0.0-rc.2
Local address of your node is ���?h��?8xV��V�����?4␒��␐
Type '/join NICKNAME' to join the chat, or '/help' for more commands.
< > /help
Commands: /help /join /nick /raw /lora /set /restart

Stack smashing protect failure!

abort() was called at PC 0x400f8d93 on core 1

Backtrace: 0x400929ec:0x3ffd1e50 0x40092c1d:0x3ffd1e70 0x400f8d93:0x3ffd1e90 0x400d659a:0x3ffd1eb0 0x400d65cb:0x3ffd2100 0x400d65fb:0x3ffd2120 0x400d4062:0x3ffd2240 0x402106d3:0x3ffd2480 0x402106d3:0x3ffd24a0 0x402106d3:0x3ffd24c0 0x400d670a:0x3ffd24e0 0x400d32c9:0x3ffd2500 0x400f5141:0x3ffd2520 0x4008f101:0x3ffd2540

Rebooting...
ets Jun  8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 188777542, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:1044
load:0x40078000,len:8896
load:0x40080400,len:5828
entry 0x400806ac
Got username n2
Got IF setting BLE
* Initializing SD card...
[E][sd_diskio.cpp:739] sdcard_mount(): f_mount failed 0x(3)
 --> SD card not found
* Initializing SPIFFS...
E (547) SPIFFS: mount failed, -10025
[E][SPIFFS.cpp:72] begin(): Mounting SPIFFS failed! Error: -1
-* Initializing serial...
* Initializing BLE...
* Initializing LoRa...
 --> Layer1 init succeeded
 --> LL2 init succeeded
 --> LL2 node address: 918c1b8c
* Initializing display...
WelcomeMessage username: >n2<

00c|Welcome to DISASTER RADIO >n2<< >
00c|WARNING: SD card not found, functionality may be limited< >
paidforby commented 4 years ago

Weird, I thought I tested this. I may have introduced a bug in 9d137b4f5f8d88131f2c0b4dd2e638cf73cfec5a, or in another commit where I changed the console so it could be shared between the dev board and the simulator. I'll look into this when I get a chance.

deafboy commented 4 years ago

I had to revert even further, to commit fbf2187da4e0e6828c11929be720f3ab243bdd40, to avoid this problem.

paidforby commented 4 years ago

Hmm, maybe it's more likely that the commit https://github.com/sudomesh/disaster-radio/commit/43075131149f1456c2364c3cf8fbb5121cefae65, which updated LoRaLayer2, caused this bug? Or maybe try updating to the latest commit of LoRaLayer2 (https://github.com/sudomesh/LoRaLayer2/commit/b071e53ceb6d2530fa2de28019165fd7534a378f). I'm not sure where I left off there, but it's possible I neglected to update the commit in platformio.ini.

rawesomeawesome commented 4 years ago

I've managed to compile the platform myself, having to remove tinygps to make it work for whatever reason, but I'm now facing this issue. The only reason I was unable to use the precompiled bin, which didn't do this, is that the frequency was set wrong for my boards, which are cheaper 433 MHz models.

Solved it. The issue seems to have been introduced between v1.0.0 and master, so compiling v1.0.0 without tinygps worked perfectly.

paidforby commented 4 years ago

I found the cause of this bug. The culprit is the following line in src/middleware/Console.cpp:

memset(response.message, 0, DATAGRAM_MESSAGE);

Looks like I forgot to update the DATAGRAM_MESSAGE constant to match the updates to LoRaLayer2. This was causing a buffer overflow during this memset and triggering the stack smashing failure. The quick solution is to update DATAGRAM_MESSAGE to the correct number (i.e. 238). The right solution is to move this constant into LoRaLayer2.h and use that instead, since that is where all routing protocol related changes are made.
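One way to keep a size constant like this from drifting out of sync is to derive it from the struct definition and guard it with a compile-time check. The following is only a minimal sketch: the `Datagram` layout (4-byte destination, 1-byte type, 233-byte message, adding up to the 238 bytes mentioned above) and the names `kDatagramHeader` and `kDatagramMessage` are assumptions for illustration, not the actual definitions in LoRaLayer2.h.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout for illustration only; the real struct is defined in
// LoRaLayer2 and may differ.
struct Datagram {
    uint8_t destination[4];
    uint8_t type;
    uint8_t message[233];
};

// Derive the sizes from the struct itself instead of maintaining a separate
// hand-written constant in another project.
constexpr std::size_t kDatagramHeader = offsetof(Datagram, message);
constexpr std::size_t kDatagramMessage = sizeof(Datagram::message);

// If the struct changes and the sizes no longer add up, the build fails
// instead of the device crashing at runtime.
static_assert(kDatagramHeader + kDatagramMessage == sizeof(Datagram),
              "header and message sizes must cover the whole datagram");
static_assert(sizeof(Datagram) == 238,
              "datagram size changed; review every buffer that holds one");
```

With a guard like this, a change to the datagram layout shows up as a build error rather than a stack smashing abort on the device.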

Sorry about this problem, I've previously struggled with how to architect this project. That DATAGRAM_MESSAGE constant is a leftover from when I defined the datagram struct in the Disaster Radio code and not in LoRaLayer2.

I'll push a commit shortly that will make the quick fix. I still need to test the right solution of moving DATAGRAM_MESSAGE to LoRaLayer2.h to make sure it doesn't have any unintended side effects.

paidforby commented 4 years ago

I should also point out that I was memsetting only the message portion of the response datagram, but with a length of DATAGRAM_MESSAGE, which represents the total length of the datagram (i.e. header + message). That is wrong and would overflow the message buffer. Anyway, I'll fix that as well, by memsetting the whole datagram.
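A rough sketch of what clearing the whole datagram could look like, using `sizeof` so the length always matches the buffer being cleared. The `Datagram` layout and the helper names `clearDatagram`, `clearMessage`, and `clearDatagramSelfCheck` are hypothetical and chosen for illustration; this is not the actual fix that was committed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical layout for illustration only; the real struct is defined in
// LoRaLayer2 and may differ.
struct Datagram {
    uint8_t destination[4];
    uint8_t type;
    uint8_t message[233];
};

// Clear the whole datagram. sizeof(d) is the size of the object itself, so
// the length cannot exceed the destination buffer.
inline void clearDatagram(Datagram &d) {
    std::memset(&d, 0, sizeof(d));
}

// Clear only the payload and leave the header untouched. The length is the
// size of the message array, not the size of the whole datagram.
inline void clearMessage(Datagram &d) {
    std::memset(d.message, 0, sizeof(d.message));
}

// Small self-check: fill with a marker byte, then verify what each helper
// clears.
inline void clearDatagramSelfCheck() {
    Datagram d;
    std::memset(&d, 0xFF, sizeof(d));

    clearMessage(d);
    assert(d.type == 0xFF);       // header untouched
    assert(d.message[0] == 0);    // payload cleared

    clearDatagram(d);
    assert(d.type == 0);          // header cleared as well
    assert(d.destination[3] == 0);
}
```

The point of the sketch is that the destination pointer and the length passed to `memset` refer to the same object, which is the mismatch described above.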