espressif / esp-mesh-lite

A lite version Wi-Fi Mesh, each node can access the network over the IP layer.

Board resets when doing a speedtest (AEGHB-221) #9

Open redfast00 opened 1 year ago

redfast00 commented 1 year ago

(while working on #8, I encountered the following issue) I'm using the mesh_local_control example, on 2 ESP32s2 boards. The first is connected to the router, the second to the first, and my phone to the network of the second. I have network access on my phone, but when I run a speedtest (via https://speedtest.ugent.be), the second board resets with:

E (216807) task_wdt: Task watchdog got triggered. The following tasks/users did not reset the watchdog in time:
E (216807) task_wdt:  - IDLE (CPU 0)
E (216807) task_wdt: Tasks currently running:
E (216807) task_wdt: CPU 0: wifi
E (216807) task_wdt: Aborting.
E (216807) task_wdt: Print CPU 0 (current core) backtrace

Backtrace: 0x400cff5d:0x3ffd5e80 0x400d00c2:0x3ffd5ea0 0x40031aa5:0x3ffd5ec0 0x400322d3:0x3ffd5ef0 0x4003151e:0x3ffd5f10 0x40031539:0x3ffd5f40 0x40031705:0x3ffd5f60 0x400cd03f:0x3ffd5fa0 0x400cd7c3:0x3ffd5fd0 0x400328aa:0x3ffd6010 0x4002d131:0x3ffd6040

This is with the unmodified code of b32e3e1. I realise that this is likely not enough information to debug/fix this, so I'll follow any instructions to get more debug information to help diagnose this issue.
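One way to get more out of a raw backtrace like the one above is to pull out the program-counter addresses and hand them to the toolchain's addr2line. The sketch below only parses the line and builds the command; it does not run anything. The tool name `xtensa-esp32s2-elf-addr2line` and the ELF path in the usage line are assumptions about a typical ESP-IDF v5.0.x setup for the ESP32-S2, not something confirmed in this thread.

```python
import re

# One "PC:SP" pair as printed on an ESP-IDF "Backtrace:" line.
_FRAME = re.compile(r"(0x[0-9a-fA-F]{8}):(0x[0-9a-fA-F]{8})")


def backtrace_pcs(line):
    """Return the program-counter addresses from a 'Backtrace:' line, in order."""
    if not line.lstrip().startswith("Backtrace:"):
        return []
    return [pc for pc, _sp in _FRAME.findall(line)]


def addr2line_command(line, elf_path, tool="xtensa-esp32s2-elf-addr2line"):
    """Build (but do not run) an addr2line command for the backtrace.

    `tool` is an assumed toolchain binary name; adjust it to your install.
    """
    return [tool, "-pfiaC", "-e", elf_path] + backtrace_pcs(line)


if __name__ == "__main__":
    sample = "Backtrace: 0x400cf874:0x3ffd60f0 0x40032861:0x3ffd6110 0x4002d131:0x3ffd6140"
    # "build/mesh_local_control.elf" is a hypothetical path based on the project name.
    print(" ".join(addr2line_command(sample, "build/mesh_local_control.elf")))
```

Symbols inside the closed-source Wi-Fi library may still resolve without file or line information, so this mainly helps with frames in application and open-source component code.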

redfast00 commented 1 year ago

I was able to reproduce this issue with just one ESP32-S2-WROOM board, connected to a Wi-Fi AP. My phone is connected to the AP provided by the ESP32-S2-WROOM board, running the example mesh_local_control with the latest code on the master branch (0830a140d893528cffc90ef985eefbf04ab4d0bb). Below is the entire log, obtained via the Monitor feature. This also decoded the addresses in the backtrace, but this doesn't seem to be immediately useful. Note that I redacted the SSID and password of the access point; they were replaced with REDACTED. This is still with https://speedtest.ugent.be, but I'm also able to reproduce this with https://speedtest.net. I get about 4 Mbps down before the ESP32 restarts.

ESP-ROM:esp32s2-rc4-20191025
Build:Oct 25 2019
rst:0x1 (POWERON),boot:0x8 (SPI_FAST_FLASH_BOOT)
SPIWP:0xee
mode:DIO, clock div:1
load:0x3ffe6108,len:0x1764
load:0x4004c000,len:0xabc
load:0x40050000,len:0x31b4
entry 0x4004c1c0
I (21) boot: ESP-IDF v5.0.1-dirty 2nd stage bootloader
I (21) boot: compile time 21:49:30
I (21) boot: chip revision: v0.0
I (24) boot.esp32s2: SPI Speed      : 80MHz
I (29) boot.esp32s2: SPI Mode       : DIO
I (34) boot.esp32s2: SPI Flash Size : 4MB
I (39) boot: Enabling RNG early entropy source...
I (44) boot: Partition Table:
I (48) boot: ## Label            Usage          Type ST Offset   Length
I (55) boot:  0 nvs              WiFi data        01 02 00009000 00004000
I (62) boot:  1 otadata          OTA data         01 00 0000d000 00002000
I (70) boot:  2 phy_init         RF data          01 01 0000f000 00001000
I (77) boot:  3 ota_0            OTA app          00 10 00010000 001e0000
I (85) boot:  4 ota_1            OTA app          00 11 001f0000 001e0000
I (92) boot:  5 coredump         Unknown data     01 03 003d0000 00010000
I (100) boot:  6 reserved         Unknown data     01 fe 003e0000 00020000
I (107) boot: End of partition table
I (112) esp_image: segment 0: paddr=00010020 vaddr=3f000020 size=219b8h (137656) map
I (148) esp_image: segment 1: paddr=000319e0 vaddr=3ffc61e0 size=02db0h ( 11696) load
I (151) esp_image: segment 2: paddr=00034798 vaddr=40022000 size=0b880h ( 47232) load
I (165) esp_image: segment 3: paddr=00040020 vaddr=40080020 size=866b8h (550584) map
I (275) esp_image: segment 4: paddr=000c66e0 vaddr=4002d880 size=08954h ( 35156) load
I (294) boot: Loaded app from partition at offset 0x10000
I (294) boot: Disabling RNG early entropy source...
I (306) cache: Instruction cache        : size 8KB, 4Ways, cache line size 32Byte
I (306) cpu_start: Pro cpu up.
I (327) cpu_start: Pro cpu start user code
I (327) cpu_start: cpu freq: 160000000 Hz
I (327) cpu_start: Application information:
I (330) cpu_start: Project name:     mesh_local_control
I (336) cpu_start: App version:      0830a14
I (341) cpu_start: Compile time:     Jun  6 2023 21:49:12
I (347) cpu_start: ELF file SHA256:  12e276d55039a93b...
I (353) cpu_start: ESP-IDF:          v5.0.1-dirty
I (358) cpu_start: Min chip rev:     v0.0
I (363) cpu_start: Max chip rev:     v1.99 
I (368) cpu_start: Chip rev:         v0.0
I (373) heap_init: Initializing. RAM available for dynamic allocation:
I (380) heap_init: At 3FFCCEE8 len 0002F118 (188 KiB): DRAM
I (386) heap_init: At 3FFFC000 len 00003A10 (14 KiB): DRAM
I (392) heap_init: At 3FF9E000 len 00002000 (8 KiB): RTCRAM
I (399) spi_flash: detected chip: generic
I (403) spi_flash: flash io: dio
I (411) cpu_start: Starting scheduler on PRO CPU.
I (433) bridge_common: esp-iot-bridge version: 0.5.0
I (435) wifi:wifi driver task: 3ffd6174, prio:23, stack:6656, core=0
I (437) system_api: Base MAC address is not set
I (442) system_api: read default base MAC address from EFUSE
I (455) wifi:wifi firmware version: 17afb16
I (455) wifi:wifi certification version: v7.0
I (456) wifi:config NVS flash: enabled
I (459) wifi:config nano formating: disabled
I (464) wifi:Init data frame dynamic rx buffer num: 32
I (468) wifi:Init management frame dynamic rx buffer num: 32
I (474) wifi:Init management short buffer num: 32
I (478) wifi:Init dynamic tx buffer num: 32
I (482) wifi:Init static rx buffer size: 1600
I (486) wifi:Init static rx buffer num: 10
I (490) wifi:Init dynamic rx buffer num: 32
I (494) wifi_init: tcpip mbox: 32
I (498) wifi_init: udp mbox: 6
I (501) wifi_init: tcp mbox: 6
I (505) wifi_init: tcp tx win: 5744
I (509) wifi_init: tcp rx win: 5744
I (514) wifi_init: tcp mss: 1440
I (517) wifi_init: WiFi IRAM OP enabled
I (522) wifi_init: WiFi RX IRAM OP enabled
I (527) phy_init: phy_version 2300,d67cf06,Feb 10 2022,10:03:07
I (575) wifi:mode : null
I (576) ip select: IP Address:192.168.4.1
I (576) ip select: GW Address:192.168.4.1
I (576) ip select: NM Address:255.255.255.0
I (580) bridge_wifi: IP Address:192.168.4.1
Add netif ap with 0830a14(commit id)
I (589) bridge_common: netif list add success
I (594) wifi:mode : softAP (7c:df:a1:3c:f8:1b)
I (599) wifi:Total power save buffer number: 16
I (602) wifi:Init max length of beacon: 752/752
I (607) wifi:Init max length of beacon: 752/752
Add netif sta with 0830a14(commit id)
I (614) bridge_common: netif list add success
I (619) wifi:mode : sta (7c:df:a1:3c:f8:1a) + softAP (7c:df:a1:3c:f8:1b)
I (626) wifi:enable tsf
I (629) bridge_wifi: Found ssid REDACTED
I (632) bridge_wifi: Found password REDACTED
I (638) bridge_wifi: sta ssid: REDACTED password: REDACTED
W (645) vendor_ie: Error Get[4354]
W (648) vendor_ie: Error Get[4354]
I (1745) wifi:Total power save buffer number: 16
I (1750) bridge_wifi: softap ssid: ESP_Bridge_3cf81b password: 12345678
I (1750) Mesh-Lite: esp-mesh-lite component version: 0.1.2
Mesh-Lite commit id: 4a6dd88
I (1756) vendor_ie: Mesh ID: 77
W (1759) vendor_ie: Error Get[4354]
W (1763) vendor_ie: Error Get[4354]
I (1768) ESP_Mesh_Lite_Comm: msg action add success
I (1774) ESP_Mesh_Lite_Comm: Bind Socket 54, port 6364
I (1779) ESP_Mesh_Lite_Comm: Bind Socket 55, port 6363
I (1785) ESP_Mesh_Lite_Comm: Bind Socket 56, port 6366
I (1791) ESP_Mesh_Lite_Comm: Bind Socket 57, port 6365
I (1797) Mesh-Lite: Mesh-Lite connecting
I (1802) ESP_Mesh_Lite_Comm: msg action add success
I (4644) vendor_ie: Mesh-Lite Scan done
I (4807) wifi:new:<1,1>, old:<1,1>, ap:<1,1>, sta:<1,0>, prof:1
I (5898) wifi:state: init -> auth (b0)
I (6027) wifi:state: auth -> assoc (0)
I (6038) wifi:state: assoc -> run (10)
I (6051) wifi:connected with REDACTED, aid = 6, channel 1, BW20, bssid = 4c:ed:fb:35:22:a8
I (6051) wifi:security: WPA2-PSK, phy: bgn, rssi: -42
I (6053) wifi:pm start, type: 1

I (6079) wifi:AP's beacon interval = 102400 us, DTIM period = 2
I (7560) esp_netif_handlers: sta ip: 10.1.0.204, mask: 255.0.0.0, gw: 10.0.0.1
I (7561) bridge_wifi: Connected with IP Address:10.1.0.204
I (7564) vendor_ie: RTC store: REDACTED
I (7582) local_control: Create a tcp client, ip: 10.0.0.8, port: 8080
E (10446) local_control: socket connect, ret: -1, ip: 10.0.0.8, port: 8080
I (10448) ESP_Mesh_Lite_Comm: approved: 0
I (10448) local_control: TCP client write task is running
I (11806) local_control: System information, channel: 1, layer: 1, self mac: 7c:df:a1:3c:f8:1a, parent bssid: 4c:ed:fb:35:22:a8, parent rssi: -36, free heap: 125992
I (21806) local_control: System information, channel: 1, layer: 1, self mac: 7c:df:a1:3c:f8:1a, parent bssid: 4c:ed:fb:35:22:a8, parent rssi: -35, free heap: 125992
I (24197) wifi:new:<1,0>, old:<1,1>, ap:<1,1>, sta:<1,0>, prof:1
I (24198) wifi:station: 82:f6:ff:ce:f5:e8 join, AID=1, bgn, 20
I (24238) bridge_wifi: STA Connecting to the AP again...
I (24318) esp_netif_lwip: DHCP server assigned IP to a client, IP is: 192.168.4.2
E (41687) task_wdt: Task watchdog got triggered. The following tasks/users did not reset the watchdog in time:
E (41687) task_wdt:  - IDLE (CPU 0)
E (41687) task_wdt: Tasks currently running:
E (41687) task_wdt: CPU 0: wifi
E (41687) task_wdt: Aborting.
E (41687) task_wdt: Print CPU 0 (current core) backtrace

Backtrace: 0x400cf874:0x3ffd60f0 0x40032861:0x3ffd6110 0x4002d131:0x3ffd6140
0x400cf874: ppProcTxDone at ??:?

0x40032861: ppTask at ??:?

0x4002d131: vPortTaskWrapper at /home/user/esp/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:154

ELF file SHA256: 12e276d55039a93b

Rebooting...
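For comparing several crash logs like the one above, a small script can pull out which task failed to feed the watchdog and which task was running at the time. This is a minimal sketch that assumes the `task_wdt` line format shown in this log; the function name `summarize_wdt` is made up for illustration.

```python
import re

# "E (41687) task_wdt:  - IDLE (CPU 0)"  -> task that did not reset the watchdog
_STARVED = re.compile(r"task_wdt:\s+-\s+(.+?)\s+\(CPU (\d+)\)")
# "E (41687) task_wdt: CPU 0: wifi"      -> task running when the watchdog fired
_RUNNING = re.compile(r"task_wdt:\s+CPU (\d+):\s+(\S+)")


def summarize_wdt(log_lines):
    """Collect starved tasks and per-CPU running tasks from task_wdt log lines."""
    starved = []
    running = {}
    for line in log_lines:
        match = _STARVED.search(line)
        if match:
            starved.append((match.group(1), int(match.group(2))))
            continue
        match = _RUNNING.search(line)
        if match:
            running[int(match.group(1))] = match.group(2)
    return {"starved": starved, "running": running}


if __name__ == "__main__":
    sample = [
        "E (41687) task_wdt: Task watchdog got triggered. The following tasks/users did not reset the watchdog in time:",
        "E (41687) task_wdt:  - IDLE (CPU 0)",
        "E (41687) task_wdt: Tasks currently running:",
        "E (41687) task_wdt: CPU 0: wifi",
        "E (41687) task_wdt: Aborting.",
    ]
    print(summarize_wdt(sample))
```

In the log above this reports IDLE on CPU 0 as starved while the wifi task was running, which is consistent with a higher-priority task keeping the single core busy for longer than the watchdog timeout.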
tswen commented 1 year ago

Hi, which version of ESP-IDF are you using? I need to know so that I can try to reproduce the problem.

redfast00 commented 1 year ago

@tswen the boot log says ESP-IDF v5.0.1-dirty, is this specific enough?

tswen commented 1 year ago

Okay, we need some time to analyze this issue. We'll try to send a version of the Wi-Fi library with debug logging tomorrow.

tswen commented 1 year ago

esp32s2_314a0864.zip You can use this library to replace the one located at components/esp_wifi/lib/esp32s2 in ESP-IDF.

redfast00 commented 1 year ago

I'm still able to reproduce the issue, and it doesn't seem like there are any extra debug log statements. I replaced the libraries as specified in your previous comment, used 'ESP-IDF Full Clean' in VS Code, and then rebuilt and reflashed the firmware. So two questions:

Since the speedtest sites might behave differently in China, I developed a way to reproduce this without the external internet: on a local machine in the upstream wifi network, run iperf3 -s. Then, on a laptop connected to ESP_Bridge_... network, run iperf3 -c <ip_address_of_the_iperf_server> -i 0. The -i 0 might be useful here, because it reduces the time interval between downloads to 0.

ESP-ROM:esp32s2-rc4-20191025
Build:Oct 25 2019
rst:0x1 (POWERON),boot:0x8 (SPI_FAST_FLASH_BOOT)
SPIWP:0xee
mode:DIO, clock div:1
load:0x3ffe6108,len:0x1764
load:0x4004c000,len:0xabc
load:0x40050000,len:0x31b4
entry 0x4004c1c0
I (21) boot: ESP-IDF v5.0.1-dirty 2nd stage bootloader
I (21) boot: compile time 13:35:59
I (21) boot: chip revision: v0.0
I (24) boot.esp32s2: SPI Speed      : 80MHz
I (29) boot.esp32s2: SPI Mode       : DIO
I (34) boot.esp32s2: SPI Flash Size : 4MB
I (39) boot: Enabling RNG early entropy source...
I (44) boot: Partition Table:
I (48) boot: ## Label            Usage          Type ST Offset   Length
I (55) boot:  0 nvs              WiFi data        01 02 00009000 00004000
I (62) boot:  1 otadata          OTA data         01 00 0000d000 00002000
I (70) boot:  2 phy_init         RF data          01 01 0000f000 00001000
I (77) boot:  3 ota_0            OTA app          00 10 00010000 001e0000
I (85) boot:  4 ota_1            OTA app          00 11 001f0000 001e0000
I (92) boot:  5 coredump         Unknown data     01 03 003d0000 00010000
I (100) boot:  6 reserved         Unknown data     01 fe 003e0000 00020000
I (107) boot: End of partition table
I (112) esp_image: segment 0: paddr=00010020 vaddr=3f000020 size=21a30h (137776) map
I (148) esp_image: segment 1: paddr=00031a58 vaddr=3ffc61e0 size=02db0h ( 11696) load
I (151) esp_image: segment 2: paddr=00034810 vaddr=40022000 size=0b808h ( 47112) load
I (165) esp_image: segment 3: paddr=00040020 vaddr=40080020 size=867ech (550892) map
I (275) esp_image: segment 4: paddr=000c6814 vaddr=4002d808 size=089cch ( 35276) load
I (294) boot: Loaded app from partition at offset 0x10000
I (294) boot: Disabling RNG early entropy source...
I (306) cache: Instruction cache        : size 8KB, 4Ways, cache line size 32Byte
I (306) cpu_start: Pro cpu up.
I (327) cpu_start: Pro cpu start user code
I (327) cpu_start: cpu freq: 160000000 Hz
I (327) cpu_start: Application information:
I (330) cpu_start: Project name:     mesh_local_control
I (336) cpu_start: App version:      0830a14
I (341) cpu_start: Compile time:     Jun  9 2023 13:35:49
I (347) cpu_start: ELF file SHA256:  9d3c1bc8b0c8b4af...
I (353) cpu_start: ESP-IDF:          v5.0.1-dirty
I (358) cpu_start: Min chip rev:     v0.0
I (363) cpu_start: Max chip rev:     v1.99 
I (368) cpu_start: Chip rev:         v0.0
I (373) heap_init: Initializing. RAM available for dynamic allocation:
I (380) heap_init: At 3FFCCEE8 len 0002F118 (188 KiB): DRAM
I (386) heap_init: At 3FFFC000 len 00003A10 (14 KiB): DRAM
I (392) heap_init: At 3FF9E000 len 00002000 (8 KiB): RTCRAM
I (399) spi_flash: detected chip: generic
I (403) spi_flash: flash io: dio
I (411) cpu_start: Starting scheduler on PRO CPU.
I (433) bridge_common: esp-iot-bridge version: 0.5.0
I (435) wifi:wifi driver task: 3ffd6174, prio:23, stack:6656, core=0
I (437) system_api: Base MAC address is not set
I (442) system_api: read default base MAC address from EFUSE
I (455) wifi:wifi firmware version: 314a086
I (455) wifi:wifi certification version: v7.0
I (456) wifi:config NVS flash: enabled
I (459) wifi:config nano formating: disabled
I (464) wifi:Init data frame dynamic rx buffer num: 32
I (468) wifi:Init management frame dynamic rx buffer num: 32
I (474) wifi:Init management short buffer num: 32
I (478) wifi:Init dynamic tx buffer num: 32
I (482) wifi:Init static rx buffer size: 1600
I (486) wifi:Init static rx buffer num: 10
I (490) wifi:Init dynamic rx buffer num: 32
I (494) wifi_init: tcpip mbox: 32
I (498) wifi_init: udp mbox: 6
I (501) wifi_init: tcp mbox: 6
I (505) wifi_init: tcp tx win: 5744
I (509) wifi_init: tcp rx win: 5744
I (514) wifi_init: tcp mss: 1440
I (517) wifi_init: WiFi IRAM OP enabled
I (522) wifi_init: WiFi RX IRAM OP enabled
I (527) phy_init: phy_version 2300,d67cf06,Feb 10 2022,10:03:07
I (574) wifi:mode : null
I (575) ip select: IP Address:192.168.4.1
I (575) ip select: GW Address:192.168.4.1
I (575) ip select: NM Address:255.255.255.0
I (579) bridge_wifi: IP Address:192.168.4.1
Add netif ap with 0830a14(commit id)
I (588) bridge_common: netif list add success
I (593) wifi:mode : softAP (7c:df:a1:3c:f8:1b)
I (598) wifi:Total power save buffer number: 16
I (601) wifi:Init max length of beacon: 752/752
I (606) wifi:Init max length of beacon: 752/752
Add netif sta with 0830a14(commit id)
I (613) bridge_common: netif list add success
I (618) wifi:mode : sta (7c:df:a1:3c:f8:1a) + softAP (7c:df:a1:3c:f8:1b)
I (625) wifi:enable tsf
I (628) bridge_wifi: Found ssid REDACTED
I (631) bridge_wifi: Found password REDACTED
I (637) bridge_wifi: sta ssid: REDACTED password: REDACTED
W (644) vendor_ie: Error Get[4354]
W (647) vendor_ie: Error Get[4354]
I (1760) wifi:Total power save buffer number: 16
I (1764) bridge_wifi: softap ssid: ESP_Bridge_3cf81b password: 12345678
I (1765) Mesh-Lite: esp-mesh-lite component version: 0.1.2
Mesh-Lite commit id: 4a6dd88
I (1770) vendor_ie: Mesh ID: 77
W (1774) vendor_ie: Error Get[4354]
W (1778) vendor_ie: Error Get[4354]
I (1783) ESP_Mesh_Lite_Comm: msg action add success
I (1789) ESP_Mesh_Lite_Comm: Bind Socket 54, port 6364
I (1794) ESP_Mesh_Lite_Comm: Bind Socket 55, port 6363
I (1800) ESP_Mesh_Lite_Comm: Bind Socket 56, port 6366
I (1805) ESP_Mesh_Lite_Comm: Bind Socket 57, port 6365
I (1812) Mesh-Lite: Mesh-Lite connecting
I (1816) ESP_Mesh_Lite_Comm: msg action add success
I (4661) vendor_ie: Mesh-Lite Scan done
I (4818) wifi:new:<1,1>, old:<1,1>, ap:<1,1>, sta:<1,0>, prof:1
I (5924) wifi:state: init -> auth (b0)
I (6026) wifi:state: auth -> assoc (0)
I (6030) wifi:state: assoc -> run (10)
I (6040) wifi:connected with REDACTED, aid = 7, channel 1, BW20, bssid = 4c:ed:fb:35:22:a8
I (6040) wifi:security: WPA2-PSK, phy: bgn, rssi: -39
I (6042) wifi:pm start, type: 1

I (6092) wifi:AP's beacon interval = 102400 us, DTIM period = 2
I (7550) esp_netif_handlers: sta ip: 10.1.0.204, mask: 255.0.0.0, gw: 10.0.0.1
I (7551) bridge_wifi: Connected with IP Address:10.1.0.204
I (7554) vendor_ie: RTC store: REDACTED
I (7572) local_control: Create a tcp client, ip: 10.0.0.8, port: 8080
E (10437) local_control: socket connect, ret: -1, ip: 10.0.0.8, port: 8080
I (10439) ESP_Mesh_Lite_Comm: approved: 0
I (10439) local_control: TCP client write task is running
I (11821) local_control: System information, channel: 1, layer: 1, self mac: 7c:df:a1:3c:f8:1a, parent bssid: 4c:ed:fb:35:22:a8, parent rssi: -39, free heap: 125980
I (21821) local_control: System information, channel: 1, layer: 1, self mac: 7c:df:a1:3c:f8:1a, parent bssid: 4c:ed:fb:35:22:a8, parent rssi: -40, free heap: 125980
I (28962) wifi:new:<1,0>, old:<1,1>, ap:<1,1>, sta:<1,0>, prof:1
I (28963) wifi:station: 82:f6:ff:ce:f5:e8 join, AID=1, bgn, 20
I (29001) bridge_wifi: STA Connecting to the AP again...
I (29074) esp_netif_lwip: DHCP server assigned IP to a client, IP is: 192.168.4.2
I (31821) local_control: System information, channel: 1, layer: 1, self mac: 7c:df:a1:3c:f8:1a, parent bssid: 4c:ed:fb:35:22:a8, parent rssi: -42, free heap: 124004
I (31826) local_control: Child mac: 82:f6:ff:ce:f5:e8
I (46269) local_control: System information, channel: 1, layer: 1, self mac: 7c:df:a1:3c:f8:1a, parent bssid: 4c:ed:fb:35:22:a8, parent rssi: -46, free heap: 91692
I (46273) local_control: Child mac: 82:f6:ff:ce:f5:e8
I (53373) local_control: System information, channel: 1, layer: 1, self mac: 7c:df:a1:3c:f8:1a, parent bssid: 4c
I (69923) local_control: System information, channel: 1, layer: 1, self mac: 7c:df:a1:3c:f8:1a, parent bssid: 4c:ed:fb:35:22:a8, paren
E (70308) task_wdt: Task watchdog got triggered. The following tasks/users did not reset the watchdog in time:
E (70308) task_wdt:  - IDLE (CPU 0)
E (70308) task_wdt: Tasks currently running:
E (70308) task_wdt: CPU 0: tiT
E (70308) task_wdt: Aborting.
E (70308) task_wdt: Print CPU 0 (current core) backtrace

Backtrace: 0x401026fd:0x3ffd2bb0 0x400a1db7:0x3ffd2bd0 0x400a20ff:0x3ffd2bf0 0x400a7bce:0x3ffd2c10 0x40096ed9:0x3ffd2c30 0x40096f50:0x3ffd2c50 0x4002d131:0x3ffd2c80
0x401026fd: ip4_addr_isbroadcast_u32 at /home/user/esp/esp-idf/components/lwip/lwip/src/core/ipv4/ip4_addr.c:74

0x400a1db7: ip4_input_accept at /home/user/esp/esp-idf/components/lwip/lwip/src/core/ipv4/ip4.c:413 (discriminator 1)

0x400a20ff: ip4_input at /home/user/esp/esp-idf/components/lwip/lwip/src/core/ipv4/ip4.c:581

0x400a7bce: ethernet_input at /home/user/esp/esp-idf/components/lwip/lwip/src/netif/ethernet.c:186

0x40096ed9: tcpip_thread_handle_msg at /home/user/esp/esp-idf/components/lwip/lwip/src/api/tcpip.c:174

0x40096f50: tcpip_thread at /home/user/esp/esp-idf/components/lwip/lwip/src/api/tcpip.c:148

0x4002d131: vPortTaskWrapper at /home/user/esp/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:154

ELF file SHA256: 9d3c1bc8b0c8b4af

Rebooting...
wujiangang commented 1 year ago

@redfast00 Sorry that this is blocking you. I tried your new method but still couldn't reproduce the issue. I only get about 1.5 Mbps on my side, so the tcpip and wifi tasks may not be using much CPU here. Could you please turn off CONFIG_ESP_TASK_WDT_PANIC and measure a rough throughput? Also, since esp-mesh-lite is based on esp-iot-bridge, could you try your test case with the esp-iot-bridge wifi-route example?
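For reference, a minimal sketch of what turning off that option could look like in the project's sdkconfig.defaults, assuming the ESP-IDF v5.0.x option names. It keeps the task watchdog itself enabled, so the warning is still printed, but the board should no longer abort and reboot when it fires.

```
# sdkconfig.defaults fragment (option names as in ESP-IDF v5.0.x; an assumption for other versions)
CONFIG_ESP_TASK_WDT=y
CONFIG_ESP_TASK_WDT_PANIC=n
```

Values already present in an existing sdkconfig take precedence over sdkconfig.defaults, so the option may need to be changed through idf.py menuconfig instead, or the sdkconfig file regenerated.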

redfast00 commented 1 year ago

@wujiangang Thank you for the quick follow-ups each time; I appreciate that Espressif is invested in fixing this issue.

I noticed that a speed test does not always trigger the watchdog crash; it only happens sometimes.

This is with CONFIG_ESP_TASK_WDT still turned on (so with watchdog enabled).

Downloading from server to client does not always crash:

$ iperf3 -c 10.0.0.8 -R
Connecting to host 10.0.0.8, port 5201
Reverse mode, remote host 10.0.0.8 is sending
[  5] local 192.168.4.2 port 54000 connected to 10.0.0.8 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   561 KBytes  4.60 Mbits/sec                  
[  5]   1.00-2.00   sec   611 KBytes  5.00 Mbits/sec                  
[  5]   2.00-3.00   sec   561 KBytes  4.60 Mbits/sec                  
[  5]   3.00-4.00   sec   611 KBytes  5.00 Mbits/sec                  
[  5]   4.00-5.00   sec   600 KBytes  4.91 Mbits/sec                  
[  5]   5.00-6.00   sec   636 KBytes  5.21 Mbits/sec                  
[  5]   6.00-7.00   sec   624 KBytes  5.11 Mbits/sec                  
[  5]   7.00-8.00   sec   629 KBytes  5.16 Mbits/sec                  
[  5]   8.00-9.00   sec   591 KBytes  4.84 Mbits/sec                  
[  5]   9.00-10.00  sec   667 KBytes  5.47 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.04  sec  6.16 MBytes  5.15 Mbits/sec   29             sender
[  5]   0.00-10.00  sec  5.95 MBytes  4.99 Mbits/sec                  receiver

Uploading from client to server also doesn't always crash immediately:

Connecting to host 10.0.0.8, port 5201
[  5] local 192.168.4.2 port 38212 connected to 10.0.0.8 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   843 KBytes  6.90 Mbits/sec    1   53.7 KBytes       
[  5]   1.00-2.00   sec   636 KBytes  5.21 Mbits/sec    8   52.3 KBytes       
[  5]   2.00-3.00   sec   636 KBytes  5.22 Mbits/sec    5   43.8 KBytes       
[  5]   3.00-4.00   sec   509 KBytes  4.17 Mbits/sec    4   35.4 KBytes       
[  5]   4.00-5.00   sec   636 KBytes  5.21 Mbits/sec    0   48.1 KBytes       
[  5]   5.00-6.00   sec   636 KBytes  5.21 Mbits/sec    0   55.1 KBytes       
[  5]   6.00-7.00   sec   573 KBytes  4.69 Mbits/sec    8   46.7 KBytes       
[  5]   7.00-8.00   sec   636 KBytes  5.21 Mbits/sec    6   38.2 KBytes       
[  5]   8.00-9.00   sec   636 KBytes  5.21 Mbits/sec    0   46.7 KBytes 
(CRASH)      
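To compare runs that crash with runs that don't, it can help to reduce each iperf3 transcript to a list of per-second bitrates. The following is a rough sketch that assumes the human-readable iperf3 output format pasted above; the function names are invented for this example, and iperf3's own `--json` output would be a more robust source if available.

```python
import re

# Per-interval line, e.g.
# "[  5]   0.00-1.00   sec   843 KBytes  6.90 Mbits/sec    1   53.7 KBytes"
_INTERVAL = re.compile(
    r"^\[\s*\d+\]\s+[\d.]+-[\d.]+\s+sec\s+[\d.]+\s+[KMG]?Bytes\s+([\d.]+)\s+([KMG]?)bits/sec"
)
# Multipliers that convert the printed unit to Mbit/s.
_SCALE = {"": 1e-6, "K": 1e-3, "M": 1.0, "G": 1e3}


def interval_mbps(lines):
    """Return the bitrate of every per-interval line in Mbit/s, skipping summaries."""
    rates = []
    for line in lines:
        text = line.strip()
        if text.endswith(("sender", "receiver")):
            continue  # end-of-run summary rows, not per-second samples
        match = _INTERVAL.match(text)
        if match:
            rates.append(float(match.group(1)) * _SCALE[match.group(2)])
    return rates


def mean_mbps(lines):
    """Average per-interval bitrate in Mbit/s, or 0.0 if no interval lines were found."""
    rates = interval_mbps(lines)
    return sum(rates) / len(rates) if rates else 0.0


if __name__ == "__main__":
    sample = [
        "[ ID] Interval           Transfer     Bitrate         Retr  Cwnd",
        "[  5]   0.00-1.00   sec   843 KBytes  6.90 Mbits/sec    1   53.7 KBytes",
        "[  5]   1.00-2.00   sec   636 KBytes  5.21 Mbits/sec    8   52.3 KBytes",
    ]
    print(interval_mbps(sample), mean_mbps(sample))
```

Feeding it the transcript of a run that ends in a reset gives the throughput the board sustained up to the crash, which can then be set against runs that complete normally.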
redfast00 commented 1 year ago

Strangely enough, I'm not able to reproduce the issue using the esp-iot-bridge wifi-route example.

redfast00 commented 1 year ago

I tried to reproduce this on a different host network (different upstream Wi-Fi access point) and was not able to reproduce this anymore. Tomorrow, I'll check if I'm still able to reproduce the issue on the original AP.

redfast00 commented 1 year ago

Yes, I'm still able to reproduce this on the original AP. I guess the issue is somehow dependent on the upstream AP.

wujiangang commented 1 year ago

Which AP are you using? I can check whether we have the same or a similar model. The WDT trigger caused by high CPU usage from Wi-Fi and TCP/IP operations could be due to high throughput. Can you put this AP behind a wall to weaken the signal? This may reduce the throughput, and I want to know whether the WDT is still triggered in that situation.

redfast00 commented 1 year ago

We're using an ASUS RT-AC58U, with OpenWRT flashed on it (OpenWrt 22.03.2 r19803-9a599fee93 / LuCI openwrt-22.03 branch git-22.288.45147-96ec0cd).

It's hard to put the AP behind a wall in our current setup, but I tried to do something else to attenuate the signal: when I remove the antenna from the ESP32S2 (it's an ESP32S2 with an external antenna), I get a lower throughput when using iperf (only 3-4 Mbps instead of the 4-5 Mbps) and the watchdog crash does not occur.

wujiangang commented 1 year ago

Thank you for your feedback. We will reproduce this issue in a shielded box to achieve a high throughput for testing. @tswen Please test and give some analysis.

tswen commented 1 year ago

Sure, I will try to see if I can reproduce the issue. In addition @redfast00, do you have an ESP32 or any other development board there? Have you tried using a different series of development boards to test? Because a previous client encountered a similar issue on an ESP32 but was able to resolve it by using an internally fixed library esp32_314a0864.zip based on v5.0.1. I'm not sure if these two issues are the same or if they are related to different chips.

redfast00 commented 1 year ago

I'm afraid I don't anymore. All my ESP32s are in deployed projects; I only have ESP32-S2 dev boards with external antennas left.

Edit: specifically the ESP32-S2-Saola-1 with the ESP32-S2-WROOM-I

redfast00 commented 1 year ago

@tswen @wujiangang have you been able to reproduce this issue?

tswen commented 1 year ago

esp32s2_85204a31_v5.0.1.zip Can you please try using this library again? It is based on idf/v5.0.1 and includes some fixes. Could you kindly verify if the issue persists with this version?

tswen commented 1 year ago

Hello, can you still reproduce the issue after switching to the new Wi-Fi library?

redfast00 commented 1 year ago

Thank you for the updated library. Unfortunately, I have not had any time to test it yet. I still want to get this working, but it will likely be after the beginning of September.