esp-rs / esp-wifi-sys

Wi-Fi and BT drivers packaged for integration into bare-metal esp-wifi.

[ESP32] esp-wifi crash if used with embassy and any task on the second core #412

Closed: liebman closed this issue 6 months ago

liebman commented 8 months ago

On ESP32, esp-wifi crashes if any embassy task is started on the second core.

Code to reproduce is in this gist.

It crashes with:

rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:7104
load:0x40078000,len:15576
load:0x40080400,len:4
ho 8 tail 4 room 4
load:0x40080404,len:3876
entry 0x4008064c
I (31) boot: ESP-IDF v5.1-beta1-378-gea5e0ff298-dirt 2nd stage bootloader
I (31) boot: compile time Jun  7 2023 07:48:23
I (33) boot: Multicore bootloader
I (37) boot: chip revision: v0.0
I (41) boot.esp32: SPI Speed      : 40MHz
I (46) boot.esp32: SPI Mode       : DIO
I (50) boot.esp32: SPI Flash Size : 4MB
I (55) boot: Enabling RNG early entropy source...
I (60) boot: Partition Table:
I (64) boot: ## Label            Usage          Type ST Offset   Length
I (71) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (78) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (86) boot:  2 factory          factory app      00 00 00010000 003f0000
I (93) boot: End of partition table
I (98) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=17bf4h ( 97268) map
I (141) esp_image: segment 1: paddr=00027c1c vaddr=3ffb0000 size=00e0ch (  3596) load
I (143) esp_image: segment 2: paddr=00028a30 vaddr=3ffbcd04 size=00190h (   400) load
I (147) esp_image: segment 3: paddr=00028bc8 vaddr=40080000 size=07450h ( 29776) load
I (168) esp_image: segment 4: paddr=00030020 vaddr=400d0020 size=581c0h (360896) map
I (298) esp_image: segment 5: paddr=000881e8 vaddr=40087450 size=0398ch ( 14732) load
I (310) boot: Loaded app from partition at offset 0x10000
I (310) boot: Disabling RNG early entropy source...
Starting enable_disable_led() on core 0
Sending LED on
Starting control_led() on core 1
LED on
start connection task
Device capabilities: Ok(EnumSet(Client))
making wifi client config
set wifi configuration

Exception occured 'StoreProhibited'
Context
PC=0x400ff616       PS=0x00060510
A0=0x800f7caa       A1=0x3ffb62d0       A2=0x00000003       A3=0x3ffbb134       A4=0x00000000
A5=0x3ffe43c8       A6=0x00000001       A7=0x3ffdb8e0       A8=0x800ff5c4       A9=0x3ffb62b0
A10=0x00000000      A11=0x3ffe43ec      A12=0x00000003      A13=0x3f414309      A14=0x00000000
SAR=00000008
EXCCAUSE=0x0000001d EXCVADDR=0x00000003
LBEG=0x40001609     LEND=0x4000160d     LCOUNT=0x00000000
THREADPTR=0x00000000
SCOMPARE1=0x00000001
BR=0x00000000
ACCLO=0x00000000    ACCHI=0x00000000
M0=0x00000000       M1=0x00000000       M2=0x00000000       M3=0x00000000
F64R_LO=0x00000000  F64R_HI=0x00000000  F64S=0x00000000
FCR=0x00000000      FSR=0x00000000
F0=0x00000000       F1=0x00000000       F2=0x00000000       F3=0x00000000       F4=0x00000000
F5=0x00000000       F6=0x00000000       F7=0x00000000       F8=0x00000000       F9=0x00000000
F10=0x00000000      F11=0x00000000      F12=0x00000000      F13=0x00000000      F14=0x00000000
F15=0x00000000

0x400fa22c
0x400fa22c - wifi_set_config_process
    at ??:??
0x400f7bcd
0x400f7bcd - ieee80211_ioctl_process
    at ??:??
0x400838d4
0x400838d4 - ppTask
    at ??:??
0x400e757c
0x400e757c - core::sync::atomic::atomic_load
    at /Users/chris.l/.rustup/toolchains/esp/lib/rustlib/src/rust/library/core/src/sync/atomic.rs:3288
0x40000000
0x40000000 - _ZN8esp_wifi9HEAP_DATA17h68314300b948c320E
    at ??:??
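
For readers decoding the register dump above: `EXCCAUSE=0x0000001d` is decimal 29, which on Xtensa is the store-prohibited cause and matches the `'StoreProhibited'` line, while `EXCVADDR` is the address the faulting access targeted (here `0x00000003`, the same value as `A2`). The following is a minimal sketch of that decoding, not part of esp-wifi or esp-backtrace. The function names `exccause_name` and `parse_register` are hypothetical, the cause table is a small subset, and the labels are descriptive rather than taken from any particular crate.

```rust
/// Map an Xtensa EXCCAUSE value to a descriptive label.
/// Only a small subset of causes is listed; this table is illustrative.
pub fn exccause_name(cause: u32) -> &'static str {
    match cause {
        0 => "IllegalInstruction",
        6 => "IntegerDivideByZero",
        9 => "LoadStoreAlignment",
        28 => "LoadProhibited",
        29 => "StoreProhibited",
        _ => "other (see the Xtensa ISA reference)",
    }
}

/// Parse one token of the register dump, e.g. "EXCCAUSE=0x0000001d",
/// into the register name and its value. Values are treated as hex,
/// with or without a leading "0x" (the dump prints SAR without it).
pub fn parse_register(token: &str) -> Option<(&str, u32)> {
    let (name, value) = token.split_once('=')?;
    let hex = value.trim().trim_start_matches("0x");
    u32::from_str_radix(hex, 16)
        .ok()
        .map(|parsed| (name.trim(), parsed))
}
```

Feeding `"EXCCAUSE=0x0000001d"` to `parse_register` and the resulting value to `exccause_name` yields the store-prohibited label. A faulting address as small as `0x00000003` usually points to a store through a null or near-null pointer, though the log alone does not show where that pointer came from.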

bjoernQ commented 8 months ago

Interestingly, it works fine on ESP32-S3.

bjoernQ commented 8 months ago

Even running just an empty loop on the second core makes it crash. Also, if the second core runs for only a very short time before anything else starts, it still crashes later.

bjoernQ commented 8 months ago

This is really weird - I can stop it from crashing by commenting out https://github.com/esp-rs/esp-wifi/blob/ce4264907916bc3c1a3406075c9d3c0e05cb89d3/esp-wifi/src/lib.rs#L144 - i.e. by NOT placing the wifi heap in dram2_segment.

While this solves the issue I have no idea why - @MabezDev any idea what might be special about dram2_segment?

bjoernQ commented 8 months ago

I tried something: in the esp-hal multicore example I zero all bytes in dram2_segment, run the code, and then check whether those bytes are still zero. They are not: roughly 16k of bytes at the beginning of that segment are non-zero.

Unfortunately, just letting dram2_segment start at an address 16k higher doesn't solve the problem here.
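
The zero-then-check experiment described above can be sketched roughly as follows. This is a host-runnable illustration over a plain byte slice, not the code that was actually run. The names `ScanReport`, `zero_region` and `scan_region` are hypothetical. On the target, the slice would have to be built from the bounds of dram2_segment (for example with `core::slice::from_raw_parts_mut`, which is `unsafe`), and how those bounds are obtained from the linker script is an assumption left out here.

```rust
/// Summary of a scan over a byte region that was expected to stay zeroed.
#[derive(Debug, PartialEq, Eq)]
pub struct ScanReport {
    /// Number of bytes that are no longer zero.
    pub nonzero_bytes: usize,
    /// Offset of the first non-zero byte, if any.
    pub first_dirty: Option<usize>,
    /// Offset of the last non-zero byte, if any.
    pub last_dirty: Option<usize>,
}

/// Step 1: fill the region with zeros before the code under test runs.
pub fn zero_region(region: &mut [u8]) {
    region.fill(0);
}

/// Step 2: after the code under test has run, report which bytes changed.
pub fn scan_region(region: &[u8]) -> ScanReport {
    let mut report = ScanReport {
        nonzero_bytes: 0,
        first_dirty: None,
        last_dirty: None,
    };
    for (offset, &byte) in region.iter().enumerate() {
        if byte != 0 {
            report.nonzero_bytes += 1;
            if report.first_dirty.is_none() {
                report.first_dirty = Some(offset);
            }
            report.last_dirty = Some(offset);
        }
    }
    report
}
```

Recording the first and last dirty offsets, not just the count, shows whether the overwritten bytes form one block at the start of the segment or are scattered through it. That distinction matters when deciding whether moving the segment start can help at all.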

MabezDev commented 6 months ago

@liebman When you get a chance, could you test the linked PR and see if it fixes your issue?

liebman commented 6 months ago

Still crashes :-(

rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:7104
0x3fff0030 - esp_wifi::HEAP_DATA
    at ??:??
load:0x40078000,len:15576
load:0x40080400,len:4
ho 8 tail 4 room 4
load:0x40080404,len:3876
entry 0x4008064c
0x4008064c - core::fmt::Arguments::new_v1
    at /Users/chris.l/.rustup/toolchains/esp/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:335
I (31) boot: ESP-IDF v5.1-beta1-378-gea5e0ff298-dirt 2nd stage bootloader
I (31) boot: compile time Jun  7 2023 07:48:23
I (33) boot: Multicore bootloader
I (37) boot: chip revision: v0.0
I (41) boot.esp32: SPI Speed      : 40MHz
I (45) boot.esp32: SPI Mode       : DIO
I (50) boot.esp32: SPI Flash Size : 4MB
I (55) boot: Enabling RNG early entropy source...
I (60) boot: Partition Table:
I (63) boot: ## Label            Usage          Type ST Offset   Length
I (71) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (78) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (86) boot:  2 factory          factory app      00 00 00010000 003f0000
I (93) boot: End of partition table
I (97) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=186a8h (100008) map
I (142) esp_image: segment 1: paddr=000286d0 vaddr=3ffb0000 size=00e14h (  3604) load
I (144) esp_image: segment 2: paddr=000294ec vaddr=3ffbcffc size=00190h (   400) load
I (148) esp_image: segment 3: paddr=00029684 vaddr=40080000 size=06994h ( 27028) load
I (168) esp_image: segment 4: paddr=00030020 vaddr=400d0020 size=57d04h (359684) map
I (298) esp_image: segment 5: paddr=00087d2c vaddr=40086994 size=06304h ( 25348) load
I (315) boot: Loaded app from partition at offset 0x10000
I (315) boot: Disabling RNG early entropy source...
Starting enable_disable_led() on core 0
Sending LED on
Starting control_led() on core 1
LED on
start connection task
Device capabilities: Ok(EnumSet(Client))
making wifi client config
set wifi configuration

Exception occured 'StoreProhibited'
Context
PC=0x400ff926       PS=0x00060510
0x400ff926 - wifi_nvs_set
    at ??:??
A0=0x800f79a1       A1=0x3ffb6040       A2=0xffff0003       A3=0x3ffbb464       A4=0x00000000
0x3ffb6040 - esp_wifi::preempt::TASK_STACK
    at ??:??
0x3ffbb464 - s_wifi_nvs
    at ??:??
A5=0x3ffe43c8       A6=0x00000001       A7=0x00000000       A8=0x800ff8d4       A9=0x3ffb6020
0x3ffe43c8 - esp_wifi::HEAP_DATA
    at ??:??
0x3ffb6020 - esp_wifi::preempt::TASK_STACK
    at ??:??
A10=0x00000000      A11=0x00000001      A12=0x800e7231      A13=0x3f414d55      A14=0x00000000
A15=0x00060500
SAR=00000008
EXCCAUSE=0x0000001d EXCVADDR=0xffff0003
LBEG=0x4000c349     LEND=0x4000c36b     LCOUNT=0x00000000
THREADPTR=0x00000000
SCOMPARE1=0x00000100
BR=0x00000000
ACCLO=0x00000000    ACCHI=0x00000000
M0=0x00000000       M1=0x00000000       M2=0x00000000       M3=0x00000000
F64R_LO=0x00000000  F64R_HI=0x00000000  F64S=0x00000000
FCR=0x00000000      FSR=0x00000000
F0=0x00000000       F1=0x00000000       F2=0x00000000       F3=0x00000000       F4=0x00000000
F5=0x00000000       F6=0x00000000       F7=0x00000000       F8=0x00000000       F9=0x00000000
F10=0x00000000      F11=0x00000000      F12=0x00000000      F13=0x00000000      F14=0x00000000
F15=0x00000000

0x400fa018
wifi_set_config_process
    at ??:??
0x400f78b9
ieee80211_ioctl_process
    at ??:??
0x40083970
ppTask
    at ??:??
0x400e6480
core::sync::atomic::atomic_load
    at /Users/chris.l/.rustup/toolchains/esp/lib/rustlib/src/rust/library/core/src/sync/atomic.rs:3288
0x40000000

liebman commented 6 months ago

I've updated the gist so that it's based on the updated esp-hal & esp-wifi.

MabezDev commented 6 months ago

@liebman there hasn't been a release since it was fixed. Could you try from git main? Sorry, I wasn't clear!

liebman commented 6 months ago

esp-wifi fails to compile with esp-hal git main:

error: could not compile `esp-wifi` (lib) due to 3 previous errors
warning: build failed, waiting for other jobs to finish...
error[E0432]: unresolved import `hal::Rng`
  --> /Users/chris.l/.cargo/registry/src/index.crates.io-6f17d22bba15001f/esp-wifi-0.4.0/src/common_adapter/mod.rs:16:5
   |
16 | use hal::Rng;
   |     ^^^^^^^^ no `Rng` in the root
   |
help: a similar name exists in the module
   |
16 | use hal::rng;
   |          ~~~
help: consider importing one of these items instead
   |
16 | use crate::hal::rng::Rng;
   |     ~~~~~~~~~~~~~~~~~~~~
16 | use esp_hal::rng::Rng;
   |     ~~~~~~~~~~~~~~~~~

error[E0412]: cannot find type `Rng` in crate `hal`
   --> /Users/chris.l/.cargo/registry/src/index.crates.io-6f17d22bba15001f/esp-wifi-0.4.0/src/lib.rs:238:15
    |
238 |     rng: hal::Rng,
    |               ^^^ not found in `hal`
    |
help: consider importing one of these items
    |
17  + use crate::hal::rng::Rng;
    |
17  + use esp_hal::rng::Rng;
    |
help: if you import `Rng`, refer to it directly
    |
238 -     rng: hal::Rng,
238 +     rng: Rng,
    |

error: aborting due to 2 previous errors

MabezDev commented 6 months ago

Ah, I forgot about those breaking changes. Well, I think I've solved this regardless, but we can test again after the next release :).