esp-rs / esp-wifi-sys

Wi-Fi and BT drivers packaged for integration into bare-metal esp-wifi.
Apache License 2.0

[ESP32] esp-wifi crash if used with embassy and any task on the second core #412

Closed liebman closed 7 months ago

liebman commented 10 months ago

On the ESP32, the firmware crashes if any embassy task is started on the second core.

Code to reproduce is in this gist.

It crashes with:

rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:7104
load:0x40078000,len:15576
load:0x40080400,len:4
ho 8 tail 4 room 4
load:0x40080404,len:3876
entry 0x4008064c
I (31) boot: ESP-IDF v5.1-beta1-378-gea5e0ff298-dirt 2nd stage bootloader
I (31) boot: compile time Jun  7 2023 07:48:23
I (33) boot: Multicore bootloader
I (37) boot: chip revision: v0.0
I (41) boot.esp32: SPI Speed      : 40MHz
I (46) boot.esp32: SPI Mode       : DIO
I (50) boot.esp32: SPI Flash Size : 4MB
I (55) boot: Enabling RNG early entropy source...
I (60) boot: Partition Table:
I (64) boot: ## Label            Usage          Type ST Offset   Length
I (71) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (78) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (86) boot:  2 factory          factory app      00 00 00010000 003f0000
I (93) boot: End of partition table
I (98) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=17bf4h ( 97268) map
I (141) esp_image: segment 1: paddr=00027c1c vaddr=3ffb0000 size=00e0ch (  3596) load
I (143) esp_image: segment 2: paddr=00028a30 vaddr=3ffbcd04 size=00190h (   400) load
I (147) esp_image: segment 3: paddr=00028bc8 vaddr=40080000 size=07450h ( 29776) load
I (168) esp_image: segment 4: paddr=00030020 vaddr=400d0020 size=581c0h (360896) map
I (298) esp_image: segment 5: paddr=000881e8 vaddr=40087450 size=0398ch ( 14732) load
I (310) boot: Loaded app from partition at offset 0x10000
I (310) boot: Disabling RNG early entropy source...
Starting enable_disable_led() on core 0
Sending LED on
Starting control_led() on core 1
LED on
start connection task
Device capabilities: Ok(EnumSet(Client))
making wifi client config
set wifi configuration

Exception occured 'StoreProhibited'
Context
PC=0x400ff616       PS=0x00060510
A0=0x800f7caa       A1=0x3ffb62d0       A2=0x00000003       A3=0x3ffbb134       A4=0x00000000
A5=0x3ffe43c8       A6=0x00000001       A7=0x3ffdb8e0       A8=0x800ff5c4       A9=0x3ffb62b0
A10=0x00000000      A11=0x3ffe43ec      A12=0x00000003      A13=0x3f414309      A14=0x00000000
SAR=00000008
EXCCAUSE=0x0000001d EXCVADDR=0x00000003
LBEG=0x40001609     LEND=0x4000160d     LCOUNT=0x00000000
THREADPTR=0x00000000
SCOMPARE1=0x00000001
BR=0x00000000
ACCLO=0x00000000    ACCHI=0x00000000
M0=0x00000000       M1=0x00000000       M2=0x00000000       M3=0x00000000
F64R_LO=0x00000000  F64R_HI=0x00000000  F64S=0x00000000
FCR=0x00000000      FSR=0x00000000
F0=0x00000000       F1=0x00000000       F2=0x00000000       F3=0x00000000       F4=0x00000000
F5=0x00000000       F6=0x00000000       F7=0x00000000       F8=0x00000000       F9=0x00000000
F10=0x00000000      F11=0x00000000      F12=0x00000000      F13=0x00000000      F14=0x00000000
F15=0x00000000

0x400fa22c
0x400fa22c - wifi_set_config_process
    at ??:??
0x400f7bcd
0x400f7bcd - ieee80211_ioctl_process
    at ??:??
0x400838d4
0x400838d4 - ppTask
    at ??:??
0x400e757c
0x400e757c - core::sync::atomic::atomic_load
    at /Users/chris.l/.rustup/toolchains/esp/lib/rustlib/src/rust/library/core/src/sync/atomic.rs:3288
0x40000000
0x40000000 - _ZN8esp_wifi9HEAP_DATA17h68314300b948c320E
    at ??:??
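
For readers decoding the dump above: a minimal sketch, not part of the original report, of reading the `EXCCAUSE`/`EXCVADDR` line. The function names `exccause_name` and `parse_exc_line` are hypothetical, and the table covers only a small subset of the Xtensa exception causes. Cause 0x1d (29) is `StoreProhibited`, and the faulting address 0x00000003 equals the value shown for A2 in the same dump, which suggests (but does not prove) a store through a bad pointer held in A2.

```rust
/// Maps a few Xtensa EXCCAUSE codes to their names.
/// Deliberately incomplete: only a small subset is listed here.
fn exccause_name(code: u32) -> &'static str {
    match code {
        0 => "IllegalInstruction",
        6 => "IntegerDivideByZero",
        9 => "LoadStoreAlignment",
        20 => "InstFetchProhibited",
        28 => "LoadProhibited",
        29 => "StoreProhibited",
        _ => "Unknown (not in this subset)",
    }
}

/// Parses a line such as "EXCCAUSE=0x0000001d EXCVADDR=0x00000003"
/// and returns (cause, faulting address), or None if either field
/// is missing or is not valid hexadecimal.
fn parse_exc_line(line: &str) -> Option<(u32, u32)> {
    let mut cause = None;
    let mut vaddr = None;
    for field in line.split_whitespace() {
        if let Some((key, value)) = field.split_once('=') {
            let hex = value.trim_start_matches("0x");
            let parsed = u32::from_str_radix(hex, 16).ok()?;
            match key {
                "EXCCAUSE" => cause = Some(parsed),
                "EXCVADDR" => vaddr = Some(parsed),
                _ => {}
            }
        }
    }
    Some((cause?, vaddr?))
}
```

Feeding the line from the dump into `parse_exc_line` and passing the first element to `exccause_name` reproduces the `StoreProhibited` label printed by the exception handler.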

bjoernQ commented 10 months ago

Interestingly, it works fine on the ESP32-S3.

bjoernQ commented 10 months ago

Even running just an empty loop on the second core makes it crash. Also, if the second core runs only for a very short time before anything else starts, it still crashes later.

bjoernQ commented 9 months ago

This is really weird: I can make the crash go away by commenting out https://github.com/esp-rs/esp-wifi/blob/ce4264907916bc3c1a3406075c9d3c0e05cb89d3/esp-wifi/src/lib.rs#L144, i.e. by NOT placing the Wi-Fi heap in the dram2 segment.

While this solves the issue, I have no idea why. @MabezDev, any idea what might be special about dram2_segment?

bjoernQ commented 9 months ago

I tried something: in the esp-hal multicore example I zero all bytes in dram2_segment, run the code, and then check whether those bytes are still zero. They are not. There are at least ~16k of non-zero bytes at the beginning of that segment.

Unfortunately, just letting dram2_segment start at a 16k higher address doesn't solve the problem here.
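
The zero-then-check experiment described above can be sketched roughly as follows. This is an illustrative, host-runnable version and not the code that was actually run: `scan_nonzero` is a hypothetical helper, and on the target the slice would have to be built from the real bounds of dram2_segment (for example from linker symbols, which is an assumption here).

```rust
/// Scans a memory region that was previously zeroed and reports
/// (number of non-zero bytes, offset one past the last non-zero byte).
/// A result of (0, 0) means nothing wrote into the region.
fn scan_nonzero(region: &[u8]) -> (usize, usize) {
    // Count every byte that no longer holds zero.
    let count = region.iter().filter(|&&b| b != 0).count();
    // Find where the dirtied area ends, searching from the back.
    let end = region
        .iter()
        .rposition(|&b| b != 0)
        .map_or(0, |last| last + 1);
    (count, end)
}

/// Host-side stand-in for the experiment: a zeroed buffer in which
/// "something else" dirties part of the first 16 KiB.
fn simulated_segment() -> Vec<u8> {
    let mut region = vec![0u8; 64 * 1024];
    for byte in region[..16 * 1024].iter_mut().step_by(4) {
        *byte = 0xAA;
    }
    region
}
```

The second value returned by `scan_nonzero` gives a quick estimate of how far into the segment the unexpected writes reach, which is the kind of figure the "~16k" observation refers to.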

MabezDev commented 7 months ago

@liebman When you get a chance, could you test the linked PR and see if it fixes your issue?

liebman commented 7 months ago

Still crashes :-(

rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:7104
0x3fff0030 - esp_wifi::HEAP_DATA
    at ??:??
load:0x40078000,len:15576
load:0x40080400,len:4
ho 8 tail 4 room 4
load:0x40080404,len:3876
entry 0x4008064c
0x4008064c - core::fmt::Arguments::new_v1
    at /Users/chris.l/.rustup/toolchains/esp/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:335
I (31) boot: ESP-IDF v5.1-beta1-378-gea5e0ff298-dirt 2nd stage bootloader
I (31) boot: compile time Jun  7 2023 07:48:23
I (33) boot: Multicore bootloader
I (37) boot: chip revision: v0.0
I (41) boot.esp32: SPI Speed      : 40MHz
I (45) boot.esp32: SPI Mode       : DIO
I (50) boot.esp32: SPI Flash Size : 4MB
I (55) boot: Enabling RNG early entropy source...
I (60) boot: Partition Table:
I (63) boot: ## Label            Usage          Type ST Offset   Length
I (71) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (78) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (86) boot:  2 factory          factory app      00 00 00010000 003f0000
I (93) boot: End of partition table
I (97) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=186a8h (100008) map
I (142) esp_image: segment 1: paddr=000286d0 vaddr=3ffb0000 size=00e14h (  3604) load
I (144) esp_image: segment 2: paddr=000294ec vaddr=3ffbcffc size=00190h (   400) load
I (148) esp_image: segment 3: paddr=00029684 vaddr=40080000 size=06994h ( 27028) load
I (168) esp_image: segment 4: paddr=00030020 vaddr=400d0020 size=57d04h (359684) map
I (298) esp_image: segment 5: paddr=00087d2c vaddr=40086994 size=06304h ( 25348) load
I (315) boot: Loaded app from partition at offset 0x10000
I (315) boot: Disabling RNG early entropy source...
Starting enable_disable_led() on core 0
Sending LED on
Starting control_led() on core 1
LED on
start connection task
Device capabilities: Ok(EnumSet(Client))
making wifi client config
set wifi configuration

Exception occured 'StoreProhibited'
Context
PC=0x400ff926       PS=0x00060510
0x400ff926 - wifi_nvs_set
    at ??:??
A0=0x800f79a1       A1=0x3ffb6040       A2=0xffff0003       A3=0x3ffbb464       A4=0x00000000
0x3ffb6040 - esp_wifi::preempt::TASK_STACK
    at ??:??
0x3ffbb464 - s_wifi_nvs
    at ??:??
A5=0x3ffe43c8       A6=0x00000001       A7=0x00000000       A8=0x800ff8d4       A9=0x3ffb6020
0x3ffe43c8 - esp_wifi::HEAP_DATA
    at ??:??
0x3ffb6020 - esp_wifi::preempt::TASK_STACK
    at ??:??
A10=0x00000000      A11=0x00000001      A12=0x800e7231      A13=0x3f414d55      A14=0x00000000
A15=0x00060500
SAR=00000008
EXCCAUSE=0x0000001d EXCVADDR=0xffff0003
LBEG=0x4000c349     LEND=0x4000c36b     LCOUNT=0x00000000
THREADPTR=0x00000000
SCOMPARE1=0x00000100
BR=0x00000000
ACCLO=0x00000000    ACCHI=0x00000000
M0=0x00000000       M1=0x00000000       M2=0x00000000       M3=0x00000000
F64R_LO=0x00000000  F64R_HI=0x00000000  F64S=0x00000000
FCR=0x00000000      FSR=0x00000000
F0=0x00000000       F1=0x00000000       F2=0x00000000       F3=0x00000000       F4=0x00000000
F5=0x00000000       F6=0x00000000       F7=0x00000000       F8=0x00000000       F9=0x00000000
F10=0x00000000      F11=0x00000000      F12=0x00000000      F13=0x00000000      F14=0x00000000
F15=0x00000000

0x400fa018
wifi_set_config_process
    at ??:??
0x400f78b9
ieee80211_ioctl_process
    at ??:??
0x40083970
ppTask
    at ??:??
0x400e6480
core::sync::atomic::atomic_load
    at /Users/chris.l/.rustup/toolchains/esp/lib/rustlib/src/rust/library/core/src/sync/atomic.rs:3288
0x40000000

liebman commented 7 months ago

I've updated the gist so that it's based on the updated esp-hal & esp-wifi.

MabezDev commented 7 months ago

@liebman There hasn't been a release since it was fixed. Could you try from git main? Sorry, I wasn't clear!

liebman commented 7 months ago

esp-wifi fails to compile with esp-hal git main:

error: could not compile `esp-wifi` (lib) due to 3 previous errors
warning: build failed, waiting for other jobs to finish...
error[E0432]: unresolved import `hal::Rng`
  --> /Users/chris.l/.cargo/registry/src/index.crates.io-6f17d22bba15001f/esp-wifi-0.4.0/src/common_adapter/mod.rs:16:5
   |
16 | use hal::Rng;
   |     ^^^^^^^^ no `Rng` in the root
   |
help: a similar name exists in the module
   |
16 | use hal::rng;
   |          ~~~
help: consider importing one of these items instead
   |
16 | use crate::hal::rng::Rng;
   |     ~~~~~~~~~~~~~~~~~~~~
16 | use esp_hal::rng::Rng;
   |     ~~~~~~~~~~~~~~~~~

error[E0412]: cannot find type `Rng` in crate `hal`
   --> /Users/chris.l/.cargo/registry/src/index.crates.io-6f17d22bba15001f/esp-wifi-0.4.0/src/lib.rs:238:15
    |
238 |     rng: hal::Rng,
    |               ^^^ not found in `hal`
    |
help: consider importing one of these items
    |
17  + use crate::hal::rng::Rng;
    |
17  + use esp_hal::rng::Rng;
    |
help: if you import `Rng`, refer to it directly
    |
238 -     rng: hal::Rng,
238 +     rng: Rng,
    |

error: aborting due to 2 previous errors

MabezDev commented 7 months ago

Ah, I forgot about those breaking changes. Well, I think I've solved this regardless, but we can test again with the next release :).