atomvm / AtomVM

Tiny Erlang VM
https://www.atomvm.net
Apache License 2.0

ESP32-S3 beta0 release img stack overflows, when empty/no avm is flashed. #1059

Closed petermm closed 7 months ago

petermm commented 8 months ago

On an erased device, flashing the release image for the S3 and booting it (i.e. NOT flashing an application avm of your own) results in repeated stack overflows:

AtomVM init.
I (5918) sys: Loaded BEAM partition main.avm at address 0x210000 (size=1048576 bytes)
Failed app start: invalid_avm.
Starting network...
I (6068) network_driver: AP ssid: AtomVM-ESP32
I (6068) network_driver: AP authmode: 4
I (6068) network_driver: AP ssid_hidden: 0
I (6068) network_driver: AP max_connection: 4
I (6078) pp: pp rom version: e7ae62f
I (6078) net80211: net80211 rom version: e7ae62f

***ERROR*** A stack overflow in task pthread has been detected.

Backtrace: 0x40375836:0x3fcb2450 0x4037e9d9:0x3fcb2470 0x4038125a:0x3fcb2490 0x403800f6:0x3fcb2510 0x4038135c:0x3fcb2530 0x40381352:0xa5a5a5a5 |<-CORRUPTED
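The 0xa5a5a5a5 value at the end of the backtrace is consistent with the 0xa5 byte that FreeRTOS uses to pre-fill task stacks, which is also the basis of its stack high-water-mark measurement. A minimal host-side sketch of that idea follows. It is not the FreeRTOS implementation, and `paint_stack` and `unused_stack_bytes` are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill byte used to pre-paint stacks (FreeRTOS uses 0xa5). */
#define STACK_FILL_BYTE 0xA5u

/* Hypothetical helper: paint a whole stack buffer with the fill pattern. */
void paint_stack(uint8_t *stack, size_t size)
{
    memset(stack, STACK_FILL_BYTE, size);
}

/*
 * Hypothetical helper: count how many bytes at the low end of the buffer
 * still hold the fill pattern. Assuming the stack grows downwards, this is
 * the amount that was never used; 0 means the stack was fully consumed.
 */
size_t unused_stack_bytes(const uint8_t *stack, size_t size)
{
    size_t n = 0;
    while (n < size && stack[n] == STACK_FILL_BYTE) {
        n++;
    }
    return n;
}
```

If a saved register slot that was never written is read back as a stack pointer, it would show up as 0xa5a5a5a5, which may be why the backtrace is marked CORRUPTED here.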

Replicate on device:

Erase and flash an S3 device (e.g. using https://petermm.github.io/atomvm-web-tools/), then connect to the console and watch it crash.

Replicate on wokwi:

Download the S3 release image, go to a wokwi S3 board (e.g. https://wokwi.com/projects/390468884528509953), press F1, choose "Upload Firmware and Start simulation" and upload the S3 image. It will crash (stop the sim afterwards, as it will crash loop).

Local build replication:

The crash only shows up if one copies the release defaults in and builds with them: https://github.com/atomvm/AtomVM/blob/main/src/platforms/esp32/sdkconfig.release-defaults (they are copied in only on release GH builds).

CONFIG_COMPILER_OPTIMIZATION_PERF=y
CONFIG_COMPILER_OPTIMIZATION_ASSERTIONS_SILENT=y 

Guess:

Some bug in the code path for invalid_avm that only surfaces at -O2.

pguyot commented 8 months ago

I wouldn't personally write it is due to the sdkconfig but rather that a workaround is to edit the sdkconfig.

I believe it is a deeper bug.

petermm commented 8 months ago

> I wouldn't personally write it is due to the sdkconfig but rather that a workaround is to edit the sdkconfig.
>
> I believe it is a deeper bug.

agreed, updated issue.

bettio commented 8 months ago

The crash happens inside esp-idf when calling esp_wifi_init(), and it can be reproduced by enabling CONFIG_COMPILER_OPTIMIZATION_PERF=y.

It can be fixed by raising the default pthread stack size (currently 3072) to the value used for the main task, i.e. CONFIG_PTHREAD_TASK_STACK_SIZE_DEFAULT=3584, but even a smaller value (such as 3200) works.

I think that in most firmware out there esp_wifi_init() is called from main, so it has likely been tested mostly with a 3584-byte stack.

I have 2 hypotheses about this bug:

Changing sdkconfig works too, but I don't think that increasing the default size for all threads is the right thing to do.
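One way to avoid raising the default for every thread is to request a larger stack only for the thread that ends up calling esp_wifi_init(). The following is a minimal sketch using the standard POSIX `pthread_attr_setstacksize` call, assuming the thread is one we create ourselves. `run_with_stack_size` and `mark_done` are hypothetical names, not AtomVM code, and `mark_done` only stands in for the real work.

```c
#include <limits.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the work that needs the bigger stack. */
void *mark_done(void *arg)
{
    *(int *) arg = 1;
    return NULL;
}

/*
 * Hypothetical helper: run fn(arg) on a new thread that gets its own
 * stack size, then wait for it. Returns 0 on success or the error
 * number reported by the failing pthread call.
 */
int run_with_stack_size(void *(*fn)(void *), void *arg, size_t stack_size)
{
    pthread_attr_t attr;
    pthread_t thread;

    int err = pthread_attr_init(&attr);
    if (err != 0) {
        return err;
    }

#ifdef PTHREAD_STACK_MIN
    /* Some systems (e.g. desktop Linux) reject stacks below a minimum,
       so clamp the request; on a small target this may be a no-op. */
    if (stack_size < (size_t) PTHREAD_STACK_MIN) {
        stack_size = (size_t) PTHREAD_STACK_MIN;
    }
#endif

    err = pthread_attr_setstacksize(&attr, stack_size);
    if (err == 0) {
        err = pthread_create(&thread, &attr, fn, arg);
    }
    pthread_attr_destroy(&attr);
    if (err != 0) {
        return err;
    }
    return pthread_join(thread, NULL);
}
```

On the device the call might look like `run_with_stack_size(worker, NULL, 3584)`, where `worker` is whatever function reaches esp_wifi_init(). Whether AtomVM can control the creation of that particular thread is an assumption of this sketch.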