espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0

stack/heap memory corruption? (IDFGH-12726) #13709

Open hyuan-kamuda opened 2 weeks ago

hyuan-kamuda commented 2 weeks ago

Answers checklist.

IDF version.

v5.2.1

Espressif SoC revision.

Chip rev: v1.0

Operating System used.

Windows

How did you build your project?

VS Code IDE

If you are using Windows, please specify command line type.

None

Development Kit.

ESP32-LyraTD-MSC(ESP32-WROVER-E)

Power Supply used.

USB

What is the expected behavior?

It should have no stack overflow or heap corruption.

What is the actual behavior?

The behavior changes with the buffer size definition (#define MAX_HTTP_OUTPUT_BUFFER set to 512, 1024, or 2048):

512: normal
1024: stack overflow
2048: heap corruption

Steps to reproduce.

This is a very simple HTTPS application, copied and modified from the IDF sample application "esp_http_client". The app_main code is copied below. The FreeRTOS task is created with a stack size of 8192 (8K); should that be enough?

    //#define MAX_HTTP_OUTPUT_BUFFER 512
    //#define MAX_HTTP_OUTPUT_BUFFER 1024
    #define MAX_HTTP_OUTPUT_BUFFER 2048

    void talk_to_server(void)
    {
        char local_response_buffer[MAX_HTTP_OUTPUT_BUFFER + 1] = {0};
        esp_http_client_config_t config = {
            .url = SERVER_WEB_URL,
            .event_handler = _http_event_handler,
            .crt_bundle_attach = esp_crt_bundle_attach,
            .user_data = local_response_buffer,
            //.method = HTTP_METHOD_POST,
        };

        esp_http_client_handle_t client = esp_http_client_init(&config);

        ...
        ...

        esp_http_client_cleanup(client);
    }

    static void app_task(void *pvparameters)
    {
        talk_to_server();
        ESP_LOGI(AITALKTAG, "Finish talking to server.");
        for (int countdown = 10; countdown >= 0; countdown--) {
            ESP_LOGI(TALKTAG, "%d...", countdown);
            vTaskDelay(1000 / portTICK_PERIOD_MS);
        }
        vTaskDelete(NULL);
    }

    void app_main(void)
    {
        //Initialize NVS
        esp_err_t ret = nvs_flash_init();
        if (ret == ESP_ERR_NVS_NO_FREE_PAGES || ret == ESP_ERR_NVS_NEW_VERSION_FOUND) {
            ESP_ERROR_CHECK(nvs_flash_erase());
            ret = nvs_flash_init();
        }
        ESP_ERROR_CHECK(ret);

        if (ESP_OK != connect_wifi()) {
            ESP_LOGE(TALKTAG, "Connect to Wifi is fail, aborting...");
        } else {
            ESP_LOGI(TALKTAG, "Connected to Wifi, starting the application...");
            xTaskCreate(&app_task, "app_task", 8192, NULL, 5, NULL);
        }
    }

With #define MAX_HTTP_OUTPUT_BUFFER 512, the application runs smoothly and receives the server's response. With #define MAX_HTTP_OUTPUT_BUFFER 1024, a stack overflow happens (see the debug log below). With #define MAX_HTTP_OUTPUT_BUFFER 2048, heap corruption happens.

Debug Logs.

If the macro MAX_HTTP_OUTPUT_BUFFER is defined as 1024, a stack overflow happens:

***ERROR*** A stack overflow in task talk_to_server has been detected.

Backtrace: 0x40081796:0x3ffcdf00 0x40089091:0x3ffcdf20 0x40089f8a:0x3ffcdf40 0x4008b273:0x3ffcdfc0 0x4008a094:0x3ffcdfe0 0x4008a046:0x00000000 |<-CORRUPTED
0x40081796: panic_abort at C:/Users/hyuan/esp/esp-idf/components/esp_system/panic.c:472
0x40089091: esp_system_abort at C:/Users/hyuan/esp/esp-idf/components/esp_system/port/esp_system_chip.c:93
0x40089f8a: vApplicationStackOverflowHook at C:/Users/hyuan/esp/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:553
0x4008b273: vTaskSwitchContext at C:/Users/hyuan/esp/esp-idf/components/freertos/FreeRTOS-Kernel/tasks.c:3630 (discriminator 7)
0x4008a094: _frxt_dispatch at C:/Users/hyuan/esp/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/portasm.S:451
0x4008a046: _frxt_int_exit at C:/Users/hyuan/esp/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/portasm.S:246
==========================================================================================

Build output (DRAM/IRAM usage) and boot log with #define MAX_HTTP_OUTPUT_BUFFER 512:

Total sizes:
Used static DRAM: 33852 bytes ( 146884 remain, 18.7% used)
.data size: 15348 bytes
.bss size: 18504 bytes
Used static IRAM: 91606 bytes ( 39466 remain, 69.9% used)
.text size: 90579 bytes
.vectors size: 1027 bytes
Used Flash size : 847223 bytes
.text: 623595 bytes
.rodata: 223372 bytes
Total image size: 954177 bytes (.bin may be padded larger)

I (492) cpu_start: ESP-IDF: v5.2.1
I (497) cpu_start: Min chip rev: v0.0
I (502) cpu_start: Max chip rev: v3.99
I (507) cpu_start: Chip rev: v1.0
I (512) heap_init: Initializing. RAM available for dynamic allocation:
I (519) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (525) heap_init: At 3FFB8440 len 00027BC0 (158 KiB): DRAM
I (531) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (537) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (544) heap_init: At 400965D8 len 00009A28 (38 KiB): IRAM
I (552) spi_flash: detected chip: generic
I (555) spi_flash: flash io: dio
I (560) main_task: Started on CPU0
I (570) main_task: Calling app_main()

Build output (DRAM/IRAM usage) and boot log with #define MAX_HTTP_OUTPUT_BUFFER 1024:

Total sizes:
Used static DRAM: 33852 bytes ( 146884 remain, 18.7% used)
.data size: 15348 bytes
.bss size: 18504 bytes
Used static IRAM: 91606 bytes ( 39466 remain, 69.9% used)
.text size: 90579 bytes
.vectors size: 1027 bytes
Used Flash size : 847179 bytes
.text: 623551 bytes
.rodata: 223372 bytes
Total image size: 954133 bytes (.bin may be padded larger)

I (492) cpu_start: ESP-IDF: v5.2.1
I (497) cpu_start: Min chip rev: v0.0
I (502) cpu_start: Max chip rev: v3.99
I (507) cpu_start: Chip rev: v1.0
I (512) heap_init: Initializing. RAM available for dynamic allocation:
I (519) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (525) heap_init: At 3FFB8440 len 00027BC0 (158 KiB): DRAM
I (531) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (537) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (544) heap_init: At 400965D8 len 00009A28 (38 KiB): IRAM
I (551) spi_flash: detected chip: generic
I (554) spi_flash: flash io: dio
I (559) main_task: Started on CPU0
I (569) main_task: Calling app_main()

More Information.

A further experiment (using the heap) suggests the problem is related to memory allocation. See the code change in talk_to_server below.

The local variable local_response_buffer in talk_to_server is changed to be allocated from the heap, and the free heap size is printed before and after the allocation:

    #define MAX_HTTP_OUTPUT_BUFFER 2048

    void talk_to_server(void)
    {
        char *local_response_buffer = NULL;
        int heapleft = heap_caps_get_free_size(MALLOC_CAP_8BIT);
        ESP_LOGI(TALKTAG, "free heap before malloc :\n%d ", heapleft);
        local_response_buffer = malloc(MAX_HTTP_OUTPUT_BUFFER + 1);
        heapleft = heap_caps_get_free_size(MALLOC_CAP_8BIT);
        ESP_LOGI(TALKTAG, "free heap after malloc :\n%d ", heapleft);
        if (NULL != local_response_buffer) {
            esp_http_client_config_t config = {
                .url = SERVER_WEB_URL,
                .event_handler = _http_event_handler,
                .crt_bundle_attach = esp_crt_bundle_attach,
                .user_data = local_response_buffer,
                //.method = HTTP_METHOD_POST,
            };

            esp_http_client_handle_t client = esp_http_client_init(&config);

            ...
            //

            ...

            esp_http_client_cleanup(client);
        }
    }

According to the free heap sizes logged before and after malloc, the heap should be sufficient (more than 100K is free both before and after the allocation):

I (1879) Talk: free heap before malloc : 195024
I (1889) Talk: free heap after malloc : 192968

I (1919) main_task: Returned froGuru Meditation Error: Core 0 panic'ed (LoadProhibited). Exception was unhandled.

Core 0 register dump:
PC      : 0x40108c46  PS      : 0x00060130  A0      : 0x8010a7e8  A1      : 0x3ffc02e0
0x40108c46: ieee80211_send_setup at ??:?

A2      : 0x00000001  A3      : 0x3ffd1208  A4      : 0x3ffb7fbc  A5      : 0x00000010
A6      : 0x00000005  A7      : 0x00000005  A8      : 0x3ffb7fbc  A9      : 0x3ffc02d0
A10     : 0x00000048  A11     : 0x00000003  A12     : 0x00000018  A13     : 0x00000000
A14     : 0x00000001  A15     : 0x00000000  SAR     : 0x00000011  EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000001  LBEG    : 0x4000c46c  LEND    : 0x4000c477  LCOUNT  : 0x00000000
0x4000c46c: memset in ROM
0x4000c477: memset in ROM

Backtrace: 0x40108c43:0x3ffc02e0 0x4010a7e5:0x3ffc0310 0x4010a828:0x3ffc0350 0x40116c16:0x3ffc0370 0x4011719f:0x3ffc0390 0x40117c6f:0x3ffc03b0 0x401191d5:0x3ffc03d0 0x40164599:0x3ffc03f0 0x40092216:0x3ffc0410 0x40089b9d:0x3ffc0440
0x40108c43: ieee80211_send_setup at ??:?
0x4010a7e5: ieee80211_encap_null_data at ??:?
0x4010a828: ieee80211_pm_tx_null_process at ??:?
0x40116c16: pm_send_nullfunc at ??:?
0x4011719f: pm_go_to_sleep at ??:?
0x40117c6f: pm_active_timeout_process at ??:?
0x401191d5: dbg_lmac_ps_statis_reset at ??:?
0x40164599: pp_timer_do_process at ??:?
0x40092216: ppTask at ??:?
0x40089b9d: vPortTaskWrapper at C:/Users/hyuan/esp/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:134

ELF file SHA256: 4a040b880

CPU halted.

hyuan-kamuda commented 2 weeks ago

sdkconfig.txt

The sdkconfig is attached for your information.

Thanks

mahavirj commented 1 week ago

@hyuan-kamuda

It is difficult to comment on this issue without looking at the entire sample application code. In the interim, you may increase the stack size and then fine-tune it as per this documentation guide: https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-guides/performance/ram-usage.html#reducing-stack-sizes. This will help eliminate any issues w.r.t. insufficient stack size for the task (given that you have some data buffers placed in stack memory).

Hope this helps!
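To illustrate the "increase, then fine-tune" approach, here is a minimal host-side sketch of how a stack high-water mark can be measured: the stack region is painted with a known byte, and the bytes still untouched afterwards are counted. This is a simplified model under stated assumptions, not the FreeRTOS implementation; `stack_paint`, `stack_simulate_use`, and `stack_high_water_mark` are hypothetical names. On target, FreeRTOS exposes `uxTaskGetStackHighWaterMark()` for the same purpose (check the ESP-IDF documentation for the unit it reports).

```c
#include <stddef.h>
#include <string.h>

/* Fill pattern for unused stack. 0xA5 is the value FreeRTOS uses;
 * everything else here is a simplified, hypothetical model. */
#define STACK_FILL_BYTE 0xA5

/* Paint the whole (simulated) stack region with the fill pattern. */
static void stack_paint(unsigned char *stack, size_t size)
{
    memset(stack, STACK_FILL_BYTE, size);
}

/* Simulate a task that used `used` bytes. The stack grows downward,
 * so usage overwrites the region from the highest address down. */
static void stack_simulate_use(unsigned char *stack, size_t size, size_t used)
{
    if (used > size) {
        used = size; /* a real overflow would write past the region */
    }
    memset(stack + (size - used), 0x00, used);
}

/* Count bytes from the low end that still hold the fill pattern.
 * This is the minimum amount of stack that was never touched. */
static size_t stack_high_water_mark(const unsigned char *stack, size_t size)
{
    size_t untouched = 0;
    while (untouched < size && stack[untouched] == STACK_FILL_BYTE) {
        untouched++;
    }
    return untouched;
}
```

With an 8192-byte region and a 2049-byte local buffer, this model reports 6143 untouched bytes before anything else (for example a TLS handshake) has used the stack. A value close to zero means the task stack needs to grow.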

hyuan-kamuda commented 1 week ago

Thanks. Increasing the stack size fixes the issue. However, if I instead change the function to allocate the buffer from the heap (without increasing the task stack size) rather than defining it as a char array on the stack, the issue is still there. My guess is that the ESP library copies the data into a local variable (and so still uses a lot of stack) even though the application allocates the memory from the heap? Please refer to: https://www.esp32.com/viewtopic.php?f=13&t=39614
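Whether the response buffer can be overrun depends on how the event handler writes into `user_data`, and the `_http_event_handler` used here is not shown in this thread. As a minimal sketch only, assuming the handler appends each received chunk into the caller's buffer, a bounded append could look like the following. `response_acc_t`, `acc_init`, and `acc_append` are hypothetical names, not ESP-IDF APIs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical accumulator; not an ESP-IDF type. */
typedef struct {
    char  *buf;      /* caller-owned storage (stack, heap, or static) */
    size_t capacity; /* total bytes in buf, including room for '\0' */
    size_t len;      /* payload bytes stored so far */
} response_acc_t;

/* Bind the accumulator to caller-owned storage and start empty. */
static void acc_init(response_acc_t *acc, char *buf, size_t capacity)
{
    acc->buf = buf;
    acc->capacity = capacity;
    acc->len = 0;
    if (capacity > 0) {
        buf[0] = '\0';
    }
}

/* Append one chunk, truncating so that at most capacity - 1 payload
 * bytes are ever stored. Returns the number of bytes copied. */
static size_t acc_append(response_acc_t *acc, const char *data, size_t data_len)
{
    if (acc->capacity == 0) {
        return 0;
    }
    size_t room = acc->capacity - 1 - acc->len;
    size_t n = data_len < room ? data_len : room;
    memcpy(acc->buf + acc->len, data, n);
    acc->len += n;
    acc->buf[acc->len] = '\0'; /* keep the buffer NUL-terminated */
    return n;
}
```

The design point is that the writer and the allocator must agree on the same capacity, wherever the buffer lives. If the handler assumes a size that differs from what the caller actually allocated, moving the buffer from the stack to the heap only changes which memory gets overwritten.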

mahavirj commented 1 day ago

@hyuan-kamuda

Can you please supply minimal application code to recreate this failure?

hyuan-kamuda commented 1 day ago

I thought I already provided code in "steps to reproduce" above?

Thanks

mahavirj commented 1 day ago

It would be good if you could attach the entire application along with the sdkconfig that you are using.

hyuan-kamuda commented 1 day ago

Both are provided above.