Closed yizi15 closed 10 months ago
@yizi15
Please share your sample code along with the sdkconfig. You may modify https_request
example from IDF to recreate your failure case and share the patch here.
@mahavirj I found a way to reproduce https download failure, but the error log is different from the previous one. I (15794) TAG: Connected to AP, begin http example I (46240) esp-x509-crt-bundle: Certificate validated E (48828) esp-aes: Timed out waiting for completion of AES Interrupt E (48829) esp-aes: esp_aes_dma_wait_complete failed E (48829) esp-tls-mbedtls: read error :-0x0001: E (48840) transport_base: esp_tls_conn_read error, errno=Connection already in progress E (48852) https_download_begin: Failed to establish HTTP connection
@yizi15
AES interrupt timeout issue was recently fixed on release/v5.1
branch with https://github.com/espressif/esp-idf/commit/560518939871895b53db6a55c8d800aa42355894. Can you please try your sample application on latest release/v5.1
branch once?
@mahavirj I checkout release/V5.1, the log is below: I (14781) example_common: - IPv6 address: fe80:0000:0000:0000:4a27:e2ff:fee9:0fa8, type: ESP_IP6_ADDR_IS_LINK_LOCAL I (14792) TAG: Connected to AP, begin http example E (22426) MPI: Timed out waiting for completion of MPI Interrupt E (24444) ERROR A stack overflow in task tst has been detected.
Backtrace: 0x40379fce:0x3fcd5410 0x40384225:0x3fcd5430 0x403870d6:0x3fcd5450 0x403859ff:0x3fcd54d0 0x40387224:0x3fcd54f0 0x4038721a:0x00000100 |<-CORRUPTED 0x40379fce: panic_abort at D:/Espressif5/frameworks/esp_idf_v5.1_test/esp-idf/components/esp_system/panic.c:452
0x40384225: esp_system_abort at D:/Espressif5/frameworks/esp_idf_v5.1_test/esp-idf/components/esp_system/port/esp_system_chip.c:84
0x403870d6: vApplicationStackOverflowHook at D:/Espressif5/frameworks/esp_idf_v5.1_test/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:581
0x403859ff: vTaskSwitchContext at D:/Espressif5/frameworks/esp_idf_v5.1_test/esp-idf/components/freertos/FreeRTOS-Kernel/tasks.c:3728
0x40387224: _frxt_dispatch at D:/Espressif5/frameworks/esp_idf_v5.1_test/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/portasm.S:450
0x4038721a: _frxt_int_exit at D:/Espressif5/frameworks/esp_idf_v5.1_test/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/portasm.S:245
the stack of task 'tst' is 4096
when i clone the idf, no heap\tlsf found, then i copy the dir from v5.1.1 (download from https://dl.espressif.cn/dl/idf-installer/esp-idf-tools-setup-offline-5.1.1.exe)
http_example.zip is run at ESP32-S3-DevKitC-1 V1.0
the bug is only reproduce on the audio(i2s0 i2c0), uart1 and i2c1 enable at the same,
@yizi15
Some observations from the sample app you shared:
app_main
, you can simply return from it. Once returned, the main task shall be deleted.vTaskSuspend
, vTaskDelete
while (1)
{
vTaskDelay(1);
}
uxTaskGetStackHighWaterMark
. Typically for TLS related operations, we recommend stack sizes to be well above 5-6KB. Now, the stack is 8192. ' UBaseType_t WaterMark = uxTaskGetStackHighWaterMark(NULL); ESP_LOGI("", "WaterMark %u\n", WaterMark);'
I (24954) esp-x509-crt-bundle: Certificate validated E (28023) esp-aes: Timed out waiting for completion of AES Interrupt E (28024) esp-aes: esp_aes_dma_wait_complete failed E (28024) esp-tls-mbedtls: read error :-0x0001: E (28035) transport_base: esp_tls_conn_read error, errno=Connection already in progress E (28047) https_download_begin: Failed to establish HTTP connection I (28050) : WaterMark 4644
assert failed: https_download_perform https_download.c:185 (handle)
@mahavirj Can you reproduce this bug?
@yizi15 Please cherry-pick the fix from https://github.com/espressif/esp-idf/issues/12289#issuecomment-1736619821 and see if that fixes your problem.
@yizi15 Please cherry-pick the fix from #12289 (comment) and see if that fixes your problem.
It's my local code , it is same to https://github.com/espressif/esp-idf/commit/560518939871895b53db6a55c8d800aa42355894
Hello @yizi15 I tried https_request example with the URL from the sample application you shared. I am unable to reproduce the issue. Can you share a minimal application that can reproduce this issue?
Here are my logs:
I (5440) example: Start https_request example
I (5440) example: https_request using crt bundle
I (5450) main_task: Returned from app_main()
I (5460) wifi:<ba-add>idx:1 (ifx:0, 68:d7:9a:81:31:24), tid:0, ssn:0, winSize:64
I (6690) esp-x509-crt-bundle: Certificate validated
I (7570) example: Connection established...
I (7570) example: 137 bytes written
I (7580) example: Reading HTTP response...
I (221250) example: connection closed
I (221250) example: 10...
I (222250) example: 9...
I (223250) example: 8...
I (224250) example: 7...
I (225250) example: 6...
I (226250) example: 5...
@hmalpani http_example.zip
when i clone the idf, no heap\tlsf found, then i copy the dir from v5.1.1 (download from https://dl.espressif.cn/dl/idf-installer/esp-idf-tools-setup-offline-5.1.1.exe)
Hello @yizi15, It looks like that you have had another issue during this with installed esp-idf
.
I haven't been able to reproduce this, the directory should be there so it must have been some exceptional case. Do you have installation logs?
@jakub-kocka
When I remove heal/tlsf, run "git submodule update", heal/tlsf works fine. There may be network fluctuations during the first installation. However, AES also has timeout issues.
Hello @yizi15 Can you please try with the suggestions here and let me know if they helps resolving the issue.
@hmalpani All the suggestions were tried but no work,
Hello
I have the same issue but in different context. Im my case MPI timeout is hit when trying to connect to a MQTT broker with dual authentication Here is the log of the error
E (18:39:51.102) MPI: Timed out waiting for completion of MPI Interrupt E (18:39:53.122) MPI: Timed out waiting for completion of MPI Interrupt E (18:39:55.122) MPI: Timed out waiting for completion of MPI Interrupt E (18:39:55.125) esp-tls-mbedtls: mbedtls_ssl_handshake returned -0x2700 I (18:39:55.126) esp-tls-mbedtls: Failed to verify peer certificate! E (18:39:55.126) esp-tls: Failed to open new connection E (18:39:55.127) transport_base: Failed to open a new connection E (18:39:55.130) mqtt_client: Error transport connect I (18:39:55.131) MQTTClient: MQTT_EVENT_ERROR E (18:39:55.132) MQTTClient: Last error reported from esp-tls: 0x801a E (18:39:55.132) MQTTClient: Last error reported from tls stack: 0x2700 I (18:39:55.133) MQTTClient: Last errno string (Success) I (18:39:55.133) MQTTClient: MQTT_EVENT_DISCONNECTED
MQTTClient: tag comes from my mqtt client warpper
I come into this issue while porting the code from 4.4.3 to 5.1.1. On 4.4 everything works fine and no MPI timeot. The strange thing is that i ported 2 applications. App 1 is the main application which connects to the mqtt broker and shows this issue. App 2 is a related OTA application, which the same, connects to mqtt broker, but also to a https server to fetch the firmware, and this one works fine! Both applictions are using the same certificates, and basically the same mqtt client wrapper. Sometimes, right after flashing. app 1 works fine, but after first reset goes into the issue.
The workaround i found is to undef CONFIG_MBEDTLS_MPI_USE_INTERRUPT. After undefining app 1 works fine. Can you comment what is the impact of this workaround?
@ves011 My code work fine at espidf_v4.4.2 and esp_adf_v2.4. When upgrade to esp_idf_V5.1.1 and esp_adf V2.6(by esp-idf-tools-setup-offline-5.1.1.exe), the bug appear firstly. In my project, when init the uart1, i2c1 by api of idf and init the audio module (i2s0 and i2c0) at same time, AES/MPI timeout will occur. When not use api of adf, and init i2s0 and i2c0 by api of idf, it work fine. Today, i checkout the release/V5.1, and my bug seems to fixed, even though the https download speeds only is 30k/s.
I have met similar issue, the cause is that interrupt is not enough for AES. If you are use 2 core, you can move some interrupt to core 1, you can simply move some peripheral initialized code to another core, then the interrupt alloced by the peripheral in initialized API will more to core 1. Sample code like below:
#define RUN_ASYNC(thread, core, body) \
void async_task##thread(void* arg) { \
body; \
vTaskDelete(NULL); \
} \
xTaskCreatePinnedToCore(async_task##thread, #thread, 4*1024, NULL, 3, NULL, core);
static int init_peripheral(void)
{
_i2s_init();
_i2c_init();
init_lcd();
RUN_ASYNC(i_core1, 1, {
mount_sdcard();
init_console();
});
return 0;
}
These are just workarounds :( For me it looks like a degradation which needs to be fixed. @yizi15 had the same problem while migrating the code to v5 (on my understanding) The first step is for espressif guys to accept this is an issue.
@yizi15 you said "i checkout the release/V5.1, and my bug seems to fixed". Do you know what is the fix?
Again can someone comment what is the impact of disabling CONFIG_MBEDTLS_MPI_USE_INTERRUPT?
@ves011 '"i checkout the release/V5.1, and my bug seems to fixed' , it's not always work. I move the audio to core1 and work ok.
I have met similar issue, the cause is that interrupt is not enough for AES. If you are use 2 core, you can move some interrupt to core 1, you can simply move some peripheral initialized code to another core, then the interrupt alloced by the peripheral in initialized API will more to core 1. Sample code like below:
#define RUN_ASYNC(thread, core, body) \ void async_task##thread(void* arg) { \ body; \ vTaskDelete(NULL); \ } \ xTaskCreatePinnedToCore(async_task##thread, #thread, 4*1024, NULL, 3, NULL, core); static int init_peripheral(void) { _i2s_init(); _i2c_init(); init_lcd(); RUN_ASYNC(i_core1, 1, { mount_sdcard(); init_console(); }); return 0; }
I am also hitting this issue in v5.1.1, when calling mbedtls_rsa_pkcs1_sign
, esp32-s3
edit: same issue on v5.1.2,
edit: tried changing the timeout to 60s, same issue
edit: CONFIG_MBEDTLS_MPI_USE_INTERRUPT=n
solves it!
Update:
CONFIG_ESP_SYSTEM_MEMPROT_FEATURE
) feature on S3. This feature takes an additional interrupt (ETS_MEMACCESS_ERR_INUM
). And hence for the projects based on ESP-IDF 4.x, with no interrupts available (for a given core), migration could result in some issues.Again can someone comment what is the impact of disabling CONFIG_MBEDTLS_MPI_USE_INTERRUPT?
There is no harm in disabling this option, it will keep the CPU in busy-wait state for the MPI operation which otherwise could have allowed other code to run on it. @ves011
I am also hitting this issue in v5.1.1, when calling mbedtls_rsa_pkcs1_sign
Could you please check if there was any error corresponding to MPI interrupt allocation failure? @chipweinberger
@mahavirj , there are no "interrupt allocation failed" logs in my output
@chipweinberger My bad. Please put some debugs print here and log the return value.
Note: Per my earlier comment, this code is already fixed with https://github.com/espressif/esp-idf/commit/d3be7bda056dfcb7edbac4cf8757ec1ac7d0a1ea and backports are WIP.
Again can someone comment what is the impact of disabling CONFIG_MBEDTLS_MPI_USE_INTERRUPT?
There is no harm in disabling this option, it will keep the CPU in busy-wait state for the MPI operation which otherwise could have allowed other code to run on it. @ves011
thanks @mahavirj Will keep it disable until the fix will be released
Hello @chipweinberger, @ves011,
Recently we merged some fixes in the AES and MPI modules that were backported to release/v5.1
as well. I would recommend you to check if the issue still persists with the branch release/v5.1
. Fixes that were merged were: AES-DMA related fixes (https://github.com/espressif/esp-idf/commit/2fbe919e4cafba8ce2528e8eaa21bfe7fbbe7d8f and https://github.com/espressif/esp-idf/commit/560518939871895b53db6a55c8d800aa42355894), interrupt related fixes (https://github.com/espressif/esp-idf/commit/4c30f2a4a0229979b4700c6acb34d49705fb6670, https://github.com/espressif/esp-idf/commit/e8a5d9ef989f958be9ea6d07280c13f7601a9f47).
Closing this issue as of now, feel free to re-open if the issue still persists.
Answers checklist.
IDF version.
v5.1.1
Operating System used.
Windows
How did you build your project?
Command line with idf.py
If you are using Windows, please specify command line type.
PowerShell
Development Kit.
ESP32-S3-WROOM-1 MCN8R8
Power Supply used.
Battery
What is the expected behavior?
i need get a https handle to download file
What is the actual behavior?
Some times, the function is return null, Some times, function return handle and could download the file from https At esp-idf v4.4, the function blow is work ok.
Steps to reproduce.
`HttpsDownloadHandle https_download_begin(const esp_http_client_config_t http_config, struct https_download_arg* arg) {
if(http_config == nullptr || arg == nullptr || arg->https_buf == nullptr || arg->https_buf_size < arg->header_size || arg->header_size < 0) { return nullptr; } HttpsDownloadHandle self = ql_malloc(sizeof(HttpsDownloadHandle)); assert_ret(self, nullptr); memset(self, 0, sizeof(HttpsDownloadHandle)); self->http_client = esp_http_client_init(http_config); if (self->http_client == NULL) { log_error("Failed to initialise HTTP connection"); goto failure; } esp_err_t err = _http_connect(self->http_client); if (err != ESP_OK) { ESP_LOGE(TAG, "Failed to establish HTTP connection"); goto http_cleanup; } self->state = e_https_download_init; self->arg = arg; return self; http_cleanup: _http_cleanup(self->http_client); failure: if(self) { ql_free(self); } return nullptr; };`
Config is below:
' esp_http_client_config_t config = { .url = GET_VER_INFO(this_type).image_path,//https://sleeppydata.oss-cn-beijing.aliyuncs.com/xxx .timeout_ms = 60000, .keep_alive_enable = true, .skip_cert_common_name_check = true, .crt_bundle_attach = esp_crt_bundle_attach, };'
Debug Logs.
More Information.
No response