Closed giorgi877 closed 1 year ago
Hello, @giorgi877!
Thank you for sending the issue report. I was investigating this issue but unfortunately could not reproduce it with the code snippet you provided on the ESP-IDF v4.4
tag.
Could you please attach the following -
sdkconfig
file for your exampleCONFIG_LOG_DEFAULT_LEVEL_DEBUG=y
)Hello, @giorgi877!
Thank you for sending the issue report. I was investigating this issue but unfortunately could not reproduce it with the code snippet you provided on the ESP-IDF
v4.4
tag.Could you please attach the following -
- The
sdkconfig
file for your example- Debug logs (with
CONFIG_LOG_DEFAULT_LEVEL_DEBUG=y
)
Thank u for your reply!
but i can show u where is bug. It seems bug is in hardware RSA Accelerator. Works fine if i use MBED tls MPI(uncheck "enable hardware MPI(bignum) accelartion", fails in case of hardware RSA exponentiation and multiplication calculations. i have tried to set "fallback to software implemenation for larger MPI values" (in case of automatic swithing to software mbedTLS calculation, if number exceeds SOC_RSA_MAX_BIT_LEN value) but the result is the same, works fine only when i enable MBED tls MPI software calculations(uncheck "enable hardware MPI(bignum) accelartion" in sdkonfig->mbedTLS.
Below is the code where bug might be in(file: esp_bignum.c)
Hello, @giorgi877!
CONFIG_MBEDTLS_HARDWARE_MPI
enabled or disabled? (Sorry for the confusion)git describe --tags
in ${IDF_PATH}
.Thanks, Laukik.
Hello @laukik-hase ,
Thanks
Same here
Hello, @giorgi877 and @jammiejamjam!
Sorry for the late reply! I am still not able to reproduce the issue.
Could you try reproducing the issue with the esp_http_client
example and your configuration?
Does increasing the task watchdog timeout period at (Top) → Component config → ESP System Settings → ESP_TASK_WDT_TIMEOUT_S
help in your use case?
Also, some debug logs (with CONFIG_LOG_DEFAULT_LEVEL_DEBUG=y
) would be helpful.
I'm having a potentially similar issue. "Potentially" -- because I don't use https, just http, however, the rest is kinda similar.
I don't have a minimalistic enough example for copypaste, but in general the application is a busy loop polling mqtt (https://github.com/knolleary/pubsubclient) with occasional HTTP POST requests reporting some status. The http code is as follows:
logf("pre-http {}", url);
auto t_start = std::chrono::steady_clock::now();
esp_http_client_config_t conf{
.url = url.c_str(),
.method = HTTP_METHOD_POST,
.timeout_ms = 2000,
.buffer_size = 1024,
.is_async = true,
.keep_alive_enable = false,
};
auto cli = esp_http_client_init(&conf);
esp_http_client_set_post_field(cli, payload.data(), payload.size());
while(1){
auto err = esp_http_client_perform(cli);
if (err == ESP_ERR_HTTP_EAGAIN){
delay(10); // tried force-yielding, doesn't help at all
continue;
} else if (err == ESP_OK){
// logf("HTTP status = {}, content_length = {}",
// esp_http_client_get_status_code(cli),
// esp_http_client_get_content_length(cli));
} else {
logf("HTTP error {}", esp_err_to_name(err));
}
break;
}
esp_http_client_cleanup(cli);
logf("post-http {}", url);
auto dt = std::chrono::duration_cast<milliseconds>(std::chrono::steady_clock::now() - t_start).count();
if (dt > 200){
logf("HTTP request to {} took {} ms", url, dt);
}
I also tried without is_async
, same result.
I tried to narrow it down as much as the setup allows, here is what I've got so far:
Infortunately I'm bound to platformio and can't easily experiment with esp-idf config.
@laukik-hase
I just check the code in master tree, the esp_http_client_perform() returns ESP_ERR_HTTP_EAGAIN when esp_http_client_read / esp_http_client_fetch_headers timeout. What is the correct way to handle "esp_http_client_perform() returns ESP_ERR_HTTP_EAGAIN when is_async is false"? Just calling esp_http_client_perform() again? or any other actions required ?
BTW, the error message "Connection timed out before data was ready!" seems misleading because it's not a connection timeout, but a read/receive timeout.
Hello, @AxelLin!
For sync connections, if esp_http_client_performs()
returns ESP_ERR_HTTP_EAGAIN
, you will have to retry.
Hello, @AxelLin!
For sync connections, if
esp_http_client_performs()
returnsESP_ERR_HTTP_EAGAIN
, you will have to retry.
@laukik-hase The esp_http_client_fetch_headers() now has different behavior in different esp-idf branches. e.g. In master tree, it returns -ESP_ERR_HTTP_EAGAIN if esp_transport_read() returns 0. In v4.3, it returns ESP_FAIL if esp_transport_read() returns 0.
i.e. In previous esp-idf versions, esp_http_client_perform() returns ESP_ERR_HTTP_EAGAIN only when is_async is true.
So does that mean the older esp-idf versions need fix to return ESP_ERR_HTTP_EAGAIN if read timeout?
Hello, @giorgi877 and @jammiejamjam!
Sorry for the late reply! I am still not able to reproduce the issue.
Could you try reproducing the issue with the
esp_http_client
example and your configuration?
Looks like https://github.com/espressif/esp-idf/issues/9371 is the same issue.
@laukik-hase
I hit the issue: esp32-wroom-32e with v5.1-dev-286-gbcb34ca7ae
E (359745) task_wdt: Task watchdog got triggered. The following tasks/users did not reset the watchdog in time: E (359745) task_wdt: - IDLE (CPU 1) E (359745) task_wdt: Tasks currently running: E (359745) task_wdt: CPU 0: IDLE E (359745) task_wdt: CPU 1: http_cli E (359745) task_wdt: Aborting. E (359745) task_wdt: Print CPU 1 backtrace
Backtrace: 0x4014415f:0x3ffdea50 0x401441f1:0x3ffdea70 0x401445ee:0x3ffdea90 0x4014e9f5:0x3ffdeab0 0x4014ebf5:0x3ffdeae0 0x4014c612:0x3ffdeb50 0x4014d2f8:0x3ffdeb80 0x4014d358:0x3ffdebb0 0x4014ae3f:0x3ffdebd0 0x4014abda:0x3ffdec00 0x401418d4:0x3ffdec20 0x4013f8d1:0x3ffdecd0 0x4013f937:0x3ffdecf0 0x40134d35:0x3ffded10 0x401349c9:0x3ffded30 0x40134aa7:0x3ffded70 0x4013865a:0x3ffdeda0 0x4019b8fb:0x3ffdedd0 0x40139c20:0x3ffdedf0 0x40139eb1:0x3ffdee10 0x400db0a3:0x3ffdee30 0x400db6ba:0x3ffdefa0 0x4014415f: mpi_sub_hlp at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/bignum.c:1201 (inlined by) mbedtls_mpi_sub_abs at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/bignum.c:1239
0x401441f1: mbedtls_mpi_sub_mpi at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/bignum.c:1312
0x401445ee: mbedtls_mpi_mod_mpi at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/bignum.c:1815 (inlined by) mbedtls_mpi_mod_mpi at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/bignum.c:1799
0x4014e9f5: calculate_rinv at /home/axel/esp/esp-idf/components/mbedtls/port/esp_bignum.c:200 (discriminator 2)
0x4014ebf5: esp_mpi_exp_mod at /home/axel/esp/esp-idf/components/mbedtls/port/esp_bignum.c:388 (inlined by) mbedtls_mpi_exp_mod at /home/axel/esp/esp-idf/components/mbedtls/port/esp_bignum.c:460
0x4014c612: mbedtls_rsa_public at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/rsa.c:767
0x4014d2f8: mbedtls_rsa_rsassa_pkcs1_v15_verify at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/rsa.c:2158
0x4014d358: mbedtls_rsa_pkcs1_verify at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/rsa.c:2210
0x4014ae3f: rsa_verify_wrap at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/pk_wrap.c:93
0x4014abda: mbedtls_pk_verify_restartable at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/pk.c:316
0x401418d4: ssl_parse_server_key_exchange at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/ssl_cli.c:3215 (inlined by) mbedtls_ssl_handshake_client_step at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/ssl_cli.c:4233
0x4013f8d1: mbedtls_ssl_handshake_step at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/ssl_tls.c:5255 (inlined by) mbedtls_ssl_handshake_step at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/ssl_tls.c:5225
0x4013f937: mbedtls_ssl_handshake at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/ssl_tls.c:5317 (inlined by) mbedtls_ssl_handshake at /home/axel/esp/esp-idf/components/mbedtls/mbedtls/library/ssl_tls.c:5293
0x40134d35: esp_mbedtls_handshake at /home/axel/esp/esp-idf/components/esp-tls/esp_tls_mbedtls.c:195
0x401349c9: esp_tls_handshake at /home/axel/esp/esp-idf/components/esp-tls/esp_tls.c:83 (discriminator 15) (inlined by) esp_tls_low_level_conn at /home/axel/esp/esp-idf/components/esp-tls/esp_tls.c:437 (discriminator 15)
0x40134aa7: esp_tls_conn_new_sync at /home/axel/esp/esp-idf/components/esp-tls/esp_tls.c:464
0x4013865a: ssl_connect at /home/axel/esp/esp-idf/components/tcp_transport/transport_ssl.c:109
0x4019b8fb: esp_transport_connect at /home/axel/esp/esp-idf/components/tcp_transport/transport.c:123
0x40139c20: esp_http_client_connect at /home/axel/esp/esp-idf/components/esp_http_client/esp_http_client.c:1274
0x40139eb1: esp_http_client_open at /home/axel/esp/esp-idf/components/esp_http_client/esp_http_client.c:1429
I'm also getting this.
The earliest point where I've identified the misbehaviour is at this line here: https://github.com/espressif/esp-idf/blob/eea8629fa18a75755903474c8356268812f26a5e/components/mbedtls/port/esp_bignum.c#L642
Add this just below to confirm:
if(mpi_words(Z) != z_words) printf("Incorrect z_words; will now lock up\n");
Causing it to get stuck in this loop here: https://github.com/espressif/mbedtls/blob/7b428b1bf260ce1fec4e3ffb6494070d439a3a67/library/bignum.c#L1765-L1766
From what I can tell, typically z_words
== 66 in mpi_mult_mpi_failover_mod_mult
, so hw_words
== 80. The hardware does its thing and the 80 words are read out via esp_mpi_read_result_hw_op
. If all went well, the last 14 words of those 80 should be 0s. For some reason, very rarely, all 80 will be filled with junk(?) instead.
I'm having the same issue. Newest esp-idf (esp-idf 5.1). We have only sporadic wifi in our room so it could associated to that.
We are still investigating this issue.
If you would like to have quick workaround then you can disable CONFIG_MBEDTLS_HARDWARE_MPI
and that should fix the problem.
Highlighting few points based on our investigations so far:
8.4.0
compiler then we could not reproduce it. It is highly likely that toolchain (compiler) upgrade in IDF 5.x has some role to play here.We are evaluating this further and keep you posted on the progress.
@mahavirj Indeed disabling hardware MPI seems to fix things. We didn't have any issues so far until now. I'm also using IDF5.x and I can't remember this problem happened before.. but I could be wrong.
@giorgi877 @AxelLin @boarchuz @MartyMcFlyInTheSky
Update:
So far our analysis shows culprit as the code responsible to zero initialize MPI peripheral memory block here:
This code is not behaving as per the expectation. Erroneously some memory location here still contains the garbage value. This mainly happens when compiler converts this loop into memset
call (from ROM library) with optimization flags enabled. This started happening with latest toolchain from ESP-IDF v5.x release.
We are evaluating 2 fixes here:
Use DPORT_WRITE_REG
(volatile writes) wrappers to write to the MPI peripheral. This would have slight performance penalty, we are still analyzing the impact. 0001-esp32-mpi-use-DPORT-wrappers-to-write-register-data.zip
Disable compiler -ftree-loop-distribute-patterns
flag for the specific routine. It appears like this flag has been default enabled with GCC 10.x, causing library calls to be used (e.g., call to memset
) 0001-esp32-mpi-fix-issue-with-data-corruption-for-release.zip
Note: Please apply either of the fix using git am -p0 <file_name>
Please try out these fixes and let us know if you still run into any issues. This will help to confirm our analysis.
@giorgi877 Do you still hit the issue on v4.4? (The fix is not for v4.4.)
@AxelLin, I'v fixed above issue my own, will post an udate later. Thanks.
I am find any question related to this problem: https://stackoverflow.com/questions/46996893/gcc-replaces-loops-with-memcpy-and-memset https://stackoverflow.com/questions/6410595/getting-gcc-to-compile-without-inserting-call-to-memcpy/33818680#33818680 Does the memset or memcpy do not work as we want is memcpy() can not find in this find running? (I am a runoob in compiler)
@mahavirj Just wanted to mention, in light of https://github.com/espressif/esp-idf/issues/10403, that the fix seems to be effective for my case. Thankyou.
Environment
xtensa-esp32-elf-gcc --version
to find it): // 1.22.0-80-g6c4433aProblem Description
Hello, i have started HTTPS client(async mode) according to examples. After a few minutes of working console prints that task WDT triggered and back trace info. when i disable hardware MPI it seems to works.
Decoding stack results 0x400848f5: esp_crosscore_isr at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/esp_system/crosscore_int.c line 92 0x4010028d: mbedtls_mpi_sub_abs at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/mbedtls/library/bignum.c line 1388 0x40100385: mbedtls_mpi_sub_mpi at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/mbedtls/library/bignum.c line 1507 0x40100939: mbedtls_mpi_mod_mpi at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/mbedtls/library/bignum.c line 1963 0x4010cb91: calculate_rinv at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/port/esp_bignum.c line 129 0x4010cde8: mbedtls_mpi_exp_mod at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/port/esp_bignum.c line 99 0x4010a98b: mbedtls_rsa_public at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/mbedtls/library/rsa.c line 789 0x4010bd91: mbedtls_rsa_rsassa_pkcs1_v15_verify at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/mbedtls/library/rsa.c line 2426 0x4010be3d: mbedtls_rsa_pkcs1_verify at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/mbedtls/library/rsa.c line 2493 0x40108f46: rsa_verify_wrap at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/mbedtls/library/pk_wrap.c line 113 0x40108c89: mbedtls_pk_verify_restartable at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/mbedtls/library/pk.c line 309 0x400ff221: mbedtls_ssl_handshake_client_step at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/mbedtls/library/ssl_cli.c line 2904 0x400fd42d: mbedtls_ssl_handshake_step at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/mbedtls/library/ssl_tls.c line 8358 0x400fd44c: mbedtls_ssl_handshake_step at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/mbedtls/library/ssl_tls.c line 8363 0x400fd472: mbedtls_ssl_handshake at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/mbedtls/mbedtls/library/ssl_tls.c line 8377 0x400f8f65: esp_mbedtls_verify_certificate at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/esp-tls/esp_tls_mbedtls.c line 273 0x400f8955: esp_tls_low_level_conn at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/esp-tls/esp_tls.c line 425 0x400f8d39: esp_tls_conn_new_async at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/esp-tls/esp_tls.c line 511 0x401404fb: esp_tls_connect_async at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/tcp_transport/transport_ssl.c line 85 0x40140531: ssl_connect_async at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/tcp_transport/transport_ssl.c line 96 0x40157ae5: esp_transport_connect_async at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/tcp_transport/transport.c line 181 0x400f9e39: esp_http_client_connect at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/esp_http_client/esp_http_client.c line 1210 0x400fab08: esp_http_client_perform at C:/Users/GIORGI877/eclipse/esp-idf-v4.4/components/esp_http_client/esp_http_client.c line 1051 0x400d77a3: httpClientTask at ../main/main.c line 282
Expected Behavior
Works, if i disable hardware MPI form MBEDTLS config
Actual Behavior
Task WDT trigger and core hung
Steps to reproduce
Disable hardware MPI
Code to reproduce this issue
Debug Logs
Other items if possible