Closed SteveOfTheStow closed 7 years ago
You're certainly checking the return value of malloc() to make sure it's not NULL, right? The chance for it to fail on a Mac as compared to an ESP32 is somewhat different. The structure can be relatively large depending on many #ifdefs (300 bytes or so?).
And since you didn't indicate that it ever worked with your config and without your modification on the ESP32, we cannot be sure that you're not simply facing a stack overflow, either. (Even if it works without your modification, a potential stack overflow can go unnoticed in one case and mess up everything in the other.)
Yep, definitely getting a valid pointer back. And it does work if I put the mbedtls_ssl_context struct on the stack.
Here's a gist: https://gist.github.com/SteveOfTheStow/d8109991384f032edcb2de406a78e7a4
Again, the stack could be corrupted, and it could have a different effect in both cases due to the different stack layout, so is your build configured for stack overflow checking, or not?
I've got "Check by stack pointer value".
Note that the same occurs if I turn off Checking.
Did you include the .pem file? Also can you do the backtrace with xtensa-esp32-elf-addr2line? There are mbedtls examples. Before you made ssl a pointer, the program worked? OpenSSL component in esp-idf uses malloc'ed structures of mbedtls.
I was using the built-in mbedtls test cert, but I've just tried a separate pem using COMPONENT_EMBED_FILES and it's the same backtrace.
Here's the output of IDF Monitor:
Guru Meditation Error of type LoadProhibited occurred on core 0. Exception was unhandled.
Register dump:
PC : 0x4008327b PS : 0x00060333 A0 : 0x800833d8 A1 : 0x3ffd68b0
0x4008327b: prvInsertBlockIntoFreeList at ~/bin/espressif/esp32/esp-idf/components/freertos/./heap_regions.c:410
A2 : 0x3ffc868c A3 : 0x00000000 A4 : 0xff000000 A5 : 0x80ffffff
A6 : 0x00000005 A7 : 0x00000000 A8 : 0x00000000 A9 : 0x00000000
A10 : 0x3ffc0b68 A11 : 0x00000000 A12 : 0x3ffca398 A13 : 0x00000000
A14 : 0x00000000 A15 : 0x3ffd69a0 SAR : 0x00000004 EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000000 LBEG : 0x400014fd LEND : 0x4000150d LCOUNT : 0xfffffffe
Backtrace: 0x4008327b:0x3ffd68b0 0x400833d8:0x3ffd68d0 0x4008624b:0x3ffd68f0 0x4008628c:0x3ffd6910 0x40081a94:0x3ffd6930 0x4000bef8:0x3ffd6950 0x401234ec:0x3ffd6970 0x40115c6a:0x3ffd69c0 0x400def86:0x3ffd6a10
0x4008327b: prvInsertBlockIntoFreeList at ~/bin/espressif/esp32/esp-idf/components/freertos/./heap_regions.c:410
0x400833d8: pvPortMallocTagged at ~/bin/espressif/esp32/esp-idf/components/freertos/./heap_regions.c:410
0x4008624b: pvPortMallocCaps at ~/bin/espressif/esp32/esp-idf/components/esp32/./heap_alloc_caps.c:414
0x4008628c: pvPortMalloc at ~/bin/espressif/esp32/esp-idf/components/esp32/./heap_alloc_caps.c:414
0x40081a94: _calloc_r at ~/bin/espressif/esp32/esp-idf/components/newlib/./syscalls.c:56
0x401234ec: mbedtls_pem_read_buffer at ~/bin/espressif/esp32/esp-idf/components/mbedtls/library/pem.c:332 (discriminator 1)
0x40115c6a: mbedtls_x509_crt_parse at ~/bin/espressif/esp32/esp-idf/components/mbedtls/library/x509_crt.c:1253
0x400def86: ssl_task at ~/Dev/src/Practice/Platform_Specific/ESP32/mbedtls_test/main/./main.c:112
I'm sure there's some way to get this to work using mbedtls_ssl_context as a pointer. It did work before I made ssl a pointer, yep.
Searching for prvInsertBlockIntoFreeList, I found this, which provides a good theory about what the malloc might be up to, though I don't understand the 'why' or how to work around it yet.
@SteveOfTheStow again, Before you made ssl a pointer, the program worked?
As that post you reference suggests, it might be a heap corruption issue. You should check the rest of your code for malloc'ed structs you manipulate that might go out of bounds.
Yes, it works before I made ssl a pointer.
I compared the sdkconfig for the referenced mbedtls client sample in ESP-IDF with that of the project I created using mbedtls's own C file sample. As a result, I've been able to get mbedtls's C file running with mbedtls_ssl_context on the heap by changing the following settings from the IDF defaults:
CONFIG_FREERTOS_THREAD_LOCAL_STORAGE_POINTERS=1 (was 3)
// Enable FreeRTOS to use multiple cores
CONFIG_INT_WDT_CHECK_CPU1=y
CONFIG_TASK_WDT_CHECK_IDLE_TASK_CPU1=y
# CONFIG_FREERTOS_UNICORE is not set
I don't know why this is.
Unfortunately the program I'm writing that uses mbedtls still suffers from the issue of failing on parsing certs, as described above (a couple of lines of backtrace are copied below) after these modifications are made.
0x401234ec: mbedtls_pem_read_buffer at ~/bin/espressif/esp32/esp-idf/components/mbedtls/library/pem.c:332 (discriminator 1)
0x40115c6a: mbedtls_x509_crt_parse at ~/bin/espressif/esp32/esp-idf/components/mbedtls/library/x509_crt.c:1253
Could you share the source code? I'll try to reproduce.
99% of the code is just using ESP-IDF right now so I'd just need to change a bunch of entity names and I could probably share it sometime this week.
I'm still on the memory corruption track.
When exactly does your server start as compared to WiFi?
I'm not using any malloc()ed stuff in my own code, so that could be a reason why I do not encounter any runtime problems except the failing connect. Actually, since all my memory gets allocated before WiFi starts, there's no chance for the WiFi lib to access any memory which was freed and now belongs to me (rather, it will mess up its own former memory, which I never used).
In your case, depending on the startup sequence, it's possible that the root cause is actually WiFi going wild, accessing memory via a pointer to memory it formerly owned which is now yours.
If you allocate that memory before everything else and pass it to your server's main(), does that change anything?
Here's a sample version of the app I'm building: https://github.com/SteveOfTheStow/esp_mbedtls_test
Guru Meditation Error of type LoadProhibited occurred on core 0. Exception was unhandled.
Register dump:
PC : 0x400832e3 PS : 0x00060033 A0 : 0x80083440 A1 : 0x3ffda3f0
0x400832e3: prvInsertBlockIntoFreeList at /Users/steveofthestow/bin/espressif/esp32/esp-idf/components/freertos/./heap_regions.c:410
A2 : 0x3ffdaa6c A3 : 0x00000000 A4 : 0xff000000 A5 : 0x80ffffff
A6 : 0x00000022 A7 : 0x00000000 A8 : 0x00000000 A9 : 0x00000000
A10 : 0x3ffc0b08 A11 : 0x00000001 A12 : 0x3ffca2b8 A13 : 0x00000000
A14 : 0x00000000 A15 : 0x00000000 SAR : 0x00000004 EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000000 LBEG : 0x400014fd LEND : 0x4000150d LCOUNT : 0xfffffffe
Backtrace: 0x400832e3:0x3ffda3f0 0x4008343d:0x3ffda410 0x40086af0:0x3ffda430 0x40086b31:0x3ffda450 0x40081a99:0x3ffda470 0x4000beaf:0x3ffda490 0x400df883:0x3ffda4b0
0x400832e3: prvInsertBlockIntoFreeList at /Users/steveofthestow/bin/espressif/esp32/esp-idf/components/freertos/./heap_regions.c:410
0x4008343d: pvPortMallocTagged at /Users/steveofthestow/bin/espressif/esp32/esp-idf/components/freertos/./heap_regions.c:410
0x40086af0: pvPortMallocCaps at /Users/steveofthestow/bin/espressif/esp32/esp-idf/components/esp32/./heap_alloc_caps.c:372
0x40086b31: pvPortMalloc at /Users/steveofthestow/bin/espressif/esp32/esp-idf/components/esp32/./heap_alloc_caps.c:306
0x40081a99: _malloc_r at /Users/steveofthestow/bin/espressif/esp32/esp-idf/components/newlib/./syscalls.c:27
0x400df883: uart_event_task at /Users/steveofthestow/Dev/src/External/steve-of-the-stow-esp-mbedtls-sample/main/./alpha_app_uart_connection.c:39
@HubbyGitter Thanks for the tip. Not sure I could allocate everything before starting WiFi, especially bits that depend on it.
@SteveOfTheStow To clarify, my suggestion was meant for helping to find the root cause, not as a solution for a working system.
Fair. It was actually fine to swap the order of starting the SSL/UART connectivity and the WiFi, but alas it still broke in the same place.
To work around this bug, remove the #include "mbedtls/config.h" line from your project's source files. You will need to move #include "mbedtls/ssl.h" to the top of the list of mbedTLS headers to avoid errors in some of the other headers (like mbedtls/certs.h).
Why is this a bug? IDF ships its own mbedTLS config header, esp_config.h. This header is correctly included recursively from inside headers like mbedtls/ssl.h, but we still ship the default config header in "mbedtls/config.h". This means any source file which includes mbedtls/config.h directly gets the default configuration, but mbedTLS libraries are compiled with the esp_config.h configuration.
The size of various mbedTLS structures depends on the config items enabled. sizeof(mbedtls_ssl_context) is smaller under mbedtls/config.h than under esp_config.h. So the allocated buffer was being overrun when mbedtls_ssl_init(ssl) was called.
This manifests as a crash due to heap corruption. The same memory corruption still happens when mbedtls_ssl_context is allocated on the stack; it just happens not to corrupt any stack memory in a way that causes a crash.
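To illustrate the size mismatch described above, here is a minimal sketch. The two struct types are hypothetical stand-ins (they are not the real mbedtls_ssl_context layout): one models the struct as seen by application code built with the default config, the other models the same struct as seen by a library built with an extra feature enabled. The code only computes the overrun; it does not perform it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: the context as the application sees it
 * (default config, optional feature disabled). */
typedef struct {
    int state;
    unsigned char session[64];
} ctx_app_view;

/* Hypothetical stand-in: the same context as the library sees it
 * (its config enables a feature that adds a field). */
typedef struct {
    int state;
    unsigned char session[64];
    unsigned char extra_feature[32];
} ctx_lib_view;

/* Number of bytes an init routine compiled against the library's view
 * would write past a buffer sized with the application's view. */
size_t overrun_bytes(void)
{
    size_t allocated = sizeof(ctx_app_view); /* what malloc(sizeof(...)) reserves */
    size_t written = sizeof(ctx_lib_view);   /* what the library initialises */

    assert(written >= allocated); /* holds for these two demo types */
    return written - allocated;
}
```

Any non-zero result here means the init call would scribble over whatever follows the allocation, which on the heap is typically allocator bookkeeping. That fits a crash showing up later inside an unrelated malloc/calloc call.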
I'm going to leave this Issue open for now because we should either remove the default config header, or provide a way to detect if it's accidentally included directly into a source file.
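One possible way to detect this kind of mismatch at runtime, sketched under assumptions: the library exports a function reporting the struct size it was compiled with, and the application compares that against its own sizeof. Everything named demo_* below is hypothetical; mbedTLS and ESP-IDF are not claimed to provide such a function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context type, standing in for a config-dependent struct. */
typedef struct {
    int state;
    unsigned char buf[64];
} demo_ctx;

/* Hypothetical library-side function: compiled with the library's config,
 * it reports the context size the library actually expects. */
size_t demo_lib_ctx_size(void)
{
    return sizeof(demo_ctx);
}

/* Application-side check: returns 1 if the caller's view of the struct
 * has the size the library was built with, 0 otherwise. */
int demo_ctx_layout_matches(size_t app_size)
{
    return app_size == demo_lib_ctx_size();
}

/* Fail fast at startup instead of corrupting memory later. */
void demo_require_matching_layout(void)
{
    assert(demo_ctx_layout_matches(sizeof(demo_ctx)));
}
```

Calling something like demo_require_matching_layout() once at startup would turn a silent overrun into an immediate, easy-to-read failure.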
PS While looking for this I noticed another memory corruption bug here:
https://github.com/SteveOfTheStow/esp_mbedtls_test/blob/master/main/alpha_app_ssl_server.c#L340
taskName needs to be one byte longer to account for the terminating NUL byte. As written, strcat() overflows the buffer. This pattern seems to be repeated in at least one other place.
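A minimal sketch of a bounded alternative to the strcat() pattern, assuming the name is built from two strings. The helper build_task_name and its parameters are hypothetical and not taken from the linked project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes prefix followed by suffix into out.
 * Returns 0 on success, -1 if the result (including the terminating
 * NUL byte) does not fit in out_size bytes. */
int build_task_name(char *out, size_t out_size,
                    const char *prefix, const char *suffix)
{
    assert(out != NULL && out_size > 0);

    /* snprintf never writes more than out_size bytes and always
     * NUL-terminates when out_size > 0. */
    int n = snprintf(out, out_size, "%s%s", prefix, suffix);

    /* n is the length without the NUL; it must be strictly less than
     * out_size for the whole string to have fit. */
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

For example, "ssl_" plus "42" is six characters and needs a seven-byte buffer; a six-byte buffer is reported as too small rather than overflowed.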
Fix for the underlying issue is coming (including "mbedtls/config.h" directly will include the correct configuration.)
Many thanks!
(Also posted on esp32.com but not approved yet so cross-posted here)
I'm trying to use mbedtls in the ESP-IDF and have grabbed the mbedtls server sample program from here. It largely maps fine into an ESP32 model. The one major change I tried to do was to allocate mbedtls_ssl_context on the heap (I need this to use mbedtls in a larger program I'm building), and this seems to blow up the program when it tries to parse certs. I've tried this change on the sample and run on macOS; works fine.
Change:
mbedtls_ssl_context ssl;
becomes
mbedtls_ssl_context *ssl = malloc(sizeof(mbedtls_ssl_context));
(and all the uses of &ssl become ssl)
Logs from ESP32:
The call that causes things to go bang is:
ret = mbedtls_x509_crt_parse( &srvcert, (const unsigned char *) cacert_pem_start, cacert_pem_bytes );
which doesn't even use the allocated struct.
After a stop in x509_crt.c: 1015, we get to:
then:
and further into the system.