Closed leokeba closed 7 months ago
@leokeba you could try to comment out the line 80 in DownloadFirmwareService.cpp
. I suspect, that this is causing the actual crash. You'll get a more verbose error message on the serial log if it doesn't crash with SSE.
@leokeba can you please post some more details? Which ESP are you using? I just tested it with an ESP32-S3 and it validated all 48 certificates. To me it looks like somehow the certificate creation process produced garbage. Can you please check your build log for these lines?
SSL Cert Store: Certificate bundle downloaded to: ssl_certs\cacert.pem
SSL Cert Store: Parsing certificates from ssl_certs\cacert.pem
SSL Cert Store: Successfully added 48 certificates
SSL Cert Store: Successfully added 48 certificates in total
SSL Cert Store: Successfully created src\certs\x509_crt_bundle.bin
For me the OTA process works through the complete SSL part flawless. It receives the redirect from github and starts to download the binary. You can try to just use the single certificate used by github, instead of the bundle.
In my test SSL increased the heap by ~70kb. But still enough headroom.
However, it crashes when using SSE to notify the frontend about the progress, too. But this part is due for a rewrite anyway #29.
I'm using a good old esp32-dev "WROOM" module. Here's the relevant build log part :
SSL Cert Store: Certificate bundle downloaded to: ssl_certs/cacert.pem
SSL Cert Store: Parsing certificates from ssl_certs/cacert.pem
SSL Cert Store: Successfully added 48 certificates
SSL Cert Store: Successfully added 48 certificates in total
SSL Cert Store: Successfully created src/certs/x509_crt_bundle.bin
CONFIGURATION: https://docs.platformio.org/page/boards/espressif32/esp32dev.html
PLATFORM: Espressif 32 (6.4.0) > Espressif ESP32 Dev Module
HARDWARE: ESP32 240MHz, 320KB RAM, 4MB Flash
You are right about the crash, I commented out line 80 and it doesn't crash anymore. Here's the relevant serial output :
[ 39947][D][HTTPClient.cpp:598] sendRequest(): request type: 'GET' redirCount: 0
[ 39959][V][ssl_client.cpp:62] start_ssl_client(): Free internal heap before TLS 88152
[ 39960][V][ssl_client.cpp:68] start_ssl_client(): Starting socket
[ 39985][V][ssl_client.cpp:146] start_ssl_client(): Seeding the random number generator
[ 39987][V][ssl_client.cpp:155] start_ssl_client(): Setting up the SSL/TLS structure...
[ 39992][V][ssl_client.cpp:190] start_ssl_client(): Attaching root CA cert bundle
[ 39997][V][ssl_client.cpp:254] start_ssl_client(): Setting hostname for TLS session...
[ 40005][V][ssl_client.cpp:269] start_ssl_client(): Performing the SSL/TLS handshake...
[ 40969][D][esp_crt_bundle.c:108] esp_crt_verify_callback(): 48 certificates in bundle
[ 40972][E][esp_crt_bundle.c:74] esp_crt_check_signature(): PK verify failed with error FFFFBD70
[ 40975][E][esp_crt_bundle.c:147] esp_crt_verify_callback(): Failed to verify certificate
[ 40988][E][ssl_client.cpp:37] _handle_error(): [start_ssl_client():273]: (-12288) X509 - A fatal error occurred, eg the chain is too long or the vrfy callback failed
[ 40997][E][WiFiClientSecure.cpp:144] connect(): start_ssl_client: -12288
[ 41007][V][ssl_client.cpp:321] stop_ssl_socket(): Cleaning SSL connection.
[ 41011][D][HTTPClient.cpp:1163] connect(): failed connect to github.com:443
[ 41017][W][HTTPClient.cpp:1483] returnError(): error(-1): connection refused
[ 41024][E][HTTPUpdate.cpp:234] handleUpdate(): HTTP error: connection refused
[ 41031][D][HTTPClient.cpp:408] disconnect(): tcp is closed
[ 41037][V][ssl_client.cpp:321] stop_ssl_socket(): Cleaning SSL connection.
[ 41044][E][DownloadFirmwareService.cpp:82] updateTask(): [Download OTA] HTTP Update failed with error (-1): HTTP error: connection refused
HTTP Update failed with error (-1): HTTP error: connection refused
[ 59591][I][PsychicHttpServer.cpp:261] closeCallback(): [psychic] Client disconnected 53
Doesn't seem to get much more verbose, would there be a way for me to validate if the certificate bundle I'm making is correct ?
You could try to change the bundle creation in platformio.ini to board_ssl_cert_source = folder.
Hi, I had time to investigate a bit more and made some progress. The main culprit looks more and more like a memory issue, the requirements specified by the Mbed TLS documentation most likely don't take into account lots of other things happening on the MCU at the same time.
After lots of trial and error, I was able to make it work once using the latest version of template project. I had to remove all unnecessary features (security, analytics...), which helped it pass all the TLS handshakes with github, but then it would still crash during download. Finally, I commented out all notificationEvents in DownloadFirmwareService.cpp, and I had to close the browser tab very quickly after triggering the download so there was no websocket left open or any HTTP request going through during the download.
The handshake errors I'm getting still seem quite random, but there are some tendencies. I get a clear memory allocation issue if there is less than 90kb of free heap available before triggering the download. Between 90 and 120kb I either get the same error as before, something about a BigInt or the memory allocation error. Above 120kb, most of the time it gets through everything but crashes during download.
I suspect that this behaviour could be related to #39, and as you said #29 could very well improve the situation as well. I will try again as soon as there is good progress on these fronts.
In the end, my goal here would be to find stable minimum requirements, and maybe implement a check triggering a warning or an error if those requirements are not met.
Have you tried: board_ssl_cert_source = folder
in platformio.ini?
Could maybe ease the memory demand if only one certificate needs to be loaded.
If it aborts during download it could be related to task priorities. Have you running anything in the loop()-function or any other task on core 1?
I'm using the base template from the current main branch for most of my testing around this issue so no loop task. I just change the repo url so I can get a firmware to download for the right board version. In this context, I was able to get it to work reliably by commenting out all _notificationEvents->send
calls in DownloadFirmwareService.cpp
.
On my personal project fork it took me a little more research, but after a few rabbit holes and dead ends I found out all I had to do was to upgrade platformio to the latest espressif32
platform release (6.6.0) and it suddenly worked. Maybe that's because it also somehow made appear almost 50kb of free heap, but still nice bonus if not. I still have to leave the notificationEvents
commented out though.
Changing board_ssl_cert_source
did not seem to have a significant effect in my testing by the way.
Good point. I'll fix the platform version in the next round. Seems I'm still on 6.4.0.
Anyway, problem is solved. The bug with the notification events will be address by #29 anyway, soon.
Hi, I've been trying to add OTA support to my project via github releases, but as soon as the TLS handshake starts I get a crash.
The error
FFFFBD70
seems to point to a memory issue like running out of heap space, but several facts make this a little suspicious. First, I have about 90kB of free heap available before the handshake, and the documentation suggest I need about 42kB in the worst-case scenario. Also, I once had an error clearly indicating I was running out of heap, and this is a different one. I also tried adding the flags suggested here to sdkconfig.defaults, but that did not have an effect.I will keep investigating this issue myself, but if anybody has a clue about where to look I'll take it.
Thanks.