Closed 0x0fe closed 1 year ago
And so, to make sure the problem actually comes from my sdkconfig, I made a test firmware with Arduino, where the sdkconfig has less optimisation, and sure enough, the OTA works there: it performed the OTA and rebooted.
Stack Space: 7260
Used PSRAM: 0
init_sd
Type: SDHC
Total: 7678MB
Used: 128MB
WiFi init
.
link up
IP: 192.168.1.81
fw rev 1.0.0
Stack Space: 6396
Used PSRAM: 29336
mac: 5443b26ed0a8
uuid: c3e11dd29084d45c09cfa75650b1778e
ts: 421686965
key: ibola5Xe6WKm3NX+wxmxJAzsb2oUQVDSO08gMXetDW4=
connecting mqtt
connected
sub to XXXXXXXX/5443b26ed0a8/XXXXXXX/DOWNLOAD/+
sub to XXXXXXXX/5443b26ed0a8/XXXXXXX/UPDATE
Starting OTA
Http Event On Connected
Http Event Header Sent
Http Event On Header, key=Content-Length, value=1319440
Http Event On Header, key=Content-Type, value=application/octet-stream
Http Event On Header, key=Server, value=Microsoft-HTTPAPI/2.0
Http Event On Header, key=api-supported-versions, value=1.0
Http Event On Header, key=Content-Disposition, value=attachment; filename=firmware.bin; filename*=UTF-8''firmware.bin
Http Event On Header, key=Strict-Transport-Security, value=max-age=15724800; includeSubDomains
Http Event On Header, key=Date, value=Wed, 14 Jun 2023 13:04:26 GMT
Http Event Disconnected
Http Event Disconnected
OTA success, rebooting
[ 27917][W][WiFiGeneric.cpp:1057] _eventCallback(): Reason: 8 - ASSOC_LEAVE
ets Jul 29 2019 12:21:46
I tried to unset:
# CONFIG_SPI_MASTER_ISR_IN_IRAM is not set
# CONFIG_SPI_SLAVE_ISR_IN_IRAM is not set
But it did not help. I should mention that I have the following flags set (among others):
CONFIG_WIFI_LWIP_ALLOCATION_FROM_SPIRAM_FIRST=y
CONFIG_COMPILER_OPTIMIZATION_SIZE=y
CONFIG_ESP32_ECO3_CACHE_LOCK_FIX=y
CONFIG_ESP32_REV_MIN_3=y
CONFIG_ESP32_REV_MIN=3
CONFIG_SPIRAM_USE_CAPS_ALLOC=y
CONFIG_SPIRAM_TRY_ALLOCATE_WIFI_LWIP=y
CONFIG_SPIRAM_ALLOW_BSS_SEG_EXTERNAL_MEMORY=y
CONFIG_SPIRAM_ALLOW_NOINIT_SEG_EXTERNAL_MEMORY=y
CONFIG_ESP_EVENT_POST_FROM_ISR=y
CONFIG_ESP_EVENT_POST_FROM_IRAM_ISR=y
CONFIG_ESP_SYSTEM_RTC_EXT_XTAL=y
CONFIG_FREERTOS_PLACE_FUNCTIONS_INTO_FLASH=y
CONFIG_FREERTOS_ENABLE_TASK_SNAPSHOT=y
CONFIG_FREERTOS_PLACE_SNAPSHOT_FUNS_INTO_FLASH=y
CONFIG_MBEDTLS_EXTERNAL_MEM_ALLOC=y
CONFIG_MBEDTLS_ASYMMETRIC_CONTENT_LEN=y
CONFIG_MBEDTLS_SSL_IN_CONTENT_LEN=16384
CONFIG_MBEDTLS_SSL_OUT_CONTENT_LEN=4096
CONFIG_MBEDTLS_DYNAMIC_BUFFER=y
CONFIG_MBEDTLS_DYNAMIC_FREE_PEER_CERT=y
I also use this, to force allocation in SPIRAM as much as possible:
heap_caps_malloc_extmem_enable(500);
When I changed the heap_caps_malloc_extmem_enable limit to 4096:
heap_caps_malloc_extmem_enable(4096);
The OTA succeeded. However, this is problematic, because setting the limit to 500 or 512 is what helped most to keep enough free IRAM to run A2DP sink, HTTPS, MQTTS and MP3 decoding/I2S together. With 4096 I think it won't be possible. Something should be changed in the HTTPS OTA so that it runs normally when the heap_caps_malloc_extmem_enable limit is lower than 4096.
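For readers following the 500 versus 4096 discussion, here is a minimal host-side sketch of what that limit changes. It is a toy model, not the ESP-IDF allocator: it assumes requests of up to `limit` bytes are tried in internal RAM first and larger ones in PSRAM first, and it ignores the fallback to the other region. The names `preferred_region`, `region_name` and `print_placement` and the request sizes are hypothetical, and the thread does not establish which allocation actually breaks the OTA.

```c
#include <stddef.h>
#include <stdio.h>

/* Toy model only: the region a plain malloc(size) would try first. */
typedef enum { REGION_INTERNAL, REGION_EXTERNAL } region_t;

/*
 * Assumption: requests of up to `limit` bytes prefer internal RAM and larger
 * ones prefer PSRAM. The real allocator can still fall back to the other
 * region when the preferred one is exhausted; that is not modelled here.
 */
region_t preferred_region(size_t size, size_t limit)
{
    return (size <= limit) ? REGION_INTERNAL : REGION_EXTERNAL;
}

const char *region_name(region_t region)
{
    return (region == REGION_INTERNAL) ? "internal" : "external";
}

/* Print the first-choice region of a few illustrative request sizes. */
void print_placement(size_t limit)
{
    /* Hypothetical request sizes, not measured from the firmware. */
    static const size_t sizes[] = { 256, 1024, 2048, 8192 };

    for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
        printf("limit=%zu size=%zu -> %s\n", limit, sizes[i],
               region_name(preferred_region(sizes[i], limit)));
    }
}
```

Calling `print_placement(500)` and then `print_placement(4096)` shows that, in this model, the 1024-byte and 2048-byte requests move from external to internal RAM when the limit is raised, which is the kind of shift that costs internal RAM elsewhere.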
It is not easy to handle so many memory-consuming features on an embedded system with limited RAM. Maybe you should consider disabling MQTT, MP3 or A2DP during OTA?
@chegewara Yes, it is close to impossible. Sadly we cannot release the memory of the A2DP sink: https://github.com/espressif/esp-idf/issues/11642 (I have to admit I don't understand the logic behind this).
"Maybe you should consider disabling MQTT, MP3 or A2DP during OTA?"
This is what I tried, to no avail. I suspect disconnecting MQTT does not release resources.
Besides, in this test I ended A2DP with the argument set to true (release resources), which could not be used in production because A2DP has to be restarted later. Anyway, it did not help.
Basically, in the normal life cycle of the device it will stream and play files from HTTPS while downloading others from HTTPS, or play back to I2S via A2DP sink, or send audio to another device (A2DP source), or perform OTA. MQTTS has to stay active at all times; it can, however, be disabled for specific actions, like OTA. I have yet to find the right way to free resources.
So far, with all the optimisations in the sdkconfig and the heap_caps_malloc_extmem_enable limit at 500, my memory usage was at 60% in the VS Code analysis, which is fine, but without all these optimisations it could not even build. I have yet to run the analysis with the limit set to 4096, but I can tell it builds at least.
mqtt.disconnect();
a2dp_sink.end(true);
Again, you don't need A2DP during and after OTA. The usual use case is to restart the ESP32 after OTA, so you can re-init A2DP after the reset. Also, I'm not saying you can or should disconnect MQTT, just that you should delete all the tasks you are running. Of course I am assuming you are running multiple tasks and that every task has allocated its own stack, which is usually around 3 kB. Maybe your MP3/I2S code is also allocating some buffers which are not needed during OTA. You should evaluate the code to see where you can free memory before OTA starts.
I have been evaluating the code for several days and have performed dozens of memory map analyses, actually. A2DP cannot free its resources; this is a flaw of the SDK and so far there is nothing I can do about it. If I free the resources it cannot be restarted; this is documented. Regarding the other tasks, MQTT does not run in a separate task; it runs in the idle task, which also performs other jobs like the adkeys handling and some other event-driven actions. The file download task, for example, is started when needed and stopped at the end of the download. The audio playback is currently in the idle task as well. So there is no task overhead. You know, simply using HTTPS and A2DP consumes most of the IRAM with the default sdkconfig settings. They are the biggest consumers by far.
N.B. Stopping A2DP with resource release and not being able to restart it is not a problem for the OTA, since it will reboot anyway, so I can do it here. But it is a problem for all other cases, where I really would like to free the A2DP resources when not in use, during HTTPS streaming and file download in particular.
@0x0fe OTA takes quite some memory, and it cannot be released without stopping the service; however, it needs to be running after the system has started in order to do the FW verification. In my case I can simply stop it at a specific event. Despite that, I was still facing the symptom you describe. My solution was to restart the ESP32 regularly when there is no user activity, so it does not affect anything in my case. Not sure if this would fit your case.
@idea--list This has been solved by keeping the heap_caps_malloc_extmem_enable limit at 4096 and stripping further things from the sdkconfig. I cannot "restart the system regularly" on a production firmware. Are you sure the OTA is working without releasing its resources after a system reset? If so, it is launched on its own, because on my side nothing starts any OTA function after a reset; the OTA is started manually when it is needed.
Thanks for reporting and sharing the updates, feel free to reopen.
Answers checklist.
IDF version.
V4.4.4
Operating System used.
Windows
How did you build your project?
VS Code IDE
If you are using Windows, please specify command line type.
None
Development Kit.
ESP32DWDR2
Power Supply used.
USB
What is the expected behavior?
Being able to perform the OTA.
What is the actual behavior?
Crashes
Steps to reproduce.
Start A2DP sink (no need to connect a device to it)
Start an MQTT secured connection
Start HttpsOTAUpdate
Debug Logs.
More Information.
So, to give some context, I use the ESP32D0WR2, which has 2MB of internal PSRAM, and quite a few options had to be adjusted in menuconfig to be able to run classic Bluetooth A2DP, WiFi, secure MQTT and the Helix MP3 decoder together. I strongly suspect that the OTA problem is caused by one of the options I modified to reduce IRAM usage. I did not push too far, but basically I set all the options that allow allocating in external SPIRAM whenever possible, enabled dynamic allocation for WiFi, and adjusted a few more minor things such as newlib nano. My sdkconfig is below.
Before I had to make all these changes the OTA was working, but it was not possible to run A2DP along with everything else. After the changes I am able to run everything with some IRAM left, but I get this crash when trying to perform OTA. So far the other functions work fine: secure MQTT works, the audio player from sd_mmc works, A2DP sink works.
I should also mention that I use SPI for NFC. It is not doing any transaction at the time the OTA starts, but the driver has been used earlier and is initialised. I mention this because I know it may have something to do with ISR calls; still, I don't understand why my cache would be disabled. Also, before starting the OTA I stop MQTT (mqtt.disconnect) and I stop the A2DP sink, releasing resources.