Closed djsomi closed 4 months ago
I had the same behavior, but could not reproduce it. I tried all kinds of things this morning, but could not pinpoint it yet. Maybe someone will find this helpful anyway:
1) The behavior occurred in 4.1.4 as well! It was not introduced recently: my post is 2 weeks old and I had 4.1.4 at the time.
2) The panel was fixed, i.e. put back in operation without the proxy, by a serial flash. Since I have no development NSPanel and use them for heating, I did not experiment further with a panel while it was -10°C outside...
3) Today, I used a development ESP32 board with a built-in serial USB chip in an attempt to reproduce the behavior, i.e. no response via Wi-Fi at all. It always worked as desired!
I used the following yaml file
substitutions:
  # Settings - Editable values
  device_name: "proxytest"
  wifi_ssid: !secret wifi_ssid
  wifi_password: !secret wifi_password
  nextion_update_url: "http://homeassistant.local:8123/local/nspanel_eu.tft" # Optional for `esp-idf` framework
  # Add-on configuration (if needed)
  heater_relay: "1" # Possible values: "1" or "2"
# Customization area
##### My customization - Start #####
##### My customization - End #####
# Core and optional configurations
packages:
  remote_package:
    url: https://github.com/Blackymas/NSPanel_HA_Blueprint
    ref: main
    files:
      - nspanel_esphome.yaml # Core package
      # Optional advanced and add-on configurations
      # - advanced/esphome/nspanel_esphome_advanced.yaml
      # - nspanel_esphome_addon_climate_cool.yaml
      - nspanel_esphome_addon_climate_heat.yaml
      # - nspanel_esphome_addon_climate_dual.yaml
    refresh: 300s
esp32:
  framework:
    type: esp-idf
# Enable Bluetooth proxy
bluetooth_proxy:
  active: true
# Set Wi-Fi power save mode to "LIGHT" as required for Bluetooth on ESP32
wifi:
  power_save_mode: LIGHT
  fast_connect: true
esp32_ble_tracker:
#bluetooth_proxy:
#  active: true
I tried with and without embedded climate (heat), proxy active: true or not, fast_connect: true or not. The obvious thing is the lack of a Nextion touchscreen. The flash is close to 90%.
RAM: [== ] 17.6% (used 57676 bytes from 327680 bytes)
Flash: [========= ] 89.6% (used 1643949 bytes from 1835008 bytes)
I am 98% sure that I was on esp-idf from the beginning, but since I only became familiar with the project 2.5 weeks ago, I cannot say that for sure. I used to compile in ESPHome in HA, which runs on a 4GB VM, but now use the standalone ESPHome on an M1 Mac. Today, I flashed via serial the first time and used OTA for every flash afterwards. However, the board is still connected to the desktop for power supply.
I could duplicate this when using BT and add-on climate simultaneously, and I agree this is most likely related to memory usage, as that was already an issue with arduino, even without BT, when using too much memory.
Base version | Framework | Add-ons | Customizations | RAM | Flash | Comments |
---|---|---|---|---|---|---|
v4.2.5dev | esp-idf | upload_tft removed | - | 9.5% | 52.9% | Working fine |
v4.2.5dev | esp-idf | - | - | 10.2% | 61.8% | Working fine |
v4.2.5dev | esp-idf | - | web_server | 10.2% | 63.6% | Working fine |
v4.2.5dev | arduino | - | - | 14.1% | 70.0% | Working fine |
v4.2.5 | arduino | - | web_server | 14.2% | 72.8% | Working fine |
v4.2.5dev | esp-idf | upload_tft removed | bluetooth_proxy | 16.9% | 79.0% | Working fine |
v4.2.5dev | esp-idf | climate_dual, upload_tft removed | bluetooth_proxy | 16.9% | 80.7% | Working fine |
v4.2.5dev | esp-idf | climate_dual, upload_tft removed | bluetooth_proxy, web_server | 16.9% | 83.6% | Working fine |
v4.2.5dev | esp-idf | - | bluetooth_proxy | 17.5% | 87.6% | Bricked |
v4.2.2 | esp-idf | - | bluetooth_proxy | 17.6% | 87.1% | Bricked |
v4.2.5dev | esp-idf | climate_dual | bluetooth_proxy | 17.6% | 89.3% | Bricked |
v4.2.5dev | esp-idf | climate_dual | bluetooth_proxy, web_server | 17.6% | 91.4% | Bricked |
v4.2.5 | arduino | - | bluetooth_proxy | 17.9% | 110.0% | Cannot build - Flash memory exceeded |
v4.2.5 | arduino | - | bluetooth_proxy, web_server | 17.9% | 110.9% | Cannot build - Flash memory exceeded |
I have to flash via serial all the devices that got bricked in the tests above, then I will run more tests, but I believe the option where upload_tft was removed could be a workaround. The downside is that you will have to remove bluetooth_proxy and restore upload_tft every time you need to transfer a TFT, then revert afterwards, but as you shouldn't be transferring TFT files every day, that could be a way to go.
While I also liked the idea of one device less, I am perfectly fine keeping another ESP32 as a dedicated BT proxy. In particular, people (including me ;) will ask for more and more features in this repository, so the 25% memory hog bluetooth_proxy will not fit at some point anyway. I would not mind if the proxy is moved to 'unsupported' and a warning paragraph is put in the docs.
That being said, my curiosity is triggered, why the ESP32 development board would not even fail esp-idf, climate_dual, bluetooth_proxy and web_server at the same time. Is there some Nextion 'overhead'?
That being said, my curiosity is triggered, why the ESP32 development board would not even fail esp-idf, climate_dual, bluetooth_proxy and web_server at the same time. Is there some Nextion 'overhead'?
I have no idea. 😞
Sure, it's absolutely true that separating the BT proxy onto a dedicated device is better, but I had no issues until 4.2.2.
I've compiled with v4.2.2 and got this:
RAM: [== ] 17.6% (used 57604 bytes from 327680 bytes)
Flash: [========= ] 87.5% (used 1604725 bytes from 1835008 bytes)
I haven't flashed yet, but it is on the boundary between the ones working and the ones failing in the table above.
On my test with v4.2.2 it got bricked.
I believe this is at the limit anyway. Maybe you have been using a couple of bytes below the limit, so it worked.
Anyway, we are too close to that limit. I can probably find some way to save a few bytes here and there, but as soon as we do something new, it will break again. I don't want to impose this limit on our development; instead I would keep BT as a customization that isn't fully supported, as it has been from the beginning. We have a workaround anyway, and we can try to put more of the new features in separate packages, so one could always remove what they don't use.
I might have found a starting point to explain the behavior. OTA updates allow only half the size of the flash, i.e. 4MB/2 = 2MB in our case (the NSPanel has 4MB flash). This seems to be a safety net to never end up with an unbootable device, even in case of a power outage. Also, the arduino and esp-idf frameworks have different partition tables, which is why we should use the serial flash after switching between the two. Also, there is some overhead, so less than 2MB is available.
Now, the interesting part: It is possible to successfully compile with esphome, but the size does not match the respective partition table and the compile and flash will seem to succeed, but the device will not boot. Although the info is 2y old, it would explain what @edwardtfn found, because 1638400/1835008 from the link equals 89%. If someone would know how to look up the exact partition table for esp-idf and arduino flashes in esphome, maybe it would be a perfect match. I quote from the last link
it will only cause issues in rare cases where you're just between the arduino size limit and the esp idf limit. As described on discord, the fw will work just fine even with a different partition table.
Unfortunately, it is not possible to manually intervene and to give
esp32:
  framework:
    type: esp-idf
  flash_size: "3.6MB"
to indicate the limit. It might also explain slight differences for the dev board and the nspanel and/or the different behavior when using different flashing tools?!
Just here to drop in my 2 cents. Updating from 4.2.2 to 4.2.4.
I'm running a pretty standard install. My custom options are:
switch:
  - id: !extend relay_1
    restore_mode: ALWAYS_ON
  - id: !extend relay_2
    restore_mode: ALWAYS_OFF
bluetooth_proxy:
  active: true
  cache_services: true
esp32_ble_tracker:
wifi:
  power_save_mode: LIGHT
When trying to update TFT I received the following error:
[01:50:07][D][esp-idf:000]: E (653326) esp-tls-mbedtls: mbedtls_ssl_setup returned -0x7F00
[01:50:07][D][esp-idf:000]: E (653329) esp-tls: create_ssl_handle failed
[01:50:07][D][esp-idf:000]: E (653331) esp-tls: Failed to open new connection
[01:50:07][D][esp-idf:000]: E (653333) TRANSPORT_BASE: Failed to open a new connection
[01:50:07][D][esp-idf:000]: E (653339) HTTP_CLIENT: Connection failed, sock < 0
[01:50:07][E][nextion.upload.idf:174]: HTTP request failed: ESP_ERR_HTTP_CONNECT
Presumably ESP-IDF running out of memory and failing to create the relevant HTTPS connection.
I then tried flashing locally via HTTP (insecure) from my local instance. It died at 94.8% every single time.
I thought it might be a TFT issue so I flashed NSPanel_Blank which crashed immediately.
During all this ESPHome side was still responsive and functional.
I removed the following config:
bluetooth_proxy:
  active: true
  cache_services: true
esp32_ble_tracker:
Now flashing US TFT via HTTPS from github works fine (started from 0% rather than 80% or whatever).
Have restored those removed settings and everything boots up normally.
So that confirms the most likely hypothesis that this is a memory issue.
Yes, that was my experience also. I was NOT able to use TFT upload together with the BT proxy, but at least the BT proxy worked.
Now, the interesting part: It is possible to successfully compile with esphome, but the size does not match the respective partition table and the compile and flash will seem to succeed, but the device will not boot. Although the info is 2y old, it would explain what @edwardtfn found, because 1638400/1835008 from the link equals 89%. If someone would know how to look up the exact partition table for esp-idf and arduino flashes in esphome, maybe it would be a perfect match. I quote from the last link
Here's my stats of my build with all the bells and whistles enabled:
Linking .pioenvs/mbr-nspanel/firmware.elf
RAM: [== ] 17.6% (used 57588 bytes from 327680 bytes)
Flash: [========= ] 91.6% (used 1681641 bytes from 1835008 bytes)
Building .pioenvs/mbr-nspanel/firmware.bin
Can confirm it boots and functions properly without any issues.
Edit: by all the bells and whistles I mean with my customisations above and using ESP-IDF
So that confirms the most likely hypothesis that this is a memory issue.
I wouldn't be 100% sure of that based on your settings. The ESP32 shares the same radio between BT and Wi-Fi, so that could be an issue when trying to transfer a TFT while BT is enabled. About memory: we fetch a 4 KB chunk of data from the HTTP server, disconnect, transfer that to the Nextion, clean up the memory, then repeat the process for the following 4 KB. It is a bit different with Arduino, which uses bigger chunks and persistent connections, but that was causing some issues, so we decided to work with short-lived connections on esp-idf. We might have a memory leak in the TFT transfer. Maybe improving the logs will give us better info... But BT bricking the device while not transferring is something else. I'm not saying it's not related to memory, it probably is, but an issue with the TFT transfer doesn't necessarily prove that memory is the cause.
In the end, it's the amount of code that makes the big difference. And now we know we are limited to something not far from 80% of the available memory reported by the ESPHome compiler. 😩
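The chunked TFT transfer described above can be sketched roughly as follows. This is only an illustration in Python, not the project's actual implementation: `transfer_in_chunks`, `fetch_range` and `send_to_display` are hypothetical names, and the "server" and "display" are in-memory stand-ins.

```python
from typing import Callable

CHUNK_SIZE = 4096  # 4 KB per request, as described above


def transfer_in_chunks(
    total_size: int,
    fetch_range: Callable[[int, int], bytes],  # hypothetical: returns bytes [start, end)
    send_to_display: Callable[[bytes], None],  # hypothetical: forwards one chunk
    chunk_size: int = CHUNK_SIZE,
) -> int:
    """Move total_size bytes one chunk at a time; return the number of chunks sent."""
    offset = 0
    chunks = 0
    while offset < total_size:
        end = min(offset + chunk_size, total_size)
        data = fetch_range(offset, end)  # one short-lived request per chunk
        if len(data) != end - offset:
            raise IOError(f"short read at offset {offset}")
        send_to_display(data)
        offset = end
        chunks += 1
    return chunks


# Stand-ins for a TFT file and a display, both kept in memory for the demo.
tft_blob = bytes(range(256)) * 40  # 10240 bytes
received = bytearray()
sent = transfer_in_chunks(len(tft_blob), lambda s, e: tft_blob[s:e], received.extend)
print(sent, bytes(received) == tft_blob)
```

Only one chunk is held at a time, which is the point of the design: peak memory stays near the chunk size regardless of how large the TFT file is.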
@illuzn I cannot follow which is which: 91.6% Flash is with or without proxy.
It seems that there are at least 3 (common) options for ota partitions: 0x1c0000 (which esphome uses to calculate the %), 0x1b0000 and 0x190000, which I suspect to be the limit. Maybe it is only wishful thinking, but changing the partition table to host 0x1c0000 bytes for each ota partition would be great...
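To make the three candidate sizes easier to compare, here is a small Python sketch that converts them to bytes and expresses one build size from this thread (1681641 bytes) as a percentage of each. The helper name `usage_percent` is just for illustration.

```python
# Candidate OTA app-partition sizes mentioned above.
CANDIDATE_SIZES = [0x1C0000, 0x1B0000, 0x190000]


def usage_percent(used_bytes: int, partition_bytes: int) -> float:
    """Return the firmware size as a percentage of the partition size."""
    return 100.0 * used_bytes / partition_bytes


used = 1681641  # firmware size of one build reported earlier in this thread
for size in CANDIDATE_SIZES:
    pct = usage_percent(used, size)
    verdict = "fits" if used <= size else "too large"
    print(f"{size:#x} = {size} bytes ({size // 1024}K): {pct:.1f}% -> {verdict}")
```

The same build that shows as about 91.6% against 0x1c0000 would be over 100% against 0x190000.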
esphome changed the assumed size in this commit, which is why old screenshots of esphome compiles show 1638400 (0x190000) bytes to calculate the %. I found (against my first claim) that this can be adjusted in esphome
partitions (Optional, filename): The name of (optionally including the path to) the file containing the partitioning scheme to be used. When not specified, partitions are automatically generated based on flash_size.
So, the correct file corresponding to the NSpanel partition table would show an error after the compile and (hopefully) not allow the flash that would soft brick the device.
Disclaimer: Do not upload the compiled file unless you are trusting me more than I do!
I was indeed able to edit a partition table and feed it to esphome, which subsequently bases its calculation on it. I used the attached yaml and partitions.csv files and got (after compilation, manual download)
Compiling .pioenvs/andydevpanel/src/main.o
Linking .pioenvs/andydevpanel/firmware.elf
RAM: [== ] 17.6% (used 57740 bytes from 327680 bytes)
Error: The program size (1712897 bytes) is greater than maximum allowed (1638400 bytes)
Flash: [==========] 104.5% (used 1712897 bytes from 1638400 bytes)
*** [checkprogsize] Explicit exit, status 1
========================= [FAILED] Took 26.75 seconds =========================
A compile error is already much better than an unsuccessful flash and soft-brick. However, maybe getting a hand on the remaining 0x30000 bytes for the project would be even better. I will look into it and try to flash a development esp32 before I dare to flash a panel. I do not know if hard-bricking is possible.
edit: I do not think what I said is correct. Serial flash also writes the new partition table (my understanding). Messing with it might be harmful and I deleted the reference to what I did.
Nice!
What about the flash size you've shown earlier?
esp32:
  framework:
    type: esp-idf
  flash_size: "3.6MB"
On the docs it says:
flash_size (Optional, string): The amount of flash memory available on the ESP32 board/module. One of 2MB, 4MB, 8MB, 16MB or 32MB. Defaults to 4MB. Warning: specifying a size larger than that available on your board will cause the ESP32 to fail to boot.
Has it worked for you with 3.6MB?
You have to use the csv file whose contents I linked and
I utilized
esp32:
  framework:
    type: esp-idf
  partitions: "partitions.csv"
with an edited partitions.csv file.
Please do not flash it, yet, unless you are sure I know what I am doing...
Edit: Directly the file for clarity [partitions.csv]
Do not listen to me! Hard bricking is possible. Now I can still flash, but afterwards the device is dead...
edit: deleted the reference to the csv file.
Today, I got a replacement panel and read my previous post before adding more info. However, I realized that it sounds alarming. To clarify: I am now 90% sure that any weird partition table is simply removed via a subsequent serial flash. However, I am 100% sure that connecting a 5V power supply at the 8 pin panel connector while forgetting to remove the 3.3V power supply used for flashing 'hard-bricks' the device. My stupidity fried the nspanel and not any uploading I did! The ESP32 even still works...That being said, I marked (strikethrough) some lines in my previous posts that turned out to be incorrect.
I looked at the partition table schemes for the arduino and esp-idf frameworks and attached them. Please note that they both have the same amount of space for the OTA flashing: 1792K, which translates to 1792*1024=1835008=0x1c0000. This is the number esphome uses right now and I would say this is correct in either case.
1) It would be nice to have the compile show an error message instead of a soft-bricked device that requires a serial flash. I did not succeed in finding a solution.
(a) My solution promoted above with a modified partition table (supposedly) works as long as only OTA flashing is utilized: Flash: [==========] 104.5% (used 1712897 bytes from 1638400 bytes)
But during a serial flash, the partition table would be transferred onto the nspanel. That seems like a rather inconsistent hack.
(b) I found another way to limit the binary size
esphome:
  name: testcompile
  friendly_name: testcompile
  platformio_options:
    board_upload.maximum_size: 1
but for some reason this is ignored(?!) by esphome: Flash: [==== ] 43.6% (used 799509 bytes from 1835008 bytes)
2) It would be even nicer to find the underlying reason for the soft-bricking. Then, increasing the size from 0x1c0000 to even larger binary sizes should be feasible. The original firmware utilizes 0x1f0000 while still maintaining two OTA partitions, but with smaller nvs storage (64K vs 436K). Related: I could never soft-brick my ESP32 dev board even with the largest binaries (<0x1c0000) from the table. Therefore, I tried to write the compiled yaml to the dev board utilizing the original (i.e. Sonoff) partition table, but so far to no avail :(
arduino
Name | Type | SubType | Offset | Size | Flags |
---|---|---|---|---|---|
nvs | data | nvs | 0x9000 | 20K | |
otadata | data | ota | 0xe000 | 8K | |
app0 | app | ota_0 | 0x10000 | 1792K | |
app1 | app | ota_1 | 0x1d0000 | 1792K | |
eeprom | data | 153 | 0x390000 | 4K | |
spiffs | data | spiffs | 0x391000 | 60K |
esp-idf
Name | Type | SubType | Offset | Size | Flags |
---|---|---|---|---|---|
otadata | data | ota | 0x9000 | 8K | |
phy_init | data | phy | 0xb000 | 4K | |
app0 | app | ota_0 | 0x10000 | 1792K | |
app1 | app | ota_1 | 0x1d0000 | 1792K | |
nvs | data | nvs | 0x390000 | 436K |
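As a sanity check on the two tables above, this Python sketch verifies that no partitions overlap and that everything fits in a 4MB flash. The offsets and sizes are copied from the tables; `check_layout` is a hypothetical helper written for this post and is not part of esphome or esptool.

```python
def kib(n: int) -> int:
    """Convert a size in K (as written in the tables) to bytes."""
    return n * 1024


# (name, offset, size in bytes), copied from the two tables above.
ARDUINO = [
    ("nvs", 0x9000, kib(20)),
    ("otadata", 0xE000, kib(8)),
    ("app0", 0x10000, kib(1792)),
    ("app1", 0x1D0000, kib(1792)),
    ("eeprom", 0x390000, kib(4)),
    ("spiffs", 0x391000, kib(60)),
]
ESP_IDF = [
    ("otadata", 0x9000, kib(8)),
    ("phy_init", 0xB000, kib(4)),
    ("app0", 0x10000, kib(1792)),
    ("app1", 0x1D0000, kib(1792)),
    ("nvs", 0x390000, kib(436)),
]
FLASH_SIZE = 4 * 1024 * 1024  # the NSPanel has 4MB flash


def check_layout(table, flash_size=FLASH_SIZE):
    """Return a list of problems: overlapping partitions or data past the end of flash."""
    problems = []
    ordered = sorted(table, key=lambda p: p[1])
    for (name, off, size), (nxt, nxt_off, _) in zip(ordered, ordered[1:]):
        if off + size > nxt_off:
            problems.append(f"{name} overlaps {nxt}")
    last_name, last_off, last_size = ordered[-1]
    if last_off + last_size > flash_size:
        problems.append(f"{last_name} extends past the end of flash")
    return problems


for label, table in (("arduino", ARDUINO), ("esp-idf", ESP_IDF)):
    print(label, check_layout(table) or "layout is consistent")
```

Both tables pass this check, and in both the app slots are 0x1c0000 bytes, so the layouts themselves do not explain the difference in behavior.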
Okay, this might sound crazy, but... this has now happened to me, even though I could flash a no-BT ROM fine and then flash the BT ROM afterwards.
So only 2 things have changed in the interim:
I do not think it is crazy at all. I disregarded some of my hypotheses, because nobody did seem to have your (1), yet. Imagine something writing over their supposed chunk of memory into a partition of something else. It could corrupt the app0 partition, but not the app1 partition, because it always writes at the same, but incorrect position in flash memory. Since the OTA flashes alternate between app0 and app1,
The OTA operation functions write a new app firmware image to whichever OTA app slot that is currently not selected for booting. Once the image is verified, the OTA Data partition is updated to specify that this image should be used for the next boot.
it would be really hard to track.
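To illustrate why such a fault would be hard to track, here is a toy Python simulation of the alternating OTA slots. It is purely hypothetical: it assumes one slot is damaged, that every OTA targets the currently inactive slot, and it ignores recovery by serial flash. The function name `simulate_ota` is made up for this sketch.

```python
def simulate_ota(flashes: int, corrupted_slot: str, start_slot: str = "app0"):
    """Return (target slot, boots?) for each OTA flash in sequence.

    Simplification: the active slot toggles after every flash, and a flash
    "boots" only if it did not land in the hypothetically corrupted slot.
    """
    slots = ("app0", "app1")
    active = start_slot
    results = []
    for _ in range(flashes):
        # OTA writes to whichever slot is currently not selected for booting.
        target = slots[1] if active == slots[0] else slots[0]
        results.append((target, target != corrupted_slot))
        active = target
    return results


for target, boots in simulate_ota(4, corrupted_slot="app1"):
    print(target, "boots" if boots else "fails to boot")
```

Under these assumptions every second flash fails, regardless of which firmware was built, which would look like random behavior from the outside.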
@illuzn If you are really motivated, you could try the following:
Test a)
Test b)
Either (a) or (b) might always work, even when you keep alternating, and the other will always fail at the same step number, i.e. a2 or b3.
Sorry if the following is a random stream of my thoughts... I kind of wrote it as I was experimenting.
Using your procedure, both fail.
Test A: Step 2. Completes successfully but no boot. Aborted at this step.
Test B: Step 2. Completes successfully but no boot. Handily, I still had it plugged into my FTDI flasher and picked up the boot logs. The specific error that is thrown is:
ets Jul 29 2019 12:21:46
rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:6608
load:0x40078000,len:15060
ho 0 tail 12 room 4
load:0x40080400,len:3816
entry 0x40080698
I (29) boot: ESP-IDF 4.4.5 2nd stage bootloader
I (29) boot: compile time 01:22:54
I (29) boot: chip revision: v3.0
I (32) boot.esp32: SPI Speed : 40MHz
I (37) boot.esp32: SPI Mode : DIO
I (41) boot.esp32: SPI Flash Size : 4MB
I (46) boot: Enabling RNG early entropy source...
I (51) boot: Partition Table:
I (55) boot: ## Label Usage Type ST Offset Length
I (62) boot: 0 otadata OTA data 01 00 00009000 00002000
I (69) boot: 1 phy_init RF data 01 01 0000b000 00001000
I (77) boot: 2 app0 OTA app 00 10 00010000 001c0000
I (84) boot: 3 app1 OTA app 00 11 001d0000 001c0000
I (92) boot: 4 nvs WiFi data 01 02 00390000 0006d000
I (99) boot: End of partition table
I (104) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=4489ch (280732) map
I (214) esp_image: segment 1: paddr=000548c4 vaddr=3ffb0000 size=04c64h ( 19556) load
I (222) esp_image: segment 2: paddr=00059530 vaddr=40080000 size=06ae8h ( 27368) load
I (233) esp_image: segment 3: paddr=00060020 vaddr=400d0020 size=13547ch (1266812) map
I (692) esp_image: segment 4: paddr=001954a4 vaddr=40086ae8 size=16a94h ( 92820) load
I (744) boot: Loaded app from partition at offset 0x10000
I (744) boot: Disabling RNG early entropy source...
I (756) cpu_start: Pro cpu up.
I (756) cpu_start: Starting app cpu, entry point is 0x4008249c
I (742) cpu_start: App cpu up.
I (773) cpu_start: Pro cpu start user code
I (773) cpu_start: cpu freq: 160000000
I (773) cpu_start: Application information:
I (777) cpu_start: Project name: mbr-nspanel
I (782) cpu_start: App version: 2023.12.7
I (788) cpu_start: Compile time: Jan 22 2024 10:12:24
I (794) cpu_start: ELF file SHA256: 1e4c0978122d46d1...
I (800) cpu_start: ESP-IDF: 4.4.5
I (804) cpu_start: Min chip rev: v0.0
I (809) cpu_start: Max chip rev: v3.99
I (814) cpu_start: Chip rev: v3.0
assert failed: s_prepare_reserved_regions memory_layout_utils.c:100 (reserved[i + 1].start > reserved[i].start)
Backtrace: 0x40082d26:0x3ffe3390 0x40091815:0x3ffe33b0 0x400979e5:0x3ffe33d0 0x4016d4f2:0x3ffe34f0 0x4016d112:0x3ffe3850 0x4016afcb:0x3ffe3c00 0x40082759:0x3ffe3c40 0x4007959c:0x3ffe3c80 |<-CORRUPTED
I assume that trace is not useful without my firmware file because the offsets will be different for everyone.
When it is successful (i.e. without BT) the RAM is allocated in this way:
I (645) heap_init: Initializing. RAM available for dynamic allocation:
I (652) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (658) heap_init: At 3FFB8270 len 00027D90 (159 KiB): DRAM
I (664) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (671) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (677) heap_init: At 40094458 len 0000BBA8 (46 KiB): IRAM
Now this is where it gets wild. I reinstalled esphome addon in HA just for kicks because that's the only other thing I've changed. Flashing OTA with BT boots!
Here is the startup log - notice the huge difference with the RAM allocation (this was the same every single time I was doing it from my docker container install of esphome).
rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:6608
load:0x40078000,len:15060
ho 0 tail 12 room 4
load:0x40080400,len:3816
entry 0x40080698
I (29) boot: ESP-IDF 4.4.5 2nd stage bootloader
I (29) boot: compile time 01:22:54
I (29) boot: chip revision: v3.0
I (32) boot.esp32: SPI Speed : 40MHz
I (37) boot.esp32: SPI Mode : DIO
I (41) boot.esp32: SPI Flash Size : 4MB
I (46) boot: Enabling RNG early entropy source...
I (51) boot: Partition Table:
I (55) boot: ## Label Usage Type ST Offset Length
I (62) boot: 0 otadata OTA data 01 00 00009000 00002000
I (70) boot: 1 phy_init RF data 01 01 0000b000 00001000
I (77) boot: 2 app0 OTA app 00 10 00010000 001c0000
I (85) boot: 3 app1 OTA app 00 11 001d0000 001c0000
I (92) boot: 4 nvs WiFi data 01 02 00390000 0006d000
I (100) boot: End of partition table
I (104) esp_image: segment 0: paddr=001d0020 vaddr=3f400020 size=4489ch (280732) map
I (214) esp_image: segment 1: paddr=002148c4 vaddr=3ffbdb60 size=04c64h ( 19556) load
I (222) esp_image: segment 2: paddr=00219530 vaddr=40080000 size=06ae8h ( 27368) load
I (233) esp_image: segment 3: paddr=00220020 vaddr=400d0020 size=13547ch (1266812) map
I (692) esp_image: segment 4: paddr=003554a4 vaddr=40086ae8 size=16a94h ( 92820) load
I (745) boot: Loaded app from partition at offset 0x1d0000
I (745) boot: Disabling RNG early entropy source...
I (756) cpu_start: Pro cpu up.
I (757) cpu_start: Starting app cpu, entry point is 0x4008249c
I (0) cpu_start: App cpu up.
I (773) cpu_start: Pro cpu start user code
I (773) cpu_start: cpu freq: 160000000
I (773) cpu_start: Application information:
I (777) cpu_start: Project name: mbr-nspanel
I (783) cpu_start: App version: 2023.12.8
I (788) cpu_start: Compile time: Jan 22 2024 10:49:37
I (794) cpu_start: ELF file SHA256: 5d415d0236b85c1c...
I (800) cpu_start: ESP-IDF: 4.4.5
I (805) cpu_start: Min chip rev: v0.0
I (809) cpu_start: Max chip rev: v3.99
I (814) cpu_start: Chip rev: v3.0
I (819) heap_init: Initializing. RAM available for dynamic allocation:
I (826) heap_init: At 3FFAFF10 len 000000F0 (0 KiB): DRAM
I (832) heap_init: At 3FFB6388 len 00001C78 (7 KiB): DRAM
I (838) heap_init: At 3FFB9A20 len 00004108 (16 KiB): DRAM
I (844) heap_init: At 3FFCBC58 len 000143A8 (80 KiB): DRAM
I (850) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (857) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (863) heap_init: At 4009D57C len 00002A84 (10 KiB): IRAM
I (871) spi_flash: detected chip: generic
I (874) spi_flash: flash io: dio
I (880) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.
I know nothing about c so I can't even begin to diagnose this issue or try to figure out what is going on.
Edit 1: This has been eating away at me and I think I've cracked part of the puzzle. With no BT enabled, there is 338KiB total RAM (added using the hex lens to be precise). With BT enabled, there is 241KiB total RAM. So this explains why I was having issues updating TFT over HTTPS from github. The device has ~100KiB less total RAM to work with.
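The totals can be reproduced by summing the hex `len` fields of the heap_init lines. A minimal Python sketch, with the log lines copied from the two boot logs above (`total_heap_bytes` is just a name chosen for this example):

```python
import re

# Matches e.g. "heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM"
HEAP_LINE = re.compile(r"heap_init: At [0-9A-Fa-f]+ len ([0-9A-Fa-f]+) \(")

WITHOUT_BT = """
I (652) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (658) heap_init: At 3FFB8270 len 00027D90 (159 KiB): DRAM
I (664) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (671) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (677) heap_init: At 40094458 len 0000BBA8 (46 KiB): IRAM
"""

WITH_BT = """
I (826) heap_init: At 3FFAFF10 len 000000F0 (0 KiB): DRAM
I (832) heap_init: At 3FFB6388 len 00001C78 (7 KiB): DRAM
I (838) heap_init: At 3FFB9A20 len 00004108 (16 KiB): DRAM
I (844) heap_init: At 3FFCBC58 len 000143A8 (80 KiB): DRAM
I (850) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (857) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (863) heap_init: At 4009D57C len 00002A84 (10 KiB): IRAM
"""


def total_heap_bytes(log: str) -> int:
    """Sum the hex 'len' field of every heap_init region in a boot log."""
    return sum(int(m.group(1), 16) for m in HEAP_LINE.finditer(log))


for label, log in (("without BT", WITHOUT_BT), ("with BT", WITH_BT)):
    total = total_heap_bytes(log)
    print(f"{label}: {total} bytes ({total / 1024:.1f} KiB)")
```

This gives 346600 bytes without BT and 246828 bytes with BT, a difference of 99772 bytes, which matches the roughly 338 KiB and 241 KiB figures.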
Edit 2: New discovery: I was running ESPHome Beta 2023.12.8 in my HA instance (I set it up 2 years ago when ESP-IDF was barely supported) and ESPHome 2023.12.8 in my docker container. Something in the beta's changes fixes the bricking problem. I noticed this because my docker container kept trying to update my NSPanel.
Edit 3: Not sure if I'm reading this correctly or not but:
I (857) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
2023.12.8 Beta has RAM starting at 0x3FFE4350 and ending at 0x40000000
I (863) heap_init: At 4009D57C
2023.12.8 Beta then jumps to 0x4009D57C i.e. it skips the range 0x40000001 through 0x4009D57B.
0x40082d26:0x3ffe3390 0x40091815:0x3ffe33b0 0x400979e5:0x3ffe33d0 0x4016d4f2:0x3ffe34f0 0x4016d112:0x3ffe3850 0x4016afcb:0x3ffe3c00 0x40082759:0x3ffe3c40 0x4007959c:0x3ffe3c80 |<-CORRUPTED
Most of the code addresses in that backtrace (five of the eight) fall within that skipped range (presumably in use by something else). Hence the corrupted RAM and failure to boot. Looks like an upstream ESPHome issue and nothing to do with this device.
I've had enough of flashing my NSPanel for this month and I don't know c code. But if I had to guess there's an issue with the way its allocating RAM.
Temporarily you can use this to flash (OTA is fine) without TFT transfer but with BT. When you have to transfer TFT, you revert it, transfer then use this again:
substitutions:
  ###### CHANGE ME START ######
  device_name: "nspanelworkroom"
  wifi_ssid: !secret wifi_ssid
  wifi_password: !secret wifi_password
  nextion_update_url: "http://homeassistant.local:8123/local/nspanel_eu.tft"
  nextion_blank_url: "http://homeassistant.local:8123/local/nspanel_blank.tft"
  ##### addon-configuration #####
  ## addon_climate ##
  # addon_climate_heater_relay: "1" # possible values: 1/2
  ##### CHANGE ME END #####
packages:
  remote_package:
    url: https://github.com/Blackymas/NSPanel_HA_Blueprint
    ref: main
    files:
      # - nspanel_esphome.yaml # Base package
      - advanced/esphome/nspanel_esphome_core.yaml # Core without TFT upload engine
      # - advanced/esphome/nspanel_esphome_advanced.yaml # activate advanced (legacy) elements - can be useful for troubleshooting
      # - nspanel_esphome_addon_climate_cool.yaml # activate for local climate (cooling) control
      # - nspanel_esphome_addon_climate_heat.yaml # activate for local climate (heater) control
    refresh: 1s
esp32:
  framework:
    type: esp-idf
##### My customization - Start #####
bluetooth_proxy:
  active: true
wifi:
  power_save_mode: LIGHT
##### My customization - End #####
Yes, that's basically what I've been doing.
The rabbit hole in this thread is trying to figure out why devices are getting soft-bricked (needing serial flash) - answer something upstream in ESPHome between 2023.12.8 (not working) and 2023.12.8 beta (working).
Out of curiosity, does your config work for bluetooth proxy? According to the docs:
The Bluetooth proxy depends on ESP32 Bluetooth Low Energy Tracker Hub so make sure to add that to your configuration.
So it shouldn't work?
Edit 1: Whoa... just saw this on ESPHome:
The first time this component is enabled for an ESP32, the code partition needs to be resized. Please flash the ESP32 via USB when adding this to your configuration. After that, you can use OTA updates again.
Is this the issue we've been running into all along (and nothing to do with RAM allocation or anything else I said above)?
Edit 2: Nope... flashing via serial or ESPTool doesn't make a single difference. The docs are just wrong. The only thing that fixes this issue is using 2023.12.8 Beta, which for some reason correctly allocates the RAM (albeit much less so HTTPS will not work).
I'm done! My NSPanel is back on the wall and I'm not touching it again.
@illuzn figured it out! I can confirm his last post. So many red herrings on the way...
TL;DR: esphome-2024.1.0.dev0 fixes the issue. Wait for next esphome release...
I installed (in sequence)
1) esphome 2023.12.5 and it worked
2) esphome 2023.12.8 does not
3) esphome 2023.12.5 worked
4) esphome 2023.12.6 worked
5) esphome 2023.12.7 does not
6) esphome 2024.1.0.dev0 worked
Now comes the kicker. Afterwards esphome 2023.12.7 worked, and not only that, everything I tried worked as well.
A possible explanation: I could install the 'normal' esphome releases via my package manager (brew), but had to install dev via a local github esphome clone on my computer. That install also updated packages
Successfully installed chardet-5.2.0 esphome-2024.1.0.dev0 esptool-4.7.0 icmplib-3.0.4 platformio-6.1.13 pyelftools-0.30 zeroconf-0.131.0
So, one of the other packages (supposedly) had a fix and subsequent esphome downgrades used these packages. It explains everything I observed, including the weird observation of my development board always working, since it was attached to a second computer with a slightly different esphome version. Here, the rabbit hole stops for me as well, I will suppress the urge to look at the last commit to the other packages.
@edwardtfn Suggest closed. Issue resolved - upstream problem not related to us.
I could never soft-brick my ESP32 dev board even with the largest binaries (<0x1c0000) from the table. Therefore, I tried to write the compiled yaml to the dev board utilizing the original (i.e. Sonoff) partition table, but so far to no avail :(
The OTA with the Arduino framework and with esp-idf are not the same. Neither of them can update the bootloader (technically they could, but neither does). Yet the partition table is stored in, and required by, the bootloader (and it's also required by your application's OTA code). As I understand it, when you run the Arduino OTA, it updates some partition's data to tell the device where to start on the next reboot. Yet this data isn't at the same place when the application starts and uses another partition table. It writes the otadata somewhere in the middle of the phy partition, and that prevents the app from working when it tries to start BT or Wi-Fi using the corrupted phy partition data: your panel is bricked.
So when you flash something via the serial link, you are using esptool, which erases the flash and updates the bootloader, so it's more or less working.
Also, I don't know if it's still the case, but there is/was an issue with Espressif's gcc linker that rounded some sections up by 1 byte, leading to unaligned bss/data sections. It was completely random; usually, cleaning and rebuilding worked correctly. It gave the same symptoms you're observing, that is, heap at a different address and an unexplained panic before even entering the main function.
Temporarily you can use this to flash (OTA is fine) without TFT transfer but with BT. When you have to transfer TFT, you revert it, transfer then use this again:
substitutions:
  ###### CHANGE ME START ######
  device_name: "nspanelworkroom"
  wifi_ssid: !secret wifi_ssid
  wifi_password: !secret wifi_password
  nextion_update_url: "http://homeassistant.local:8123/local/nspanel_eu.tft"
  nextion_blank_url: "http://homeassistant.local:8123/local/nspanel_blank.tft"
  ##### addon-configuration #####
  ## addon_climate ##
  # addon_climate_heater_relay: "1" # possible values: 1/2
  ##### CHANGE ME END #####
packages:
  remote_package:
    url: https://github.com/Blackymas/NSPanel_HA_Blueprint
    ref: main
    files:
      # - nspanel_esphome.yaml # Base package
      - advanced/esphome/nspanel_esphome_core.yaml # Core without TFT upload engine
      # - advanced/esphome/nspanel_esphome_advanced.yaml # activate advanced (legacy) elements - can be useful for troubleshooting
      # - nspanel_esphome_addon_climate_cool.yaml # activate for local climate (cooling) control
      # - nspanel_esphome_addon_climate_heat.yaml # activate for local climate (heater) control
    refresh: 1s
esp32:
  framework:
    type: esp-idf
##### My customization - Start #####
bluetooth_proxy:
  active: true
wifi:
  power_save_mode: LIGHT
##### My customization - End #####
Replacing package nspanel_esphome.yaml with advanced/esphome/nspanel_esphome_core.yaml gives a few errors for me, mainly because of the character '_':
Failed config
sensor.nextion: [source <unicode string>:1450]
id: display_mode
name: Display mode
platform: nextion
Must only consist of upper/lowercase characters, numbers and the period '.'. The character '_' cannot be used.
variable_name: display_mode
precision: 0
accuracy_decimals: 0
internal: False
icon: mdi:phone-rotate-portrait
entity_category: diagnostic
text_sensor.nextion: [source <unicode string>:1782]
id: version_tft
name: Version TFT
platform: nextion
Must only consist of upper/lowercase characters, numbers and the period '.'. The character '_' cannot be used.
component_name: tft_version
entity_category: diagnostic
icon: mdi:tag-text-outline
internal: False
update_interval: never
on_value:
- lambda: |-
static const char *const TAG = "text_sensor.version_tft";
ESP_LOGD(TAG, "TFT version: %s", x.c_str());
if (current_page->state == "boot") {
disp1->send_command_printf("tm_esphome.en=0");
page_boot->execute();
timer_reset_all->execute("boot");
}
check_versions->execute();
So, this issue persists with ESPHome 2024.2.0b1, and as this could be an issue in the future anyway when using customizations, I've improved the documentation. I have no plans to reduce functionality to accommodate customizations, but I believe giving more details in the docs will help those using bluetooth_proxy.
Replacing package nspanel_esphome.yaml with advanced/esphome/nspanel_esphome_core.yaml gives a few errors for me, mainly because of the character '_':
Could you please report this as another bug?
I could duplicate this when using BT and add-on climate simultaneously, and I agree this is most likely related to memory usage, as that was already an issue with arduino even without BT when using too much memory.

| Base version | Framework | Add-ons | Customizations | RAM | Flash | Comments |
|---|---|---|---|---|---|---|
| v4.2.5dev | esp-idf | upload_tft removed | - | 9.5% | 52.9% | Working fine |
| v4.2.5dev | esp-idf | - | - | 10.2% | 61.8% | Working fine |
| v4.2.5dev | esp-idf | - | web_server | 10.2% | 63.6% | Working fine |
| v4.2.5dev | arduino | - | - | 14.1% | 70.0% | Working fine |
| v4.2.5 | arduino | - | web_server | 14.2% | 72.8% | Working fine |
| v4.2.5dev | esp-idf | upload_tft removed | bluetooth_proxy | 16.9% | 79.0% | Working fine |
| v4.2.5dev | esp-idf | climate_dual, upload_tft removed | bluetooth_proxy | 16.9% | 80.7% | Working fine |
| v4.2.5dev | esp-idf | climate_dual, upload_tft removed | bluetooth_proxy, web_server | 16.9% | 83.6% | Working fine |
| v4.2.5dev | esp-idf | - | bluetooth_proxy | 17.5% | 87.6% | Bricked |
| v4.2.2 | esp-idf | - | bluetooth_proxy | 17.6% | 87.1% | Bricked |
| v4.2.5dev | esp-idf | climate_dual | bluetooth_proxy | 17.6% | 89.3% | Bricked |
| v4.2.5dev | esp-idf | climate_dual | bluetooth_proxy, web_server | 17.6% | 91.4% | Bricked |
| v4.2.5 | arduino | - | bluetooth_proxy | 17.9% | 110.0% | Cannot build - Flash memory exceeded |
| v4.2.5 | arduino | - | bluetooth_proxy, web_server | 17.9% | 110.9% | Cannot build - Flash memory exceeded |

I have to flash via serial all the devices that got bricked in the tests above, then I will run more tests, but I believe the option where upload_tft was removed could be a workaround. The downside is that you will have to remove bluetooth_proxy and bring back upload_tft every time you need to transfer a TFT, then revert it afterwards, but as you shouldn't be transferring TFT files every day, that could be a way to go.
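As a rough reading aid for the table above, the sketch below finds the gap between the largest build that still worked and the smallest build that bricked a panel. The numbers are transcribed from the table; the `boundary` helper is a hypothetical name, and the result is only an observation over these 14 builds, not a proven threshold.

```python
# (framework, RAM %, flash %, outcome) transcribed from the table above.
BUILDS = [
    ("esp-idf", 9.5, 52.9, "ok"),
    ("esp-idf", 10.2, 61.8, "ok"),
    ("esp-idf", 10.2, 63.6, "ok"),
    ("arduino", 14.1, 70.0, "ok"),
    ("arduino", 14.2, 72.8, "ok"),
    ("esp-idf", 16.9, 79.0, "ok"),
    ("esp-idf", 16.9, 80.7, "ok"),
    ("esp-idf", 16.9, 83.6, "ok"),
    ("esp-idf", 17.5, 87.6, "bricked"),
    ("esp-idf", 17.6, 87.1, "bricked"),
    ("esp-idf", 17.6, 89.3, "bricked"),
    ("esp-idf", 17.6, 91.4, "bricked"),
    ("arduino", 17.9, 110.0, "no build"),
    ("arduino", 17.9, 110.9, "no build"),
]

RAM, FLASH = 1, 2  # column indexes into each tuple

def boundary(builds, column):
    """Return (largest working value, smallest bricked value) for a column."""
    ok = [b[column] for b in builds if b[3] == "ok"]
    bricked = [b[column] for b in builds if b[3] == "bricked"]
    return max(ok), min(bricked)

flash_gap = boundary(BUILDS, FLASH)
ram_gap = boundary(BUILDS, RAM)
print("flash: works up to %.1f%%, bricks from %.1f%%" % flash_gap)
print("RAM:   works up to %.1f%%, bricks from %.1f%%" % ram_gap)
```

In this data set every bricked build has bluetooth_proxy enabled while upload_tft is still present, so the gap may reflect that combination rather than a hard size limit.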
Is it possible to use the latest version combining climate_heat and bluetooth_proxy? I'm forced to flash with a cable every time I try this combination as it "bricks", even though I drop upload_tft... maybe I need to get rid of web_server as well if it is installed by default? Any other suggestions?
This is becoming a cat-and-mouse game. We are trying to remove things to make space for new features, but in the end ESPHome itself is also growing in RAM consumption, which makes it quite hard to develop at the limit of the available memory.
v4.2.6 with the basic package no longer includes web_server and captive_portal, so there's nothing that comes right to my mind that could be removed without bigger consequences...
I will take a look for some opportunities to save in the code. Maybe remove some global variables and logging could help a bit, but I'm sure you will again reach the limit pretty soon.
Hello, I tried to activate the Bluetooth proxy and was left with a bricked panel (black screen, no response). ESPHome yaml:
substitutions:
###### CHANGE ME START ######
device_name: "nspanel"
wifi_ssid: !secret wifi_ssid
wifi_password: !secret wifi_password
nextion_update_url: "http://192.168.0.100:8123/local/nspanel_us.tft" # URL to local tft File
# nextion_update_url: "https://raw.githubusercontent.com/Blackymas/NSPanel_HA_Blueprint/main/nspanel_us.tft" # URL to Github
# Enable Bluetooth proxy
bluetooth_proxy:
# Set Wi-Fi power save mode to "LIGHT" as required for Bluetooth on ESP32
wifi:
power_save_mode: LIGHT
##### CHANGE ME END #####
##### DO NOT CHANGE ANYTHING! #####
packages:
##### download esphome code from Github
remote_package:
url: https://github.com/Blackymas/NSPanel_HA_Blueprint
ref: main
files: [nspanel_esphome.yaml]
refresh: 300s
##### DO NOT CHANGE ANYTHING! #####
esp32:
framework:
type: esp-idf
The version of everything is the latest. Through the serial link https://web.esphome.io/ flashing does not work. Through esphome-flasher:
Using 'COM4' as serial port.
Connecting....
Detecting chip type... Unsupported detection protocol, switching and trying again...
Connecting...
Detecting chip type... ESP32
Connecting...
Chip Info:
- Chip Family: ESP32
- Chip Model: ESP32-D0WD-V3 (revision 3)
- Number of Cores: 2
- Max CPU Frequency: 240MHz
- Has Bluetooth: YES
- Has Embedded Flash: NO
- Has Factory-Calibrated ADC: YES
- MAC Address: C0:49:EF:D1:E7:44
Uploading stub...
Running stub...
Stub running...
Changing baud rate to 460800
Changed.
- Flash Size: 4MB
Unexpected error: The firmware binary is invalid (magic byte=FF, should be E9)
I tried to flash the panel with a file without Bluetooth proxy and also with a file that I had previously flashed successfully (when I switched from Arduino to IDF)
Any idea how to fix the panel?
The last sentence
Unexpected error: The firmware binary is invalid (magic byte=FF, should be E9)
points towards a wrong/broken firmware file. Any chance you used the wrong format (e.g. legacy vs new format) or the wrong file (e.g. firmware.elf vs firmware.bin vs firmware-factory.bin)?
That might also be why
Through the serial link https://web.esphome.io/ flashing does not work.
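The magic-byte check from the error message can be reproduced offline before flashing. This is a minimal sketch: the expected value 0xE9 comes from the error above, `check_image_magic` is a hypothetical helper, and reading a leading 0xFF as erased flash or padding is an assumption about why the byte is wrong.

```python
ESP_IMAGE_MAGIC = 0xE9  # first byte expected by the flasher, per the error above
ERASED_BYTE = 0xFF      # value of erased flash and of typical padding

def check_image_magic(data: bytes) -> str:
    """Classify a firmware blob by its first byte only."""
    if not data:
        return "empty"
    if data[0] == ESP_IMAGE_MAGIC:
        return "ok"
    if data[0] == ERASED_BYTE:
        # ASSUMPTION: the file starts with padding or is not an app image.
        return "starts-with-0xff"
    return "invalid"

# Usage with in-memory examples; for a real file, pass open(path, "rb").read(1).
print(check_image_magic(bytes([0xE9, 0x06, 0x02])))
print(check_image_magic(bytes([0xFF, 0xFF, 0xFF])))
```

A result other than "ok" only says the tool will reject the file at that offset; it does not tell which file or flash address would be the right one.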
Everything is fine with the bin file. ...But I found the solution!
I used the tool https://espressif.github.io/esptool-js/ I connected to the panel with a speed of 460800 and uploaded the bin file from address 0x0000. I don't know why the OTA update damaged the bootloader earlier. I will no longer try to enable bluetooth proxy.
TFT Version
4.2.4
ESPHome Version
4.2.4
Blueprint Version
4.2.4
Panel Model
NSPanel EU Model
What is the bug?
Panel bricked if BT proxy enabled in config
Steps to Reproduce
If the device is flashed with BT Proxy enabled, it will be bricked. It does not even boot up, does not register on Wi-Fi, and none of the buttons work. It has to be re-flashed by wire without the BT Proxy to get it working again.
Your panel's YAML
ESPHome logs
Home Assistant logs