Status: Closed (closed by proddy 1 year ago)
emsuart_esp32.h: insert #include "soc/uart_struct.h"
NTPSettingsService.h: insert #include <esp_sntp.h>
OneWire_direct_gpio.h: change all rtc_gpio_desc[pin]
to rtc_io_desc[pin]
WebStatusService.cpp: change info.disconnected.reason
to info.prov_fail_reason
WebAuthentication.cpp: change md5 calls to
mbedtls_md5_init(&_ctx);
mbedtls_md5_update_ret(&_ctx, data, len);
mbedtls_md5_finish_ret(&_ctx, data);
mbedtls_internal_md5_process(&_ctx, data);
// mbedtls_md5_starts(&_ctx);
// mbedtls_md5_update(&_ctx, data, len);
// mbedtls_md5_finish(&_ctx, _buf);
(see here) Compiles, but does not connect to wifi ;-( Seems there are some changes in wifi handling.
Edit: With a fixed address wifi connects. Removing the WiFi.config(INADDR_NONE..) call to use DHCP also results in a connection, but no dhcp address is assigned and the esp is not reachable.
The uart seems to ignore the register settings and does not detect the breaks. I don't know what triggers the interrupt, but incoming telegrams have arbitrary length, starting somewhere in the middle of normal telegrams.
we should lock the arduino core version in the platformio.ini to prevent the builds from failing, and then create a branch with these changes which we can work on for the next major release. Still need to get the damn 3.4 out first!
I've made a branch with the first changes (LittleFS, Dallas, etc.). It compiles, but some things are not working, as mentioned. Wifi dhcp gets the right address, but emsesp is not reachable. With a fixed address it works. ETH also works with dhcp. The uart is very strange: it receives data, but the irq does not seem to be called on break. I could not find the changes in arduino or idf that could cause it. The idf seems mainly unchanged; the idf-driver reads the fifo only on timeout/buffer-full, and a break generates a message but does not read the fifo into the buffer. I also have a changed LittleFS library with compatible names (LittleFS instead of LITTLEFS). Changing back to framework 3.5.0 only needs changing the ARDUINO_EVENTS back.
thanks for making the first start. I'll scout the web forums to see if anyone else is experiencing similar issues with the wifi/dhcp and also look at some of the examples. It may just be the sequence in which it's initiated. As for the uart, that is going to take some more work. I'm wondering if we can now use C++20 instead of 17, which would offer some further code optimizations.
I've changed the uart to the idf-driver, and for me it's working now. But my boiler accepts nearly any timing and all tx-modes, and I have not checked the timing with a logic analyser. Please check on your boiler. The logic to make it work is ugly: I have to set fifo-full to one byte so that every incoming byte is read by the irq, which copies it to the transfer buffer and generates the event; this is read out by the event-task and copied to the telegram buffer. That is a lot of calls/copies to receive a single telegram. The driver-rx-buffer with 256 bytes seems too large for single-byte receive, but the driver crashes with a smaller buffer size. The ems-tx-mode now checks for a new queue-entry, generated from the interrupt after receiving a byte.
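A minimal host-side sketch of the receive path described above, assuming each incoming byte arrives as its own event and a break marks the end of a telegram. The names RxEvent, RxEventType and TelegramAssembler are hypothetical and are not taken from the EMS-ESP source or the idf-driver; the length guard is an added assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical event kinds, loosely modelled on what a uart driver might report.
enum class RxEventType { Data, Break };

struct RxEvent {
    RxEventType type;
    uint8_t     byte; // only meaningful for Data events
};

// Collects single-byte events into a telegram; a Break closes the telegram.
class TelegramAssembler {
  public:
    explicit TelegramAssembler(std::size_t max_len)
        : max_len_(max_len) {
    }

    // Feed one event. Returns true when a complete telegram is ready in telegram().
    bool feed(const RxEvent & ev) {
        if (ev.type == RxEventType::Data) {
            if (overflow_) {
                return false; // keep discarding until the next break
            }
            if (buf_.size() >= max_len_) {
                overflow_ = true; // too long: drop the partial telegram
                buf_.clear();
                return false;
            }
            buf_.push_back(ev.byte);
            return false;
        }
        // Break: hand over the telegram unless it overflowed or is empty
        const bool complete = !overflow_ && !buf_.empty();
        if (complete) {
            telegram_ = buf_;
        }
        buf_.clear();
        overflow_ = false;
        return complete;
    }

    const std::vector<uint8_t> & telegram() const {
        return telegram_;
    }

  private:
    std::size_t          max_len_;
    bool                 overflow_ = false;
    std::vector<uint8_t> buf_;
    std::vector<uint8_t> telegram_;
};
```

In this sketch the event-task would call feed() for every queue entry and process telegram() whenever feed() returns true. After an overflow everything up to the next break is dropped, so the following telegram starts clean.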
Funny side effect: wifi dhcp works also now without any change in wifi code. But the wifi issue affects also other people, see here.
I'll check this weekend, I've been out on business these last 2 weeks. I did notice a new core version which may resolve some of the wifi issues https://github.com/espressif/arduino-esp32/releases/tag/2.0.2
I've tested everything with the E32, because the ETH connection works with the new arduino core, but I have no ethernet near the boiler and need wifi to check the uart. The E32 now also has stable wifi, but on a MH-ET I can not connect Wifi AP and STA; it's fluctuating and disconnects after a few seconds. Same software as on the E32! I'm not sure if Arduino 2.0.2 is in platformio, but platform=develop has the same issue.
I've seen that here a different platform from tasmota is used. I have to change the OneWire as mentioned here. This OneWire works on all platforms. The tasmota platform gives much smaller filesize (~350kB less), but crashes on boot.
rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:184
load:0x40078000,len:12596
load:0x40080400,len:2916
entry 0x400805c4
┌──────────────────────────────────────┐
│ EMS-ESP version 3.4.0b15rc1 │
│ https://github.com/emsesp/EMS-ESP32 │
│ │
│ type help to show available commands │
└──────────────────────────────────────┘
ems-esp:$ Guru Meditation Error: Core 1 panic'ed (LoadProhibited). Exception was unhandled.
Core 1 register dump:
PC : 0x4012eafd PS : 0x00060533 A0 : 0x8010410b A1 : 0x3ffb26e0
A2 : 0xffffffff A3 : 0xffffff7e A4 : 0x00000000 A5 : 0x3ffbdc6c
A6 : 0x00000020 A7 : 0x00000000 A8 : 0x00000000 A9 : 0x3ffb26a0
A10 : 0x3ffbdc70 A11 : 0x3ffc5c20 A12 : 0x3ffc5c24 A13 : 0x0000abab
A14 : 0x00060523 A15 : 0x00060520 SAR : 0x00000013 EXCCAUSE: 0x0000001c
EXCVADDR: 0x000000b0 LBEG : 0x4008a084 LEND : 0x4008a09a LCOUNT : 0xffffffff
Backtrace:0x4012eafa:0x3ffb26e00x40104108:0x3ffb2700 0x400f6dcc:0x3ffb2720 0x40101f3a:0x3ffb2760 0x400f823a:0x3ffb27a0 0x400fa0eb:0x3ffb2800 0x4012c6c6:0x3ffb2820
ELF file SHA256: 0000000000000000
Rebooting...
But now with the current develop platform the MH-ET boots and connects with dhcp or a static address.
I just tried your latest branch with WiFi and haven't seen any issues yet. What would you like me to test/ try out? I have both ETH and WiFi here
Yes, wifi also works for me with the development platform. ETH was always working. Is the uart in EMS mode working for you? EMS+ and HT3 have fixed timings, but EMS reads back the master echo, and this is now a bit different. On my boiler all modes and timings are working and I prefer the hardware-mode.
looks ok, 43 minutes and only 6 failed Rx, using TxMode EMS with ETH. I would need to compare against the previous 3.4 but it looks solid enough.
Good, the few more rx-fails are by design, the old uart ignores first telegrams after start and telegrams not ending with break(zero). For this test i wanted to filter less to see what's coming in. We can add those filters again if we want to reduce rx-fail counts to bad-crc.
I'm getting restarts every 1-3hrs though. Needs some more debugging... I'll leave it running and try to catch the reason code
I don't see restarts (11h uptime), but the uart buffer is not checked for overflow, I'll update.
My MH-ET shows ~25k less free heap, a memory leak? But it seems to be stable. I'll check ETH, i think the heap is in same range as before.
I've updated the uart and merged your latest dev.
With this and the dev I checked the free heap on different esp32 boards: filesize with the new idf is lower; for the E32 (wifi connected) free heap increases a bit, but the MH-ET/S32 have less free heap. The difference between MH-ET and S32 is because OTA was disabled on the S32 (I noticed it later). (Heap values are from the web system page after all ems entities are detected.) I think this is due to changes in the framework and nothing to worry about.
Framework 3.5: Filesize: 1758 kB
MH-ET: Heap: 190148 / 113792 (standalone, ems/mqtt not connected)
MH-ET: Heap: 174620 / 108114 (ems/mqtt connected)
E32: Heap: 135012 / 70864 (ems connected, wifi)
S32: Heap: 195872 / 113792 (standalone, ems/mqtt not connected)

Framework 4.4: Filesize: 1723 kB
MH-ET: Heap: 172148 / 110580 (standalone, ems/mqtt not connected)
MH-ET: Heap: 155536 / 102388 (ems/mqtt connected)
E32: Heap: 139712 / 90100 (ems connected, wifi)
S32: Heap: 176556 / 110580 (standalone, ems/mqtt not connected)
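To make the comparison concrete, the change in free heap can be computed from the figures above. This is only a small sketch: heap_delta is a hypothetical helper, and it looks only at the first number of each Heap pair.

```cpp
#include <cstdint>

// Signed change in free heap (bytes) between two framework builds.
// Positive means more free heap on the newer framework.
int32_t heap_delta(uint32_t old_free, uint32_t new_free) {
    return static_cast<int32_t>(new_free) - static_cast<int32_t>(old_free);
}
```

For the MH-ET with ems/mqtt connected, heap_delta(174620, 155536) gives -19084 bytes, while for the E32, heap_delta(135012, 139712) gives +4700 bytes.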
I'm running your dev build now and will report back in a few hours.
Just had the first restart after 7hrs
Sad, any useful reset reason information? I have uptime 9h for the MH-ET and 10h for the E32. I'll switch the E32 to tx-mode 1 now and leave the other on tx-mode 4.
no error, just "Last system reset reason Core0: Software reset CPU, Core1: Software reset CPU". Free Mem is constant around 170K and not falling. This is with TxMode 1
My E32 now has an uptime of 30h, 10h with tx-mode 4, 20h with tx-mode 1, no tx-errors (~19,000 reads), 19 rx-fails within ~194,000 receives (0.01%). I can not reproduce the reboots. (btw: SDK shown as v4.4-beta1-189-ga79dc75f0a, do you have the same?)
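For reference, the rx-fail percentage quoted above follows directly from the two counters. A small sketch, where fail_percent is a hypothetical helper and not the quality calculation used by EMS-ESP:

```cpp
#include <cstdint>

// Percentage of failed telegrams; returns 0 when nothing has been counted yet.
double fail_percent(uint32_t fails, uint32_t total) {
    return total == 0 ? 0.0 : 100.0 * fails / total;
}
```

With 19 fails in roughly 194000 receives this comes out just under 0.01%.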
SDK is the same. I'm running it again, if it crashes I'll start turning off the services (NTP, MQTT, AP)
I think it's the uart buffer. Yesterday I added a buffer-check, but forgot to read out the rx-buffer, so after an overflow the uart only throws garbage. This happened after ~3-5 h. Try again with the current code.
testing now...
looking good, no glitches in 6hrs...
it's been running now for 24hrs without any crashes. Rx 82825/26 fail and Tx Read is 19138/16. Which is good enough. I should compare against the 3.4b to see if those Tx Read failures are normal
I've updated the idf4-branch to latest dev and changed uart code for rx and tx-mode 1. I have a bit less rx-fails and no tx-fails. Please check.
impressive. been running for 1hr+ with 0 fails
after 20hrs only 14 failed Tx and 12 failed Tx Reads. At 100% quality for both. The Rx fails are about half of what they were in the previous idf4 dev release. So all good.
Have you logged the tx fails?
I'll do some tracing over the weekend to see why the Tx errors are high. On v3.4 I'm getting 0 at the moment:
react18 was upgraded in 3.4b18
let's get 3.4.1 out with the latest fixes and make 3.4.2 based on espressif arduino v2
ok.
With the latest 3.4.2b I'm still seeing UART errors on both Rx/Tx. With the previous espressif 3.5 which I had running for 12 days I had zero fails. I'll do some tracing and debugging.
Have you checked what kind of tx errors these are? Are they random, or is there a time or telegram pattern? I can not reproduce them; my ems-master is not timing critical and any tx-mode works (I mostly use tx-mode 4). I'm curious about feedback from ems+ and ht3 users. BTW: I've added a syslog count/fail, is this useful, should I add this to dev?
it's hard to find the Tx errors without adding some extra debug code. They happen randomly every few hours and are difficult to reproduce and capture without flooding the logs with raw telegrams.
syslog is good to add, although it'll show a lot of messages depending on the Level
You should see the tx telegram (to_string) in error-log-level with log-time: https://github.com/emsesp/EMS-ESP32/blob/794b3c04712ccdb0e278e4118670557fb5edda42/src/telegram.cpp#L596-L599
I'm more concerned about the Rx fails, it's one every 50mins. That's on txmode=1. I'll try 4 (hardware) now
No Tx errors with TxMode 4 (Hardware) after 1d16h on latest dev build. Rx has 22 fails from 134,883 which isn't bad. Still not as solid as 3.4.1 but close.
all done. works fine
The core ESP32 arduino framework has been upgraded to v2.0.0 and PlatformIO will upgrade automatically to this unless we force it to stay on 3.5. Which is fine, but breaks a few things. At first glance, we need to modify the UART code and replace our LittleFS library with the core's version. While we're at it we could look at upgrading to the brand new NodeJS 18.0 and also migrating to ReactJS 18.