arendst / Tasmota

Alternative firmware for ESP8266 and ESP32 based devices with easy configuration using webUI, OTA updates, automation using timers or rules, expandability and entirely local control over MQTT, HTTP, Serial or KNX. Full documentation at
https://tasmota.github.io/docs
GNU General Public License v3.0
22.04k stars 4.78k forks source link

Influx DB always tries to use an IPv6 address if the influx server is set as a hostname using v12.4.0 on ESP32 #18015

Closed therobveiller closed 1 year ago

therobveiller commented 1 year ago

PROBLEM DESCRIPTION

A clear and concise description of what the problem is. Influx DB always yields an IPv6 address (after about 30seconds) if the influx server is set as a hostname using v12.4.0 on ESP32, even though setoption149 is set to 0 (default). If -DUSE_IPV6 is commented out in platformio_tasmota32.ini, then it yields an IPv4 address, but takes 15seconds to do so (can be seen in delay before next log entry generated, and the device hangs and is unresponsive during this delay). In v12.3.1 the name-to-ip to yeild an IPv4 is almost instantaneous.

REQUESTED INFORMATION

Make sure your have performed every step and checked the applicable boxes before submitting your issue. Thank you!

- [#] If using rules, provide the output of this command: `Backlog Rule1; Rule2; Rule3`:
```lua
  Rules output here:
No rules set
- [#] Set `weblog` to 4 and then, when you experience your issue, provide the output of the Console log:
```lua
  Console output here:
**_Build 12.4.0 with IPv6 enabled_**
[12:50:58]00:00:00.002 HDW: ESP32-D0WDQ6 
[12:50:58]00:00:00.053 UFS: FlashFS mounted with 304 kB free
[12:50:58]00:00:00.083 CFG: Loaded from File, Count 10
[12:50:58]00:00:00.086 CFG: CR 523/699, Busy 0
[12:50:58]00:00:00.098 ROT: Mode 1
[12:50:58]00:00:00.108 BRY: GC from 3819 to 3049 bytes, objects freed 5/20 (in 1 ms) - slots from 35/61 to 23/61
[12:50:59]00:00:00.216 CFG: No '*.autoconf' file found
[12:50:59]00:00:00.221 BRY: GC from 4715 to 4041 bytes, objects freed 5/42 (in 1 ms) - slots from 59/76 to 41/76
[12:50:59]00:00:00.221 BRY: Berry initialized, RAM used=4041 bytes
[12:50:59]00:00:00.244 BRY: No 'preinit.be'
[12:50:59]00:00:00.250 TFS: File '.drvset003' not found
[12:50:59]00:00:00.250 CFG: Energy use defaults as file system not ready or file not found
[12:50:59]00:00:00.254 Project tasmota - Tasmota Version 12.4.0(I2C)-2_0_6(2023-02-19T12:40:44)
[12:50:59]00:00:00.266 ETH: No ETH MDC and/or ETH MDIO GPIO defined
[12:50:59]00:00:00.278 SHT: Sensor did not ACK command
[12:50:59]00:00:00.390 BRY: No 'autoexec.be'
[12:50:59]00:00:00.563 WIF: Checking connection...
[12:50:59]00:00:00.564 WIF: Attempting connection...
[12:50:59]00:00:00.677 WIF: Network (re)scan started...
[12:51:00]00:00:01.473 WIF: Checking connection...
[12:51:00]00:00:01.474 WIF: Attempting connection...
[12:51:01]00:00:02.479 WIF: Checking connection...
[12:51:01]00:00:02.480 WIF: Attempting connection...
[12:51:01]00:00:02.481 WIF: Network 0, AP1, SSId sensory, Channel 6, BSSId 28:EE:52:9B:E9:58, RSSI -55, Encryption 1
[12:51:01]00:00:02.492 WIF: Network 1, AP-, SSId yarp, Channel 6, BSSId 2E:EE:52:9B:E9:58, RSSI -55, Encryption 1
[12:51:01]00:00:02.858 WIF: Connecting to AP1 sensory Channel 6 BSSId 28:EE:52:9B:E9:58 in mode 11n as tasmota-1726E0-1760...
[12:51:02]00:00:03.302 WIF: IPv4 172.28.31.3, mask 255.255.0.0, gateway 172.28.1.254
[12:51:04]00:00:05.376 WIF: Checking connection...
[12:51:04]00:00:05.377 WIF: Connected
[12:51:04]00:00:05.589 HTP: Web server active on tasmota-1726E0-1760 with IP address 172.28.31.3
[12:51:04]00:00:05.982 NTP: Sync time...
[12:51:04]00:00:06.002 WIF: DNS resolved 'ntp.home' (172.28.3.1) in 18 ms
[12:51:04]00:00:06.018 RTC: UTC 2023-02-19T12:51:08, DST 2023-03-26T01:00:00, STD 2023-10-29T02:00:00
[12:51:04]12:51:08.000 RTC: Synced by NTP
[12:51:04]12:51:08.004 IFX: Url http://influx.home:8086/api/v2/write?org=home&bucket=iot-data
[12:51:04]12:51:08.016 IFX: Validating connection to http://influx.home:8086/health
[12:51:05]12:51:08.461 WIF: IPv6 Local fe80::3e61:5ff:fe17:26e0
[12:51:34]12:51:38.060 IFX: Error connection refused
[12:51:34]12:51:38.069 HTP: Main Menu
[12:51:35]12:51:38.699 MQT: Attempting connection...
[12:51:35]12:51:38.712 WIF: DNS resolved 'mqtt.home' (172.28.1.1) in 12 ms
[12:51:36]12:51:39.161 MQT: TLS connected in 447 ms, stack low mark 1868
[12:51:36]12:51:39.162 MQT: Server fingerprint: 1D D7 39 A7 85 3A 31 92 3D 20 D9 F8 3F 5D CB F1 20 68 64 DC
[12:51:36]12:51:39.173 MQT: Connected

**_Build 12.4.0 with IPv6 disabled (-DUSE_IPV6 is commented out in platformio_tasmota32.ini)_**
[12:58:19]00:00:00.002 HDW: ESP32-D0WDQ6 
[12:58:19]00:00:00.026 UFS: FlashFS mounted with 304 kB free
[12:58:19]00:00:00.043 CFG: Loaded from File, Count 12
[12:58:19]00:00:00.046 CFG: CR 523/699, Busy 0
[12:58:19]00:00:00.060 ROT: Mode 1
[12:58:19]00:00:00.069 BRY: GC from 3819 to 3049 bytes, objects freed 5/20 (in 1 ms) - slots from 35/61 to 23/61
[12:58:19]00:00:00.128 CFG: No '*.autoconf' file found
[12:58:19]00:00:00.131 BRY: GC from 4715 to 4041 bytes, objects freed 5/42 (in 1 ms) - slots from 59/76 to 41/76
[12:58:19]00:00:00.132 BRY: Berry initialized, RAM used=4041 bytes
[12:58:19]00:00:00.152 BRY: No 'preinit.be'
[12:58:19]00:00:00.155 TFS: File '.drvset003' not found
[12:58:19]00:00:00.156 CFG: Energy use defaults as file system not ready or file not found
[12:58:19]00:00:00.160 Project tasmota - Tasmota Version 12.4.0(I2C)-2_0_6(2023-02-18T18:37:02)
[12:58:19]00:00:00.171 ETH: No ETH MDC and/or ETH MDIO GPIO defined
[12:58:19]00:00:00.184 SHT: Sensor did not ACK command
[12:58:19]00:00:00.253 BRY: No 'autoexec.be'
[12:58:20]00:00:00.514 WIF: Checking connection...
[12:58:20]00:00:00.515 WIF: Attempting connection...
[12:58:20]00:00:00.628 WIF: Network (re)scan started...
[12:58:21]00:00:01.461 WIF: Checking connection...
[12:58:21]00:00:01.461 WIF: Attempting connection...
[12:58:22]00:00:02.469 WIF: Checking connection...
[12:58:22]00:00:02.469 WIF: Attempting connection...
[12:58:22]00:00:02.470 WIF: Network 0, AP1, SSId sensory, Channel 6, BSSId 28:EE:52:9B:E9:58, RSSI -54, Encryption 1
[12:58:22]00:00:02.481 WIF: Network 1, AP-, SSId yarp, Channel 6, BSSId 2E:EE:52:9B:E9:58, RSSI -54, Encryption 1
[12:58:22]00:00:02.492 WIF: Network 2, AP-, SSId yarp, Channel 11, BSSId 2E:EE:52:E9:4E:1C, RSSI -90, Encryption 1
[12:58:22]00:00:02.503 WIF: Network 3, AP1, SSId sensory, Channel 11, BSSId 28:EE:52:E9:4E:1C, RSSI -93, Encryption 1
[12:58:22]00:00:02.859 WIF: Connecting to AP1 sensory Channel 6 BSSId 28:EE:52:9B:E9:58 in mode 11n as tasmota-1726E0-1760...
[12:58:23]00:00:04.072 WIF: IPv4 172.28.31.3, mask 255.255.0.0, gateway 172.28.1.254
[12:58:25]00:00:05.384 WIF: Checking connection...
[12:58:25]00:00:05.385 WIF: Connected
[12:58:25]00:00:05.596 HTP: Web server active on tasmota-1726E0-1760 with IP address 172.28.31.3
[12:58:25]00:00:05.987 NTP: Sync time...
[12:58:25]00:00:06.012 WIF: DNS resolved 'ntp.home' (172.28.3.1) in 24 ms
[12:58:25]00:00:06.028 RTC: UTC 2023-02-19T12:58:29, DST 2023-03-26T01:00:00, STD 2023-10-29T02:00:00
[12:58:25]12:58:29.000 RTC: Synced by NTP
[12:58:25]12:58:29.004 IFX: Url http://influx.home:8086/api/v2/write?org=home&bucket=iot-data
[12:58:25]12:58:29.016 IFX: Validating connection to http://influx.home:8086/health
[12:58:41]12:58:45.209 HTP: Main Menu
[12:58:42]12:58:45.629 MQT: Attempting connection...
[12:58:42]12:58:45.643 WIF: DNS resolved 'mqtt.home' (172.28.1.1) in 12 ms
[12:58:42]12:58:46.083 MQT: TLS connected in 439 ms, stack low mark 1868
[12:58:42]12:58:46.085 MQT: Server fingerprint: 1D D7 39 A7 85 3A 31 92 3D 20 D9 F8 3F 5D CB F1 20 68 64 DC
[12:58:42]12:58:46.096 MQT: Connected

**_Output from release v12.3.1 (same device, same build environment etc)_**
[13:02:39]00:00:00.002 HDW: ESP32-D0WDQ6 
[13:02:39]00:00:00.060 UFS: FlashFS mounted with 304 kB free
[13:02:39]00:00:00.118 CFG: Loaded from File, Count 12
[13:02:39]00:00:00.121 CFG: CR 523/699, Busy 0
[13:02:39]00:00:00.130 ROT: Mode 1
[13:02:39]00:00:00.139 BRY: GC from 3819 to 3049 bytes, objects freed 5/20 (in 1 ms) - slots from 35/61 to 23/61
[13:02:40]00:00:00.266 CFG: No '*.autoconf' file found
[13:02:40]00:00:00.271 BRY: GC from 4730 to 4041 bytes, objects freed 5/42 (in 0 ms) - slots from 60/76 to 41/76
[13:02:40]00:00:00.272 BRY: Berry initialized, RAM used=4041 bytes
[13:02:40]00:00:00.297 BRY: No 'preinit.be'
[13:02:40]00:00:00.301 Project tasmota - Tasmota Version 12.3.1(I2C)-2_0_5_3(2023-01-23T09:16:44)
[13:02:40]00:00:00.303 ETH: No ETH MDC and/or ETH MDIO GPIO defined
[13:02:40]00:00:00.325 SHT: Sensor did not ACK command
[13:02:40]00:00:00.452 BRY: No 'autoexec.be'
[13:02:40]00:00:00.532 WIF: Checking connection...
[13:02:40]00:00:00.532 WIF: Attempting connection...
[13:02:40]00:00:00.645 WIF: Network (re)scan started...
[13:02:41]00:00:01.402 WIF: Checking connection...
[13:02:41]00:00:01.403 WIF: Attempting connection...
[13:02:42]00:00:02.409 WIF: Checking connection...
[13:02:42]00:00:02.410 WIF: Attempting connection...
[13:02:42]00:00:02.410 WIF: Network 0, AP1, SSId sensory, Channel 6, BSSId 28:EE:52:9B:E9:58, RSSI -54, Encryption 1
[13:02:42]00:00:02.421 WIF: Network 1, AP-, SSId yarp, Channel 6, BSSId 2E:EE:52:9B:E9:58, RSSI -54, Encryption 1
[13:02:42]00:00:02.432 WIF: Network 2, AP1, SSId sensory, Channel 11, BSSId 28:EE:52:E9:4E:1C, RSSI -91, Encryption 1
[13:02:42]00:00:02.443 WIF: Network 3, AP-, SSId yarp, Channel 11, BSSId 2E:EE:52:E9:4E:1C, RSSI -92, Encryption 1
[13:02:42]00:00:02.798 WIF: Connecting to AP1 sensory Channel 6 BSSId 28:EE:52:9B:E9:58 in mode 11n as tasmota-1726E0-1760...
[13:02:43]00:00:03.317 WIF: IPv4 172.28.31.3, mask 255.255.0.0, gateway 172.28.1.254
[13:02:45]00:00:05.321 WIF: Checking connection...
[13:02:45]00:00:05.322 WIF: Connected
[13:02:45]00:00:05.532 HTP: Web server active on tasmota-1726E0-1760 with IP address 172.28.31.3
[13:02:45]00:00:05.926 NTP: Sync time...
[13:02:45]00:00:05.953 WIF: Resolving 'ntp.home' (172.28.3.1)
[13:02:45]00:00:05.978 RTC: UTC 2023-02-19T13:02:49, DST 2023-03-26T01:00:00, STD 2023-10-29T02:00:00
[13:02:45]13:02:49.000 RTC: Synced by NTP
[13:02:45]13:02:49.004 IFX: Url http://influx.home:8086/api/v2/write?org=home&bucket=iot-data
[13:02:45]13:02:49.016 IFX: Validating connection to http://influx.home:8086/health
[13:02:46]13:02:49.561 MQT: Attempting connection...
[13:02:46]13:02:49.589 WIF: Resolving 'mqtt.home' (172.28.1.1)
[13:02:46]13:02:50.030 MQT: TLS connected in 439 ms, stack low mark 2004
[13:02:46]13:02:50.032 MQT: Server fingerprint: 1D D7 39 A7 85 3A 31 92 3D 20 D9 F8 3F 5D CB F1 20 68 64 DC
[13:02:46]13:02:50.043 MQT: Connected

TO REPRODUCE

Steps to reproduce the behavior:

EXPECTED BEHAVIOUR

A clear and concise description of what you expected to happen.

SCREENSHOTS

If applicable, add screenshots to help explain your problem.

ADDITIONAL CONTEXT

Add any other context about the problem here. The IPv4 address is correct as resolved from my local DNS server for all entries (mqtt.home, ntp.home, and influx.home). The IPv6 address that 12.4.0 produces is "fictional" and does not come from my DNS server, and it is NOT the correct address for the influx server either. This also causes the device to reboot itself often. If I add the correct IPv6 address for influx.home to my local DNS, the v12.4.0 with IPv6 does connect to the server, but it still takes it 15seconds before moving on to the mqtt section.

(Please, remember to close the issue when the problem has been addressed)

barbudor commented 1 year ago

There have been quite some change around DNS resolution in Tasmota lately, related or not to IPv6

The InfluxDB driver is using Espressif's HttpClient which in turn use WifiClient which resolve the name using WiFi.hostByName( ) and which totally ignore Tasmota's SetOption149

NTP and MQTT are using Tasmota's WifiGetHostByName() which is the one showing the log 'WIF: DNS resolved 'ntp.home' (172.28.3.1) in 24 ms`

An easy fix would be to do the DNS resolution from the InfluxDB to the IP using Tasmota's WifiGetHostByName() at the driver init once for all. Which means that any further change on the DNS entry would be ignored until restart of Tasmota. A little more complex to do it at each InfluxDB request.

Thinking about it ....

barbudor commented 1 year ago

I've implemented the 2nd case, doing the resolv at each infludb publication, htis is what the original code was doing. I don't have the time to test tonight, already late

You can try without any warranty : tasmota32 compiled with USE_INFLUXDB : tasmo-infludb-dns1.zip

Or compile yourself from my branch : https://github.com/barbudor/Tasmota/tree/influxdb-with-new-dns-resolv

Anyhow I'll try to test it tomorrow evening

therobveiller commented 1 year ago

Many thanks barbudor, but that doesn't seem to completely solve the problem. I used your branch to build locally and therefore keep the same test conditions, but it still uses an IPv6 address for the influx hostname when testing "health" even though it now correctly gives an IPv4 for the server address. The device still reboots itself often even with ifx 0, so there is still something under the covers that isn't right here with v12.4.0.x. This rebooting may not be to do with influx however, but it is due to releases after 12.3.1 on ESP32. I will try and investigate that part further if I get time today. At first look it appear to "crash" shortly after an MQTT tele send (see data at the end of this post) I will open another issue for this if necessary?

[09:17:24]00:00:00.002 HDW: ESP32-D0WDQ6 [09:17:24]00:00:00.006 UFS: FlashFS mounted with 304 kB free [09:17:24]00:00:00.012 CFG: Loaded from File, Count 8 [09:17:24]00:00:00.015 CFG: CR 523/699, Busy 0 [09:17:24]00:00:00.029 ROT: Mode 1 [09:17:24]00:00:00.039 BRY: GC from 3819 to 3049 bytes, objects freed 5/20 (in 1 ms) - slots from 35/61 to 23/61 [09:17:24]00:00:00.065 CFG: No '*.autoconf' file found [09:17:24]00:00:00.069 BRY: GC from 4715 to 4041 bytes, objects freed 5/42 (in 1 ms) - slots from 59/76 to 41/76 [09:17:24]00:00:00.070 BRY: Berry initialized, RAM used=4041 bytes [09:17:24]00:00:00.086 BRY: No 'preinit.be' [09:17:24]00:00:00.089 TFS: File '.drvset003' not found [09:17:24]00:00:00.090 CFG: Energy use defaults as file system not ready or file not found [09:17:24]00:00:00.104 Project tasmota - Tasmota Version 12.4.0.1(I2C)-2_0_7(2023-02-22T08:57:21) [09:17:24]00:00:00.105 ETH: No ETH MDC and/or ETH MDIO GPIO defined [09:17:24]00:00:00.128 SHT: Sensor did not ACK command [09:17:24]00:00:00.168 BRY: No 'autoexec.be' [09:17:24]00:00:00.257 WIF: Checking connection... [09:17:24]00:00:00.258 WIF: Attempting connection... [09:17:24]00:00:00.371 WIF: Network (re)scan started... [09:17:25]00:00:01.241 WIF: Checking connection... [09:17:25]00:00:01.242 WIF: Attempting connection... [09:17:26]00:00:02.252 WIF: Checking connection... [09:17:26]00:00:02.253 WIF: Attempting connection... [09:17:26]00:00:02.253 WIF: Network 0, AP1, SSId sensory, Channel 6, BSSId 28:EE:52:9B:E9:58, RSSI -51, Encryption 1 [09:17:26]00:00:02.264 WIF: Network 1, AP-, SSId yarp, Channel 6, BSSId 2E:EE:52:9B:E9:58, RSSI -51, Encryption 1 [09:17:26]00:00:02.275 WIF: Network 2, AP-, SSId yarp, Channel 11, BSSId 2E:EE:52:E9:4E:1C, RSSI -89, Encryption 1 [09:17:26]00:00:02.285 WIF: Network 3, AP1, SSId sensory, Channel 11, BSSId 28:EE:52:E9:4E:1C, RSSI -91, Encryption 1 [09:17:27]00:00:02.651 WIF: Connecting to AP1 sensory Channel 6 BSSId 28:EE:52:9B:E9:58 in mode 11n as tasmota-1726E0-1760... [09:17:28]00:00:03.384 WIF: IPv4 172.28.31.3, mask 255.255.0.0, gateway 172.28.1.254 [09:17:29]00:00:05.330 WIF: Checking connection... [09:17:29]00:00:05.331 WIF: Connected [09:17:30]00:00:05.543 HTP: Web server active on tasmota-1726E0-1760 with IP address 172.28.31.3 [09:17:30]00:00:06.043 NTP: Sync time... [09:17:30]00:00:06.065 WIF: DNS resolved 'ntp.home' (172.28.3.1) in 20 ms [09:17:30]00:00:06.080 RTC: UTC 2023-02-22T09:17:38, DST 2023-03-26T01:00:00, STD 2023-10-29T02:00:00 [09:17:30]09:17:38.001 RTC: Synced by NTP [09:17:30]09:17:38.004 IFX: Url /api/v2/write?org=home&bucket=iot-data [09:17:30]09:17:38.023 WIF: DNS resolved 'influx.home' (172.28.1.1) in 8 ms [09:17:30]09:17:38.024 IFX: Validating connection to http://172.28.1.1:8086/health [09:17:31]09:17:39.244 WIF: IPv6 Local fe80::3e61:5ff:fe17:26e0 [09:17:46]09:17:54.278 HTP: Main Menu [09:17:47]09:17:54.511 MQT: Attempting connection... [09:17:47]09:17:54.534 WIF: DNS resolved 'mqtt.home' (172.28.1.1) in 22 ms [09:17:47]09:17:54.978 MQT: TLS connected in 442 ms, stack low mark 1868 [09:17:47]09:17:54.979 MQT: Server fingerprint: 1D D7 39 A7 85 3A 31 92 3D 20 D9 F8 3F 5D CB F1 20 68 64 DC [09:17:47]09:17:54.989 MQT: Connected

status 0 09:21:36.043 MQT: stat/tasmota_1726E0/STATUS = {"Status":{"Module":0,"DeviceName":"Tasmota","FriendlyName":["Tasmota"],"Topic":"tasmota_1726E0","ButtonTopic":"0","Power":0,"PowerOnState":0,"LedState":5,"LedMask":"FFFF","SaveData":1,"SaveState":0,"SwitchTopic":"0","SwitchMode":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"ButtonRetain":0,"SwitchRetain":0,"SensorRetain":0,"PowerRetain":0,"InfoRetain":0,"StateRetain":0,"StatusRetain":0}} 09:21:36.083 MQT: stat/tasmota_1726E0/STATUS1 = {"StatusPRM":{"Baudrate":115200,"SerialConfig":"8N1","GroupTopic":"tasmotas","OtaUrl":"http://nas1.home/IoT/tasmota32-I2C.bin","RestartReason":"Software reset CPU","Uptime":"0T00:00:32","StartupUTC":"2023-02-22T09:21:04","Sleep":100,"CfgHolder":4617,"BootCount":1,"BCResetTime":"2023-02-22T09:21:02","SaveCount":6}} 09:21:36.123 MQT: stat/tasmota_1726E0/STATUS2 = {"StatusFWR":{"Version":"12.4.0.1(I2C)","BuildDateTime":"2023-02-22T08:57:21","Core":"2_0_7","SDK":"4.4.4.20230218","CpuFrequency":160,"Hardware":"ESP32-D0WDQ6 rev.1","CR":"523/699"}} 09:21:36.142 MQT: stat/tasmota_1726E0/STATUS3 = {"StatusLOG":{"SerialLog":2,"WebLog":2,"MqttLog":0,"SysLog":0,"LogHost":"","LogPort":514,"SSId":["sensory",""],"TelePeriod":300,"Resolution":"558180C0","SetOption":["00208008","2805C80001000600003C5A0A192800000000","000082C8","00206000","10044000","00000000"]}} 09:21:36.180 MQT: stat/tasmota_1726E0/STATUS4 = {"StatusMEM":{"ProgramSize":1479,"Free":1856,"Heap":146,"StackLowMark":1,"PsrMax":0,"PsrFree":0,"ProgramFlashSize":4096,"FlashSize":4096,"FlashChipId":"1640D8","FlashFrequency":40,"FlashMode":"DIO","Features":["00000809","9F8A17CF","0011A001","BFF7BFCF","05DA9BC0","60760C47","480A60D2","20200000","D53C482D","000084B1"],"Drivers":"1,2,3,4,5,8,9,10,11,12,14,16,17,20,21,24,29,34,38,41,50,52,56,59,60,62,63,66,82,86,87,88","Sensors":"1,2,3,5,6,7,8,9,10,11,12,13,14,15,17,18,19,20,21,22,24,26,31,34,37,39,40,42,43,45,51,52,55,56,58,59,60,64,66,67,72,74,85,92,95,98,102,103,127"}} 09:21:36.245 MQT: stat/tasmota_1726E0/STATUS5 = {"StatusNET":{"Hostname":"tasmota-1726E0-1760","IPAddress":"172.28.31.3","Gateway":"172.28.1.254","Subnetmask":"255.255.0.0","DNSServer1":"172.28.1.254","DNSServer2":"212.23.6.100","Mac":"3C:61:05:17:26:E0","IP6Global":"","IP6Local":"fe80::3e61:5ff:fe17:26e0","Ethernet":{"Hostname":"","IPAddress":"0.0.0.0","Gateway":"0.0.0.0","Subnetmask":"0.0.0.0","DNSServer1":"172.28.1.254","DNSServer2":"212.23.6.100","Mac":"00:00:00:00:00:00","IP6Global":"","IP6Local":""},"Webserver":2,"HTTP_API":1,"WifiConfig":2,"WifiPower":17.0}} 09:21:36.294 MQT: stat/tasmota1726E0/STATUS6 = {"StatusMQT":{"MqttHost":"mqtt.home","MqttPort":8883,"MqttClientMask":"DVES%06X","MqttClient":"DVES_1726E0","MqttUser":"A1AXzh4qty","MqttCount":1,"MAX_PACKET_SIZE":1200,"KEEPALIVE":30,"SOCKET_TIMEOUT":4}} 09:21:36.325 MQT: stat/tasmota_1726E0/STATUS7 = {"StatusTIM":{"UTC":"2023-02-22T09:21:36","Local":"2023-02-22T09:21:36","StartDST":"2023-03-26T01:00:00","EndDST":"2023-10-29T02:00:00","Timezone":99,"Sunrise":"07:10","Sunset":"17:39"}} 09:21:36.348 MQT: stat/tasmota_1726E0/STATUS10 = {"StatusSNS":{"Time":"2023-02-22T09:21:36"}} 09:21:36.359 MQT: stat/tasmota_1726E0/STATUS11 = {"StatusSTS":{"Time":"2023-02-22T09:21:36","Uptime":"0T00:00:32","UptimeSec":32,"Heap":141,"SleepMode":"Dynamic","Sleep":100,"LoadAvg":10,"MqttCount":1,"Berry":{"HeapUsed":4,"Objects":48},"Wifi":{"AP":1,"SSId":"sensory","BSSId":"28:EE:52:9B:E9:58","Channel":6,"Mode":"11n","RSSI":100,"Signal":-50,"LinkCount":1,"Downtime":"0T00:00:05"}}}

data at point of crash/reboot [09:39:33]09:39:40.707 MQT: tele/tasmota_1726E0/INFO1 = {"Info1":{"Module":"I2C","Version":"12.4.0.1(I2C)","FallbackTopic":"cmnd/DVES_1726E0_fb/","GroupTopic":"cmnd/tasmotas/"}} [09:39:33]09:39:40.725 MQT: tele/tasmota_1726E0/INFO2 = {"Info2":{"WebServerMode":"Admin","Hostname":"tasmota-1726E0-1760","IPAddress":"172.28.31.3","IP6Global":"","IP6Local":"fe80::3e61:5ff:fe17:26e0"}} [09:39:33]09:39:40.741 MQT: tele/tasmota_1726E0/INFO3 = {"Info3":{"RestartReason":"Software reset CPU","BootCount":1}} [09:39:36]09:39:44.015 MQT: tele/tasmota_1726E0/STATE = {"Time":"2023-02-22T09:39:44","Uptime":"0T00:00:11","UptimeSec":11,"Heap":149,"SleepMode":"Dynamic","Sleep":100,"LoadAvg":24,"MqttCount":1,"Berry":{"HeapUsed":4,"Objects":48},"Wifi":{"AP":1,"SSId":"sensory","BSSId":"28:EE:52:9B:E9:58","Channel":6,"Mode":"11n","RSSI":96,"Signal":-52,"LinkCount":1,"Downtime":"0T00:00:05"}} [09:40:52]09:40:59.627 MQT: stat/tasmota_1726E0/RESULT = {"Reset":"Reset and Restarting"} [09:40:52]09:41:00.031 CFG: Use defaults [09:40:53]09:41:00.442 MQT: stat/tasmota_1726E0/RESULT = {"Module":{"0":"I2C"}} [09:40:53]09:41:00.460 MQT: stat/tasmota_1726E0/RESULT = {"SetOption142":"ON"} [09:40:53]09:41:00.474 MQT: stat/tasmota_1726E0/RESULT = {"IfxSensor":"OFF"} [09:40:53]09:41:00.492 MQT: stat/tasmota_1726E0/RESULT = {"IfxPeriod":600} [09:40:53]09:41:00.506 MQT: stat/tasmota_1726E0/RESULT = {"MqttFingerprint1":"00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"} [09:40:53]09:41:00.535 MQT: stat/tasmota_1726E0/RESULT = {"Sleep":{"100":{"Active":"100"}}}

therobveiller commented 1 year ago

Having managed to get logging level 4 enabled for long enough to capture the crash properly, the key thing was an entry along the lines of src:button. I found that from somewhere I had a Button 2 set to GPIO35 in my default ESP32 template config. I have no idea where it came from, but the odd thing is it has been there from as long ago as my v10 builds, but has never caused an issue until 12.4.0. Very weird. Anyway, please ignore the rebooting part of my comments above, and the IPv6 during a "health" check appears to be the system showing the linklocal IPv6 address and nothing to do with influx - my mistake in reading the logs. So it looks like your changes may have fixed the issue, but I am about to do some more tests here with frequent influx updates and will report back. Many thanks for your input on this, much appreciated.

barbudor commented 1 year ago

Thanks for the update. I will prepare the PR and we will wait for your confirmation to merge.

Regarding buttons, Theo has done large refactoring on buttons and switches management since v12.3.1 However, while it is not allowed ot have 2x "Button 2", it shouldn't crash. @arendst may be that something you want to investigate

therobveiller commented 1 year ago

Just to clarify, I didn't have 2x Button2, only a button1 (on GPIO 0) & a button2. The button2 was set to GPIO35 which is input-only on an ESP32, and does not have any pull-up or pull-down resistors, so is sensitive to noise. I do not have anything connected to this pin. All I can say is having removed this erroneous button2, the system does not reboot repeatedly. I had used the word crash as that's what it seemed like via the web console, but via serial monitoring I can see that the system appears to be triggering a watchdog? as it says restarting or resetting - I need to capture this to make sure the exact wording. So I don't think @arendst needs to look at any button code for the time being. As for the reported Influx DNS problem, I am seeing approx 15seconds of no-response from the device at every influx data send period. The DNS is resolved, the influx data to be sent is shown in the log, and I then get 15seconds of no-response from the web interface. I am wondering if the button2 causes an event during this no-response period, it is enough to trigger a watchdog timer into rebooting the device. This is only a theory at present as there is no other reason a button2 event should cause a reboot. It may explain why I saw similar problems with my GPS connected to an ESP32, as this too is giving the device an external trigger that could well occur during this non-response blocking period from influx send. I will try and get logs that show clearly the hangs, or at least where in the sequence this happens and get back to you. I should have more time today to do this than I did last night.

therobveiller commented 1 year ago

Having thought about this more and made the influx period an odd number so that it rarely co-incided with mqtt, wifi checking or any other possible timer type event, I was able to capture this from your build.

I am alternating between main menu and information on the web ui, so that is why you see the switching between the two in the logs. I have noted the point at which the UI ceases to respond and the point at which the logs appear for earlier events and the UI starts responding again. I note that when influx is active (ifx 1) I get a berry entry, but I do not get any berry logs when ifx 0 is set. I also tried polling button1 during this blocking period, but it did not cause the device to reboot, so perhaps my button2 theory is wrong. This needs further checking by me.

I am now going to try the same thing, but with a valid IPv6 address for the influx server in my local DNS server to see if there is any residual IPv6 checking going on that shouldn't be.

09:14:24.799 HTP: Main Menu 09:14:24.906 BRY: GC from 4348 to 4316 bytes, objects freed 0/48 (in 0 ms) - slots from 52/122 to 51/122 09:14:24.918 MQT: tele/tasmota_1726E0/STATE = {"Time":"2023-02-23T09:14:24","Uptime":"0T15:39:55","UptimeSec":56395,"Heap":141,"SleepMode":"Dynamic","Sleep":100,"LoadAvg":10,"MqttCount":1,"Berry":{"HeapUsed":4,"Objects":48},"Wifi":{"AP":1,"SSId":"sensory","BSSId":"28:EE:52:9B:E9:58","Channel":6,"Mode":"11n","RSSI":100,"Signal":-48,"LinkCount":1,"Downtime":"0T00:00:05"}} 09:14:25.171 WIF: Checking connection... 09:14:26.989 HTP: Information 09:14:28.216 HTP: Main Menu 09:14:29.336 HTP: Information 09:14:30.661 HTP: Main Menu 09:14:32.683 HTP: Information 09:14:34.215 HTP: Main Menu 09:14:35.535 HTP: Information

The web interface freezes at this point and the entried below do not appear until the blocking ends

09:14:36.857 HTP: Main Menu 09:14:36.967 BRY: GC from 4655 to 4316 bytes, objects freed 1/48 (in 0 ms) - slots from 51/122 to 51/122 09:14:36.972 IFX: Process {"Time":"2023-02-23T09:14:36","Uptime":"0T15:40:07","UptimeSec":56407,"Heap":140,"SleepMode":"Dynamic","Sleep":100,"LoadAvg":14,"MqttCount":1,"Berry":{"HeapUsed":4,"Objects":48},"Wifi":{"AP":1,"SSId":"sensory","BSSId":"28:EE:52:9B:E9:58","Channel":6,"Mode":"11n","RSSI":100,"Signal":-47,"LinkCount":1,"Downtime":"0T00:00:05"}} 09:14:36.989 WIF: DNS resolved 'influx.home' (172.28.1.1) in 8 ms 09:14:36.991 IFX: Sending uptimesec,device=tasmota_1726E0,sensor=device value=56407 heap,device=tasmota_1726E0,sensor=device value=140 sleep,device=tasmota_1726E0,sensor=device value=100 loadavg,device=tasmota_1726E0,sensor=device value=14 mqttcount,device=tasmota_1726E0,sensor=device value=1 heapused,device=tasmota_1726E0,sensor=berry value=4 objects,device=tasmota_1726E0,sensor=berry value=48 ap,device=tasmota_1726E0,sensor=wifi value=1 channel,device=tasmota_1726E0,sensor=wifi value=6 rssi,device=tasmota_1726E0,sensor=wifi value=100 signal,device=tasmota_1726E0,sensor=wifi value=-47 linkcount,device=tasmota_1726E0,sensor=wifi value=1

Logging and web interface become usable again at this point, and it is only at this point that the logs from the freeze point up to here appear in the console

09:14:53.150 HTP: Information 09:15:01.146 WIF: Checking connection... 09:15:21.153 WIF: Checking connection... 09:15:24.178 HTP: Main Menu 09:15:26.906 HTP: Information 09:15:28.233 HTP: Main Menu 09:15:30.876 HTP: Information 09:15:32.110 BRY: GC from 4348 to 4316 bytes, objects freed 0/48 (in 0 ms) - slots from 52/122 to 51/122 09:15:32.115 IFX: Process {"Time":"2023-02-23T09:15:32","Uptime":"0T15:41:03","UptimeSec":56463,"Heap":146,"SleepMode":"Dynamic","Sleep":100,"LoadAvg":15,"MqttCount":1,"Berry":{"HeapUsed":4,"Objects":48},"Wifi":{"AP":1,"SSId":"sensory","BSSId":"28:EE:52:9B:E9:58","Channel":6,"Mode":"11n","RSSI":100,"Signal":-48,"LinkCount":1,"Downtime":"0T00:00:05"}} 09:15:32.140 WIF: DNS resolved 'influx.home' (172.28.1.1) in 18 ms 09:15:32.142 IFX: Sending uptimesec,device=tasmota_1726E0,sensor=device value=56463 heap,device=tasmota_1726E0,sensor=device value=146 sleep,device=tasmota_1726E0,sensor=device value=100 loadavg,device=tasmota_1726E0,sensor=device value=15 mqttcount,device=tasmota_1726E0,sensor=device value=1 heapused,device=tasmota_1726E0,sensor=berry value=4 objects,device=tasmota_1726E0,sensor=berry value=48 ap,device=tasmota_1726E0,sensor=wifi value=1 channel,device=tasmota_1726E0,sensor=wifi value=6 rssi,device=tasmota_1726E0,sensor=wifi value=100 signal,device=tasmota_1726E0,sensor=wifi value=-48 linkcount,device=tasmota_1726E0,sensor=wifi value=1 09:15:48.424 HTP: Main Menu 09:15:57.019 WIF: Checking connection... 09:16:17.044 WIF: Checking connection...

therobveiller commented 1 year ago

Alternating between info and menu quicker, I can see that the web interface freezes perhaps 1-2 seconds before the berry log entry, as once the system unblocks, those web ui entries appear at the correct point in the sequence. So it is something that happens before the berry stuff that does not generate logging info even at level 4.

I can confirm that adding a valid IPv6 to my local DNS makes no difference to the apparent blocking situation.

Important info. I have also made a build of your branch where I disable all IPv6 by commenting out -DUSE_IPV6 in platform_tasmota32.ini as I did with the original master release of 12.4.0. Sadly I have to report that this still gives the long blocking periods with your branch, something that the 12.4.0 master build does not seem to exhibit when all IPv6 was disabled. So you have fixed influx always giving ipv6 address issue, but may have introduced a different issue as a result - sorry, but please don't merge this PR into the main release until I have checked more on this.

Update: 12.4.0 DOES exhibit the blocking even with IPv6 disabled in the build, but it was only obvious when I reduced the ifxperiod to 25, so you have not introduced any new issues, my apologies. However there is something still not right with influx in version 12.4.0.x as this blocking definitely does not occur in v12.3.1, nor does the button2 issue

You can see a similar blocking period of approx 15seconds of apparent inactivity during bootup when influx "health" is checked:-

[09:49:39]00:00:02.597 WIF: Connecting to AP1 sensory Channel 6 BSSId 28:EE:52:9B:E9:58 in mode 11n as tasmota-1726E0-1760... [09:49:42]00:00:05.109 WIF: Connected [09:49:42]00:00:05.320 HTP: Web server active on tasmota-1726E0-1760 with IP address 172.28.31.3 [09:49:43]09:49:47.015 IFX: Validating connection to http://172.28.1.1:8086/health [09:49:59]09:50:03.412 MQT: Attempting connection... [09:50:00]09:50:03.876 MQT: TLS connected in 451 ms, stack low mark 1868 [09:50:00]09:50:03.878 MQT: Connected

I also added back a Button 2, to the template, and the system reboots, but ONLY if influx is enabled and only around the period of influx data send :-

When the reboot occurs it resets pretty much everything back to defaults, so logging goes back to level2, and my backlog script to set ifxperiod to 600 as well as setting my default device template also run. In the previous fault entries, I had this erroneous button2 in my default template, so the problem would reappear every time it reached the influx send, and this was adding to the confusion of "constantly rebooting". Hopefully this can give you more info to go on. I think there may be more than the 1 bug with influx dns resolution that this is revealing. This scenario of adding a button2 to GPIO35 is repeatable for me here.

An FYI, the GPS does NOT have this erroneous Button2 on GPIO35, what I thought was the same rebooting is in fact the very long "blocking" we are seeing in the IPv6 enabled builds. So my theory that it might be any external triggers during the blocking that cause this are evidently wrong. It is very difficult for me to physically access the GPS device, so I can only get console access via the web interface.

If you need any more logs or data please let me know.

[09:59:09]09:59:13.480 HTP: Configuration [09:59:12]09:59:16.351 MQT: tasmota/discovery/3C61051726E0/config = {"ip":"172.28.31.3","dn":"Tasmota","fn":["Tasmota",null,null,null,null,null,null,null],"hn":"tasmota-1726E0-1760","mac":"3C61051726E0","md":"I2C","ty":0,"if":0,"ofln":"Offline","onln":"Online","state":["OFF","ON","TOGGLE","HOLD"],"sw":"12.4.0.1","t":"tasmota_1726E0","ft":"%prefix%/%topic%/","tp":["cmnd","stat","tele"],"rl":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"swc":[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1],"swn":[null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null],"btn":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"so":{"4":0,"11":0,"13":0,"17":0,"20":0,"30":0,"68":0,"73":0,"82":0,"114":0,"117":0},"lk":0,"lt_st":0,"sho":[0,0,0,0],"sht":[[0,0,0],[0,0,0],[0,0,0],[0,0,0]],"ver":1} (retained) [09:59:12]09:59:16.434 MQT: tasmota/discovery/3C61051726E0/sensors = {"sn":{"Time":"2023-02-23T09:59:16"},"ver":1} (retained) [09:59:13]09:59:17.656 HTP: Configure Logging [09:59:21]09:59:24.938 HTP: Configuration [09:59:23]09:59:26.869 WIF: Checking connection... [09:59:29]09:59:33.031 HTP: Main Menu [09:59:43]09:59:46.880 WIF: Checking connection... [09:59:45]09:59:49.002 HTP: Consoles [09:59:46]09:59:50.615 HTP: Console [09:59:52]09:59:55.876 CMD: ifxperiod [09:59:52]09:59:55.877 SRC: WebConsole from 172.28.3.36 [09:59:52]09:59:55.878 CMD: Grp 0, Cmd 'IFXPERIOD', Idx 1, Len 0, Pld -99, Data '' [09:59:52]09:59:55.891 MQT: stat/tasmota_1726E0/RESULT = {"IfxPeriod":600} [10:00:03]10:00:06.803 WIF: Checking connection... [10:00:08]10:00:12.564 CMD: ifxperiod 18 [10:00:08]10:00:12.566 SRC: WebConsole from 172.28.3.36 [10:00:08]10:00:12.566 CMD: Grp 0, Cmd 'IFXPERIOD', Idx 1, Len 2, Pld 18, Data '18' [10:00:08]10:00:12.588 MQT: stat/tasmota_1726E0/RESULT = {"IfxPeriod":18} [10:00:09]10:00:12.785 CFG: Saved, Count 11, Bytes 4096 [10:00:09]10:00:13.309 BRY: GC from 4157 to 3702 bytes, objects freed 7/40 (in 1 ms) - slots from 63/122 to 42/122 [10:00:09]10:00:13.312 IFX: Process {"Time":"2023-02-23T10:00:13","Uptime":"0T00:01:44","UptimeSec":104,"Heap":147,"SleepMode":"Dynamic","Sleep":100,"LoadAvg":22,"MqttCount":1,"Berry":{"HeapUsed":3,"Objects":40},"Wifi":{"AP":1,"SSId":"sensory","BSSId":"28:EE:52:9B:E9:58","Channel":6,"Mode":"11n","RSSI":94,"Signal":-53,"LinkCount":1,"Downtime":"0T00:00:05"}} [10:00:09]10:00:13.357 WIF: DNS resolved 'influx.home' (172.28.1.1) in 10 ms [10:00:09]10:00:13.358 IFX: Sending [10:00:09]uptimesec,device=tasmota_1726E0,sensor=device value=104 [10:00:09]heap,device=tasmota_1726E0,sensor=device value=147 [10:00:09]sleep,device=tasmota_1726E0,sensor=device value=100 [10:00:09]loadavg,device=tasmota_1726E0,sensor=device value=22 [10:00:09]mqttcount,device=tasmota_1726E0,sensor=device value=1 [10:00:09]heapused,device=tasmota_1726E0,sensor=berry value=3 [10:00:09]objects,device=tasmota_1726E0,sensor=berry value=40 [10:00:09]ap,device=tasmota_1726E0,sensor=wifi value=1 [10:00:09]channel,device=tasmota_1726E0,sensor=wifi value=6 [10:00:09]rssi,device=tasmota_1726E0,sensor=wifi value=94 [10:00:09]signal,device=tasmota_1726E0,sensor=wifi value=-53 [10:00:09]linkcount,device=tasmota_1726E0,sensor=wifi value=1 [10:00:39]10:00:42.828 WIF: Checking connection... [10:00:40]10:00:44.339 SRC: Button [10:00:40]10:00:44.341 CMD: Grp 0, Cmd 'RESET', Idx 1, Len 1, Pld 1, Data '1' [10:00:40]10:00:44.348 MQT: stat/tasmota_1726E0/RESULT = {"Reset":"Reset and Restarting"} [10:00:40]10:00:44.547 CFG: Use defaults [10:00:40]10:00:44.548 CFG: CR 136/699, Busy 0 [10:00:40]10:00:44.548 CFG: CR 143/699, Busy 0 [10:00:40]10:00:44.548 CFG: CR 151/699, Busy 0 [10:00:40]10:00:44.559 CFG: CR 159/699, Busy 0 [10:00:40]10:00:44.559 CFG: CR 167/699, Busy 0 [10:00:40]10:00:44.559 CFG: CR 174/699, Busy 0 [10:00:40]10:00:44.560 CFG: CR 212/699, Busy 0 [10:00:40]10:00:44.570 CFG: CR 219/699, Busy 0 [10:00:40]10:00:44.570 CFG: CR 219/699, Busy 0 [10:00:40]10:00:44.571 CFG: CR 282/699, Busy 0 [10:00:40]10:00:44.581 CFG: CR 282/699, Busy 0 [10:00:40]10:00:44.581 CFG: CR 289/699, Busy 0 [10:00:40]10:00:44.582 CFG: CR 289/699, Busy 0 [10:00:40]10:00:44.582 CFG: CR 289/699, Busy 0 [10:00:40]10:00:44.592 CFG: CR 289/699, Busy 0 [10:00:40]10:00:44.593 CFG: CR 299/699, Busy 0 [10:00:40]10:00:44.593 CFG: CR 299/699, Busy 0 [10:00:40]10:00:44.603 CFG: CR 308/699, Busy 0 [10:00:40]10:00:44.604 CFG: CR 317/699, Busy 0 [10:00:40]10:00:44.604 CFG: CR 327/699, Busy 0 [10:00:40]10:00:44.614 CFG: CR 347/699, Busy 0 [10:00:40]10:00:44.615 CFG: CR 359/699, Busy 0 [10:00:40]10:00:44.615 CFG: CR 360/699, Busy 0 [10:00:40]10:00:44.615 CFG: CR 361/699, Busy 0 [10:00:40]10:00:44.626 CFG: CR 369/699, Busy 0 [10:00:40]10:00:44.626 CFG: CR 386/699, Busy 0 [10:00:40]10:00:44.626 CFG: CR 390/699, Busy 0 [10:00:40]10:00:44.637 CFG: CR 394/699, Busy 0 [10:00:40]10:00:44.637 CFG: CR 398/699, Busy 0 [10:00:40]10:00:44.637 CFG: CR 401/699, Busy 0 [10:00:40]10:00:44.648 CFG: CR 403/699, Busy 0 [10:00:40]10:00:44.648 CFG: CR 409/699, Busy 0 [10:00:40]10:00:44.648 CFG: CR 413/699, Busy 0 [10:00:40]10:00:44.649 CFG: CR 421/699, Busy 0 [10:00:40]10:00:44.659 CFG: CR 421/699, Busy 0 [10:00:40]10:00:44.659 CFG: CR 421/699, Busy 0 [10:00:40]10:00:44.660 CFG: CR 421/699, Busy 0 [10:00:40]10:00:44.670 CFG: CR 421/699, Busy 0 [10:00:40]10:00:44.670 CFG: CR 421/699, Busy 0 [10:00:40]10:00:44.672 CFG: CR 412/699, Busy 0 [10:00:40]10:00:44.673 SRC: Restart [10:00:40]10:00:44.674 CMD: Grp 0, Cmd 'BACKLOG', Idx 0, Len 83, Pld -99, Data 'Module 0; SetOption142 1; IfxSensor 0; IfxPeriod 600; MqttFingerprint1 0; sleep 100' [10:00:41]10:00:44.865 CFG: Saved, Count 1, Bytes 4096 [10:00:41]10:00:44.900 SRC: Backlog [10:00:41]10:00:44.901 CMD: Grp 0, Cmd 'MODULE', Idx 1, Len 1, Pld 0, Data '0' [10:00:41]10:00:44.907 MQT: stat/tasmota_1726E0/RESULT = {"Module":{"0":"I2C"}} [10:00:41]10:00:44.943 SRC: Backlog [10:00:41]10:00:44.944 CMD: Grp 0, Cmd 'SETOPTION', Idx 142, Len 1, Pld 1, Data '1' [10:00:41]10:00:44.949 MQT: stat/tasmota_1726E0/RESULT = {"SetOption142":"ON"} [10:00:41]10:00:44.959 SRC: Backlog [10:00:41]10:00:44.960 CMD: Grp 0, Cmd 'IFXSENSOR', Idx 1, Len 1, Pld 0, Data '0' [10:00:41]10:00:44.967 MQT: stat/tasmota_1726E0/RESULT = {"IfxSensor":"OFF"} [10:00:41]10:00:44.977 SRC: Backlog [10:00:41]10:00:44.978 CMD: Grp 0, Cmd 'IFXPERIOD', Idx 1, Len 3, Pld 600, Data '600' [10:00:41]10:00:44.988 MQT: stat/tasmota_1726E0/RESULT = {"IfxPeriod":600} [10:00:41]10:00:44.995 SRC: Backlog [10:00:41]10:00:44.996 CMD: Grp 0, Cmd 'MQTTFINGERPRINT', Idx 1, Len 1, Pld 0, Data '0' [10:00:41]10:00:45.002 MQT: stat/tasmota_1726E0/RESULT = {"MqttFingerprint1":"00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"} [10:00:41]10:00:45.032 SRC: Backlog [10:00:41]10:00:45.033 CMD: Grp 0, Cmd 'SLEEP', Idx 1, Len 3, Pld 100, Data '100' [10:00:41]10:00:45.039 MQT: stat/tasmota_1726E0/RESULT = {"Sleep":{"100":{"Active":"100"}}} [10:00:42]10:00:45.823 CFG: Saved, Count 2, Bytes 4096 [10:00:42]10:00:46.547 APP: Restarting [10:00:43]ets Jun 8 2016 00:22:57 [10:00:43] [10:00:43]rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT) [10:00:43]configsip: 0, SPIWP:0xee [10:00:43]clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00 [10:00:43]mode:DIO, clock div:2 [10:00:43]load:0x3fff0018,len:4 [10:00:43]load:0x3fff001c,len:1044 [10:00:43]load:0x40078000,len:10124 [10:00:43]load:0x40080400,len:5828 [10:00:43]entry 0x400806a8 [10:00:43] [10:00:43]00:00:00.002 HDW: ESP32-D0WDQ6 [10:00:43]00:00:00.052 UFS: FlashFS mounted with 304 kB free [10:00:43]00:00:00.079 CFG: Loaded from File, Count 2 [10:00:44]00:00:00.500 BRY: Berry initialized, RAM used=4707 bytes [10:00:44]00:00:00.517 TFS: File '.drvset003' not found [10:00:44]00:00:00.518 CFG: Energy use defaults as file system not ready or file not found [10:00:44]00:00:00.524 Project tasmota - Tasmota Version 12.4.0.1(I2C)-2_0_7(2023-02-22T17:28:33) [10:00:47]00:00:03.204 WIF: Connecting to AP1 sensory Channel 6 BSSId 28:EE:52:9B:E9:58 in mode 11n as tasmota-1726E0-1760... [10:00:49]00:00:05.712 WIF: Connected [10:00:49]00:00:05.923 HTP: Web server active on tasmota-1726E0-1760 with IP address 172.28.31.3 [10:00:50]10:00:53.920 MQT: Attempting connection... [10:00:51]10:00:54.372 MQT: TLS connected in 441 ms, stack low mark 1868 [10:00:51]10:00:54.373 MQT: Fingerprint learned: 1D D7 39 A7 85 3A 31 92 3D 20 D9 F8 3F 5D CB F1 20 68 64 DC [10:00:51]10:00:54.828 MQT: Connected [10:00:51]10:00:54.836 MQT: tele/tasmota_1726E0/LWT = Online (retained) [10:00:51]10:00:54.838 MQT: cmnd/tasmota_1726E0/POWER = [10:00:51]10:00:54.849 MQT: tele/tasmota_1726E0/INFO1 = {"Info1":{"Module":"I2C","Version":"12.4.0.1(I2C)","FallbackTopic":"cmnd/DVES_1726E0_fb/","GroupTopic":"cmnd/tasmotas/"}} [10:00:51]10:00:54.867 MQT: tele/tasmota_1726E0/INFO2 = {"Info2":{"WebServerMode":"Admin","Hostname":"tasmota-1726E0-1760","IPAddress":"172.28.31.3","IP6Global":"","IP6Local":"fe80::3e61:5ff:fe17:26e0"}} [10:00:51]10:00:54.883 MQT: tele/tasmota_1726E0/INFO3 = {"Info3":{"RestartReason":"Software reset CPU","BootCount":1}} [10:00:54]10:00:57.928 MQT: tele/tasmota_1726E0/STATE = {"Time":"2023-02-23T10:00:57","Uptime":"0T00:00:10","UptimeSec":10,"Heap":149,"SleepMode":"Dynamic","Sleep":100,"LoadAvg":26,"MqttCount":1,"Berry":{"HeapUsed":4,"Objects":48},"Wifi":{"AP":1,"SSId":"sensory","BSSId":"28:EE:52:9B:E9:58","Channel":6,"Mode":"11n","RSSI":92,"Signal":-54,"LinkCount":1,"Downtime":"0T00:00:05"}}

therobveiller commented 1 year ago

I think it's best if I summarise what I believe my findings are thus far... I hope I have all the data correct! There are no devices connected to the test units, and the only actual switch is connected to GPIO 0. No timers are set, or anything else unusual. I2C is enabled on the standard pins for the given device type.

I am still investigating the button2 issue as I am getting some odd findings, and need to verify with the stock build from the website. But to sumarise that briefly I am getting reboots with or without influx enabled, on both ESP32 and ESP8266, and also (never observed before) on release 12.3.1 custom build. I think the switch issue if it exists in the stock build, needs its own issue report, so I will not cloud this influx DNS problem any further with this.

Blocking delay during ifx data send/lookup issue: (ifx set to 1, ifxperiod set to 25, ifhost set to influx.home, setoption149 set to 0, only IPv4 address in local DNS server)

v12.4.0 master release, custom option build (only minor variance from official to give my local system defaults), IPv6 enabled for ESP32

v12.4.0 master release, custom option build (only minor variance from official to give my local system defaults), IPv6 disabled for ESP32

v12.4.0.1 your branch, custom option build (only minor variance from official to give my local system defaults), IPv6 enabled for ESP32

arendst commented 1 year ago

Haven't read anything regarding blocking but your button issue is as designed. What happens is if you define a wrong button type it reports pressed instead of not pressed.

If tasmota detects a button pressed for over 40 seconds it resets the device as requested. See docs. To get rid of this define a correct button type reporting the correct state. So if you defined a Button try a Button_i which inverts the signal when not touched and therefore will not report a pressed button.

See docs about buttons and switches.

Happy hunting for the blocking issue.

therobveiller commented 1 year ago

If tasmota detects a button pressed for over 40 seconds it resets the device as requested. See docs. To get rid of this define a correct button type reporting the correct state. So if you defined a Button try a Button_i which inverts the signal when not touched and therefore will not report a pressed button.

See docs about buttons and switches.

Thank-you @arendst , I was in the process of eliminating any custom bits from a test build to find "the cause" and was chasing my own tail wasting time as the results weren't always consistent. I can go back to doing something more useful now :-)

barbudor commented 1 year ago

Hi @therobveiller

A lot of details in which I'm a bit lost

An FYI, the GPS does NOT have this erroneous Button2 on GPIO35

Sorry but this sentence doesn't makes much sense for me What GPS are you talking about and what's the relation between a GPS and Button2 og GPIO35 ?

I thought we were talking about InfluxDB So please do not mix different experimentations at the same time and in the same Issue => If the blocking problem really related to InfluxDB (with or without my changes) ? To be sure remove anything that is not about InfluxDB

Thanks

barbudor commented 1 year ago

I can see the blocking on an ESP32 with only InfluxDB (no buttons or what so ever)

It looks like the http_POST remains blocked for 15 seconds after the end of the transaction (influx has replyied with 204 No Content)

therobveiller commented 1 year ago

I can see the blocking on an ESP32 with only InfluxDB (no buttons or what so ever)

I have occasionally seen the same blocking on an ESP8266 with your branch too unfortunately, but it may be completely unrelated to this and is not consistently repeatable for me.

The GPS I mentioned was in reference to the now discredited button2 issue - I had thought that perhaps the reboots were caused by external triggers during influx data send, and the GPS production device which I mentioned has another possible source of an external trigger (as would a button2 be) on which I had seen the reboots, It was never part of the tests here, but I thought it might be relevant if the reboots were happening when there was an influx send - see Theos comments about incorrectly assigned buttons deliberately causing the system to reboot, something I never knew about. Sorry if it caused you confusion, but rather than delete the post, I thought there might be some relevant clues in there that I could not see as the reboots always seemed to occur when there was an influx send.

barbudor commented 1 year ago

I see where the problem is: it takes 15 sec to connect to the server But I have no idea why

I will be off starting Friday evening and may not have much time to investigate more during the coming week. And one that could help me too will also be off for a week. Some delay to be expected here too..

therobveiller commented 1 year ago

I am in no rush as I have a workaround, I am just pleased that you can reproduce the problem that I am seeing. Bon vacances

barbudor commented 1 year ago

@therobveiller Another problem was identified today on WebQuery which is also using the HTTP Client. The problem was related to a new Arduino Core that was introduced in Tasmota 2 weeks ago 🤔

Dev branch has been reverted to before that new Arduino Core so it may be possible that the InfluxDB problem is also solved Could you please make a test using the latest Dev branch ?

Thanks

therobveiller commented 1 year ago

@barbudor I will download and try to allocate some time tomorrow to test and report back. regards Rob

therobveiller commented 1 year ago

@barbudor, I managed to get some time late last night.

Latest development branch used was files timestamped as 4 March at 22:52, downloaded at 13:42 on 5 March. I built 3 versions for ESP32 as follows: build 1) Latest development download, with IPv6 disabled in platformio_tasmota32.ini build 2) Latest development download, with the default IPv6 enabled in platformio_tasmota32.ini build 3) Latest development download, with the default IPv6 enabled in platformio_tasmota32.ini but with your updated xdrv_59_influxdb.ino from your development branch we were testing earlier. I also built latest dev with your influx & IPv6 disabled, but the results were identical to build 3.

For build 1) influx resolves an IPv4 address as would be expected (doesn't show in log, but it connects fine, so logically must be the IPv4 address) [21:22:50]21:22:54.004 IFX: Url http://influx.home:8086/api/v2/write?org=home&bucket=iot-data [21:22:50]21:22:54.016 IFX: Validating connection to http://influx.home:8086/health [21:23:06]21:23:10.215 HTP: Main Menu

For build 2) influx still incorrectly resolves an IPv6 address instead of IPv4 [22:17:45]22:17:49.005 IFX: Url http://influx.home:8086/api/v2/write?org=home&bucket=iot-data [22:17:45]22:17:49.016 IFX: Validating connection to http://influx.home:8086/health [22:17:45]22:17:49.450 WIF: IPv6 Local fe80::3e61:5ff:fe17:26e0 [22:18:15]22:18:19.066 IFX: Error connection refused [22:18:15]22:18:19.075 HTP: Main Menu

For build 3) influx now correctly resolves an IPv4 address, but we still get the 15second delay in connecting to the server for every influx send during which time the unit hangs. [23:08:11]23:08:16.247 WIF: DNS resolved 'influx.home' (172.28.1.1) in 10 ms [23:08:11]23:08:16.249 IFX: Sending [23:08:11]uptimesec,device=tasmota_1726E0,sensor=device value=30 [23:08:11]heap,device=tasmota_1726E0,sensor=device value=151 [23:08:11]sleep,device=tasmota_1726E0,sensor=device value=100 [23:08:11]loadavg,device=tasmota_1726E0,sensor=device value=18 [23:08:11]mqttcount,device=tasmota_1726E0,sensor=device value=1 [23:08:11]heapused,device=tasmota_1726E0,sensor=berry value=3 [23:08:11]objects,device=tasmota_1726E0,sensor=berry value=43 [23:08:11]ap,device=tasmota_1726E0,sensor=wifi value=1 [23:08:11]channel,device=tasmota_1726E0,sensor=wifi value=6 [23:08:11]rssi,device=tasmota_1726E0,sensor=wifi value=100 [23:08:11]signal,device=tasmota_1726E0,sensor=wifi value=-48 [23:08:11]linkcount,device=tasmota_1726E0,sensor=wifi value=1 [23:08:27]23:08:32.529 HTP: Configuration

So in summary, the situation is no different than with 12.4.0 or with your branch that we tested before. The reverted Arduino core does not solve the problem.

The following observations may be of some help I hope:

I noticed that even in build 1 (the current development build that has the reverted arduino core, and with IPv6 disabled by me) and also with v12.4.0, that there is a delay of 15secs on the very first connection to the influx server (where http://influx.home:8086/health is called) after boot. I noticed the same delay when performing the initial configuration on these units. By default ifx is set to 0, when I issue an ifx 1 command (which calls http://influx.home:8086/health), I get a delay of 15 seconds where the unit hangs. Both 12.4.0 & dev (with IPv6 disabled in the build) work fine after that with no delay on influx send. I do not get this delay when issuing an ifx 1 with release 12.3.1, that I get with release 12.4.0 and the current development build. It is only if I introduce your updated influx driver that a delay happens on every send irrespective of IPv6 being disabled in the build in the 12.4.x builds. I also noted that with the 12.4.x builds, if I issue an ifx 0, wait for a while and then issue another ifx 1 I do NOT get the 15 second delay. I only get a delay if I issue an ifx 0, reboot, and then issue an ifx 1. So it seems to me that there is something being stored/cached on that initial influx connection which means there is no delay on subsequent influx send operations - is this a possibility? So most likely there is nothing wrong with the influx changes you made on the face of it as they fix the IPv6/v4 issue, but your changes have highlighted a bug elsewhere that is seen on that initial influx connect. As you are now using that same mechanism as the initial connect on every influx send (or so I believe), we see that same 15sec connect delay on every influx operation which sadly makes it unusable - does this seem logical?

barbudor commented 1 year ago

Thanks I will resume working on this this week

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it hasn't any activity in last few weeks. It will be closed if no further activity occurs. Thank you for your contributions.

barbudor commented 1 year ago

Bump up. I need to restart investigations

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it hasn't any activity in last few weeks. It will be closed if no further activity occurs. Thank you for your contributions.

arendst commented 1 year ago

Bump

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it hasn't any activity in last few weeks. It will be closed if no further activity occurs. Thank you for your contributions.

therobveiller commented 1 year ago

Bump. I guess you guys are currently too busy on Matter protocol at the moment.

arendst commented 1 year ago

Added local DNS lookup (as suggested by @barbudor) to solve possible IPv6 resolves.

Even with this change I still observe the delays on ESP32 at initial connect for both Validating and Dat requests. After that they respond swiftly. I've added IFX: Done after every request to better detect delays.

On ESP8266 the response is always swift.

I think we can conclude the issue is with Arduino Core HTTPClient.

Let me check if we can use Tasmota's own ESP32 HTTPClientLight as a solution.....

barbudor commented 1 year ago

Indeed htat was my conclusion too When the SDK's HTTPClient does the "connect( )" call, the SDK DNS takes ages. Seems there's a timeout that is enforced.

I planned to try with Tasmota's HTTPClightLight but I'm currently snowed under real life work. Just passing by from time to time ;)

Jason2866 commented 1 year ago

@barbudor There is probably something fishy in underlying IDF SDK. Not the first time with issues here. Maybe it is not just the HTTPClient. There is an unsolved issue with DNS from underlying IDF lwip when IPv6 is enabled. https://github.com/espressif/arduino-esp32/issues/8221

arendst commented 1 year ago

Try latest dev branch.

For ESP32 it now uses our light HTTPClient. In my tests it responds swift as the ESP8266 does.

Pls report back.

therobveiller commented 1 year ago

Tried on my testbed ESP32 and all seems to behave properly as far as I can see, with IPv6 enabled in the build, setoption149 set to either 1 or 0, and on a network with no IPv6 address for the influx server. I will now deploy on a production unit to test further, and see if I can give my influx server an IPv6 address again to give it the final test.

Thank you very much for your efforts, they are much appreciated. If all goes well after a few days, I will close the issue.

Rob

therobveiller commented 1 year ago

Seems to be behaving as it should now. It connects via IPv4 even with an IPv6 address on the server & DNS, with setoption149 set to 0, so I am very happy.

Thanks again.

Rob