Closed JosefRypacek closed 1 month ago
I have a similar problem with a Shelly Plus 2PM with 14.0.0. In the ShutterSetup process the Shelly restarts when the shutter reaches the top position.
@maltic42 Don't know details, but it seems like a different issue. This happens on properly configured device after several weeks of usage and is not caused by top/bottom position.
I can confirm that i see an issue in the mqtt connection after 3-4 hours, update to 14.0.0.4 seems to solve the problem. @JosefRypacek maybe try the update to beta on one device to confirm.
You are right I'm using MQTT and HA integration, but in my case it take ~ 1-2 months before hang and it affects MQTT, buttons in Tasmota UI and physical switch buttons. I've just updated to 14.0, but still I would like to know if I can do something to debug my issue better next time.
After "hang" whole web UI seems working, you "only" can't move with shutters and after multiple clicks to DOWN or UP button device is reseted.
ave a similar problem with a Shelly Plus 2PM with 14.0.0. In t
The shutter setup is a completely different animal. It is known that restart could happen here. I still haven figured out how to catch the crash. It is issued by frequent reading of the current measure to find the right time where the shutter stop. There is NO known crash on normal operation
I checked with current version to "force" a crash by frequent pushing the DOWN button or by frequent using the WebUI DOWN button. I was not able to get one crash. Therefore very difficult to analyze.
I am a little confused because I do not see the MQTT message? Here the log how it should look like. Maybe you have an MQTT server issue:
08:27:02.395 SRC: WebGui from 192.168.178.59
08:27:02.397 SRC: WebGui from 192.168.178.59
08:27:02.399 CMD: Grp 0, Cmd 'SHUTTERPOSITION', Idx 2, Len 9, Pld -99, Data 'StopClose'
08:27:02.420 MQT: hm/status/ESP_F506CC/RESULT = {"Shutter2":{"Position":0,"Direction":0,"Target":0,"Tilt":0}}
08:27:02.558 SRC: WebGui from 192.168.178.59
08:27:02.561 SRC: WebGui from 192.168.178.59
08:27:02.562 CMD: Grp 0, Cmd 'SHUTTERPOSITION', Idx 2, Len 9, Pld -99, Data 'StopClose'
08:27:02.583 MQT: hm/status/ESP_F506CC/RESULT = {"Shutter2":{"Position":0,"Direction":0,"Target":0,"Tilt":0}}
Yes, it can't be crashed by the mentioned steps anytime. The device has to be in some "bad" state which occurs at the same time for all devices (so I think it's related to Wi-Fi restart, MQTT restart or some spikes in power).
In the mentioned output I didn't used MQTT, but just WebGUI to verify it's not caused by MQTT server.
And after three SHUTTERPOSITION
commands device gets restarted. Anyway, I've updated all devices and they are runing fine, so I will update this issue when/if it happens again.
We see frequently with some Chinese devices that the LDO doing the 3.3V is quite noisy and not powerful enough. Sometimes it helps to put a 470microFarad capacitor between GROUND and 3.3V. The Shelly normally do not have these problems. I have many of them without any problem. The Sonoff I would not advertise.
Might be a power problem. I added to all my shutter ESPs a 6700µF 6V capacitor and never had any problems. Might be oversized but works.
@jtauscher : Please can you close the issue as it looks like this was an electrical problem we cannot address with a software change. THX
@stefanbode I'll close the task, but want to note that this is probably NOT related to electrical problem. Only devices with shutter configured are affected, other devices of the same type with non-shutter configuration are OK. But agree it's difficult to diagnose.
I have been experiencing the same issue every few weeks/months, the last occurrence was 2 days ago. So maybe I can add a few bits and pieces that possibly help troubleshooting this sporadic bug.
The problem appears to be only on the devices running shutters: the shutterposition is apparently lost as well as rule activation (rules are disabled after the event).
Devices affected: 10x Sonoff Dual R2 (Tasmota 14.1.0) 1x Sonoff Touch (Tasmota 14.1.0)
Devices NOT affected: 1x Sonoff Dual R2 (Tasmota 13.3.0.2) (on the same power line as 3 of the affected devices above) 2x China no-name Touch switches (Tasmota 14.1.0) Several Gosund Power Plugs (Tasmota 14.1.0)
Assuming the "survival" of that single Dual R2 is not pure coincidence: JosefRypacek's Sonoffs were running 13.4.0 when they failed and my surviving Dual R2 was on 13.3.0. Maybe something changed in the Source Code between 13.3 and 13.4 that could be the cause or enabler of these problems?
It happened again on 14.0.0... It still seem like shutter related functionality to me, but I have to admit it's weird that all devices with enabled shutter functionality are hit at the same time. There was no power or WiFi outage today. In the morning blinds went up, but in the evening didn't go down.
I tried status 0
(successful), weblog 3
(successful), shutterposition2
(successful), but when I run shutterposition
there is a delay in the console and sometimes (first or second try) it will display the CMD line after 5-10s. On another call it will display boot info instead. I've also found that browser won't get any response for request with shutterposition
command. After boot there were following messages in the log.
20:14:14.664 MQT: tele/tasmota_495E24/INFO3 = {"Info3":{"RestartReason":{"Exception":1,"Reason":"Syscall","EPC":"40092066","EXCVADDR":"00000000","CallChain":["40092063","400927d1","40091c6c","40091c1e"]},"BootCount":30}}
20:18:04.858 MQT: tele/tasmota_48D498/INFO3 = {"Info3":{"RestartReason":{"Exception":1,"Reason":"Syscall","EPC":"40092066","EXCVADDR":"00000000","CallChain":["40092063","400927d1","40091c6c","40091c1e"]},"BootCount":33}}
20:30:49.908 MQT: tele/tasmota_4B34F0/INFO3 = {"Info3":{"RestartReason":{"Exception":2,"Reason":"InstructionFetchError","EPC":"40091e72","EXCVADDR":"ffffffe0","CallChain":["40091e6f","401374ed","40091e6f","40091e6f","40091e6f","40091e6f","40091e6f","40091e6f","4008007d","4020a6e0","40091e6f","40091e6f","40091e6f","40091e6f","40091e6f","40091e6f","4008007d","4020a6e0","40091e6f","40091e6f","40091e6f","40091e6f","40091e6f","40091e6f","4008007d","4020a6e0","40091e6f","40091e6f","40091e6f","40091e6f","40091e6f","40091e6f","4008007d","4020a6e0","40091e6f","40091e6f","40091e6f","40091e6f","40091e6f","40091e6f","4008007d","4020a6e0","40091e6f","40091e6f","40091e6f","40091e6f","40091e6f","40091e6f"]},"BootCount":33}}
I probably have another 2 devices in this state (not sure how long they will remain) if you would need any additional data.
Hi @JosefRypacek . I can see with the Exception1 that this is definitely something in the TASMOTA firmware that causes the boot and restart. The observed missing response in the browser is most likely the reboot that the system is doing after receiving the shutterposition command. I need following information to reproduce the error here:
Backlog Template; Module; GPIO 255 You provided some weeks ago this. Still valid??
20:21:48.840 MQT: stat/tasmota_48D498/RESULT = {"NAME":"Sonoff Dual R3","GPIO":[34,0,0,0,0,0,0,0,0,576,225,0,0,0,0,0,0,0,0,0,0,7296,7328,224,0,0,0,0,32,33,0,0,0,0,0,0],"FLAG":0,"BASE":1} 20:21:49.074 MQT: stat/tasmota_48D498/RESULT = {"Module":{"0":"Sonoff Dual R3"}} 20:21:49.333 MQT: stat/tasmota_48D498/RESULT = {"GPIO0":{"34":"Button3"},"GPIO1":{"0":"None"},"GPIO2":{"0":"None"},"GPIO3":{"0":"None"},"GPIO4":{"0":"None"},"GPIO5":{"0":"None"},"GPIO6":{"0":"None"},"GPIO7":{"0":"None"},"GPIO8":{"0":"None"},"GPIO9":{"0":"None"},"GPIO10":{"0":"None"},"GPIO11":{"0":"None"},"GPIO12":{"0":"None"},"GPIO13":{"576":"LedLink_i"},"GPIO14":{"225":"Relay2"},"GPIO15":{"0":"None"},"GPIO16":{"0":"None"},"GPIO17":{"0":"None"},"GPIO18":{"0":"None"},"GPIO19":{"0":"None"},"GPIO20":{"0":"None"},"GPIO21":{"0":"None"},"GPIO22":{"0":"None"},"GPIO23":{"0":"None"},"GPIO24":{"0":"None"},"GPIO25":{"7296":"CSE7761 Tx"},"GPIO26":{"7328":"CSE7761 Rx"},"GPIO27":{"224":"Relay1"},"GPIO32":{"32":"Button1"},"GPIO33":{"33":"Button2"},"GPIO34":{"0":"None"},"GPIO35":{"0":"None"},"GPIO36":{"0":"None"},"GPIO37":{"0":"None"},"GPIO38":{"0":"None"},"GPIO39":{"0":"None"}}
Please provide the output. If there are additional rules defined that might have an influence please also share these:
Backlog Rule1; Rule2; Rule3
I do see that in the inital report of the case you already provided some info like this on:
20:13:19.505 MQT: stat/tasmota_48D498/STATUS13 = {"StatusSHT":{"SHT0":{"Relay1":1,"Relay2":2,"Open":360,"Close":360,"50perc":50,"Delay":2,"Opt":"0100","Calib":[300,500,700,900,1000],"Mode":"1","TiltConfig":[-90,0,19,0,-90]}}}
This is only ONE shutter and above you're talking about shutterposition2. This does not match.
It is very important that we get the information of an effected shutter. In the past we had an overflow with long run times like 36sek. But this was fixed years ago. Therefore it is very important to have the exact parameters of the shutter that causes problems
I have ordered a dual R3 on amazon. should be there tomorrow. maybe something special with the measurement function of the power and current.
Hi, yes there is nothing changed in the configuration and there are no rules.
I have only one shutter but I tried to call shutterposition2
which returns {"Command":"Error"}
but it does not crash the tasmota! the shutterposition1
will not respond / will crash the tasmota.
20:28:46.612 CMD: Backlog Template; Module; GPIO 255
20:28:46.651 MQT: stat/tasmota_4B34F0/RESULT = {"NAME":"Sonoff Dual R3","GPIO":[34,0,0,0,0,0,0,0,0,576,225,0,0,0,0,0,0,0,0,0,0,7296,7328,224,0,0,0,0,32,33,0,0,0,0,0,0],"FLAG":0,"BASE":1}
20:28:46.867 MQT: stat/tasmota_4B34F0/RESULT = {"Module":{"0":"Sonoff Dual R3"}}
20:28:47.126 MQT: stat/tasmota_4B34F0/RESULT = {"GPIO0":{"34":"Button3"},"GPIO1":{"0":"None"},"GPIO2":{"0":"None"},"GPIO3":{"0":"None"},"GPIO4":{"0":"None"},"GPIO5":{"0":"None"},"GPIO6":{"0":"None"},"GPIO7":{"0":"None"},"GPIO8":{"0":"None"},"GPIO9":{"0":"None"},"GPIO10":{"0":"None"},"GPIO11":{"0":"None"},"GPIO12":{"0":"None"},"GPIO13":{"576":"LedLink_i"},"GPIO14":{"225":"Relay2"},"GPIO15":{"0":"None"},"GPIO16":{"0":"None"},"GPIO17":{"0":"None"},"GPIO18":{"0":"None"},"GPIO19":{"0":"None"},"GPIO20":{"0":"None"},"GPIO21":{"0":"None"},"GPIO22":{"0":"None"},"GPIO23":{"0":"None"},"GPIO24":{"0":"None"},"GPIO25":{"7296":"CSE7761 Tx"},"GPIO26":{"7328":"CSE7761 Rx"},"GPIO27":{"224":"Relay1"},"GPIO32":{"32":"Button1"},"GPIO33":{"33":"Button2"},"GPIO34":{"0":"None"},"GPIO35":{"0":"None"},"GPIO36":{"0":"None"},"GPIO37":{"0":"None"},"GPIO38":{"0":"None"},"GPIO39":{"0":"None"}}
Also please note that my devices are R3 (v1) and there seems to be v2 already.
Now the important information which may actually resolve this issue. If you look into my first report, the device reports uptime just above 50 days
. Yesterday when I observed devices are stuck I found "Uptime":"50T05:19:50"
. It does not seems to be coincidence. 2^32 ms
is 49.7 days
. Seems like issue with uint
somewhere. :)
I also found that first call of shutterpositon
with missing response in the browser won't reboot the device and the device can respond to another commands. but second call will cause the reboot. Will send you more details in a mail.
I now have a flashed sonoff dual R3 v1.6 to test 1:1. I have configured the device same as you und currently I do not see any problem. There is no load on the relay. No Motor connected. Testing several DOWN do not show any issue. shutterposition and shutterposition1 also execute excellent with direct report to the UI. I will now add it with MQTT.
Theo shared some images with devices up and running 289days. Therefore this is not the reason
You probably won't be able to reproduce it earlier than after 50 days. So leave it connected. :) My devices works great after reset too! Hope you got my mail with details and video.
At the end of May my device was working correctly in the morning (with uptime approx. 49 days and 12 hours). In the evening (50 days and 1 hour) I found it's stuck. I realized that it happened at least 1 time few months earlier. Now in August I got exactly the same issue on all devices at once. They worked correctly in the morning (49 days 14 hours, but was stuck in the evening (50 days 5 hours). The interesting thing is that 2^32 [ms] is exactly between point where it was working and when it stopped to work. It's related only to shutters (with tilt), the rest of the device is unaffected.
I sometimes need to turn off the power or update devices, so next occurrence won't be exactly after 50 days, but if this task won't be solved until then, I will update with next occurrence.
I already fixed one issue this morning that can cause an overflow. But will check later on if any Millies() have the potential either. It is not unlikely because I do use this quite extensivly.
Ok, now we are getting close. I submitted a new fix #21966 and this looks more promising for the case. There is a function that ensure you give the motor a small time to rest before operating e.g. in the other direction. Here we store the last motorstoptime as a 2^32 variable. If the online time if above 49days first nothing happens UNTIL you operate the shutter again. Then millis() is very low and the code expected it to be slightly higher than the last stop time. This causes an infinite (49days) loop and before that happen the crash detection of TASMOTA reboots all the things.
Thanks for bringing this up and help to make it more reliable.
@arendst : This i very likely the issue. Thanks for guiding me into the right direction
That seems exactly like my observation - Tasmota won't crash on first shutter operate command (e.g. button down), but on second or third. Thanks for fixing it and make my home more reliable.
PROBLEM DESCRIPTION
I have several devices and few times a year all my devices with shutter function enabled hangs at once. Other devices without shutters works fine at the same time.
Today I found I can't control my blinds using HA and MQTT and it seems Tasmota got stucked again. I connected to the Tasmota web GUI, enabled debug level, opened console in separate tab and pressed DOWN button multiple times. (If I would press in only once, nothing will happen and device won't be restarted.) I copied console log before and after the reset. This device is not on the newest firmware, as it's hard to reproduce.
I would like to know it provided output is enough to analyze and possible fix, or I should do something more next time.
REQUESTED INFORMATION
Make sure your have performed every step and checked the applicable boxes before submitting your issue. Thank you!
Backlog Template; Module; GPIO 255
:20:21:48.840 MQT: stat/tasmota_48D498/RESULT = {"NAME":"Sonoff Dual R3","GPIO":[34,0,0,0,0,0,0,0,0,576,225,0,0,0,0,0,0,0,0,0,0,7296,7328,224,0,0,0,0,32,33,0,0,0,0,0,0],"FLAG":0,"BASE":1} 20:21:49.074 MQT: stat/tasmota_48D498/RESULT = {"Module":{"0":"Sonoff Dual R3"}} 20:21:49.333 MQT: stat/tasmota_48D498/RESULT = {"GPIO0":{"34":"Button3"},"GPIO1":{"0":"None"},"GPIO2":{"0":"None"},"GPIO3":{"0":"None"},"GPIO4":{"0":"None"},"GPIO5":{"0":"None"},"GPIO6":{"0":"None"},"GPIO7":{"0":"None"},"GPIO8":{"0":"None"},"GPIO9":{"0":"None"},"GPIO10":{"0":"None"},"GPIO11":{"0":"None"},"GPIO12":{"0":"None"},"GPIO13":{"576":"LedLink_i"},"GPIO14":{"225":"Relay2"},"GPIO15":{"0":"None"},"GPIO16":{"0":"None"},"GPIO17":{"0":"None"},"GPIO18":{"0":"None"},"GPIO19":{"0":"None"},"GPIO20":{"0":"None"},"GPIO21":{"0":"None"},"GPIO22":{"0":"None"},"GPIO23":{"0":"None"},"GPIO24":{"0":"None"},"GPIO25":{"7296":"CSE7761 Tx"},"GPIO26":{"7328":"CSE7761 Rx"},"GPIO27":{"224":"Relay1"},"GPIO32":{"32":"Button1"},"GPIO33":{"33":"Button2"},"GPIO34":{"0":"None"},"GPIO35":{"0":"None"},"GPIO36":{"0":"None"},"GPIO37":{"0":"None"},"GPIO38":{"0":"None"},"GPIO39":{"0":"None"}}
Status 0
:20:13:19.041 MQT: stat/tasmota_48D498/STATUS = {"Status":{"Module":0,"DeviceName":"zaluzie-kuchyne","FriendlyName":["Tasmota",""],"Topic":"tasmota_48D498","ButtonTopic":"0","Power":0,"PowerOnState":3,"LedState":1,"LedMask":"FFFF","SaveData":1,"SaveState":1,"SwitchTopic":"0","SwitchMode":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"ButtonRetain":0,"SwitchRetain":0,"SensorRetain":0,"PowerRetain":0,"InfoRetain":0,"StateRetain":0,"StatusRetain":0}} 20:13:19.080 MQT: stat/tasmota_48D498/STATUS1 = {"StatusPRM":{"Baudrate":115200,"SerialConfig":"8N1","GroupTopic":"tasmotas","OtaUrl":"http://ota.tasmota.com/tasmota32/release/tasmota32.bin","RestartReason":"Software reset CPU","Uptime":"0T00:04:57","StartupUTC":"2024-05-20T19:08:22","Sleep":50,"CfgHolder":4617,"BootCount":29,"BCResetTime":"2023-07-09T22:35:15","SaveCount":457}} 20:13:19.121 MQT: stat/tasmota_48D498/STATUS2 = {"StatusFWR":{"Version":"13.4.0(tasmota32)","BuildDateTime":"2024-02-14T16:17:40","Core":"2_0_14","SDK":"4.4.6.240105","CpuFrequency":160,"Hardware":"ESP32-D0WD-V3 v3.0","CR":"405/699"}} 20:13:19.139 MQT: stat/tasmota_48D498/STATUS3 = {"StatusLOG":{"SerialLog":2,"WebLog":2,"MqttLog":0,"SysLog":0,"LogHost":"","LogPort":514,"SSId":["RtIOT",""],"TelePeriod":300,"Resolution":"558180C0","SetOption":["0000C009","2805C80001000600003C5A0A192800000000","40800080","00006000","00004000","00000000"]}} 20:13:19.203 MQT: stat/tasmota_48D498/STATUS4 = {"StatusMEM":{"ProgramSize":1907,"Free":972,"Heap":143,"StackLowMark":3,"PsrMax":0,"PsrFree":0,"ProgramFlashSize":4096,"FlashSize":4096,"FlashChipId":"1640EF","FlashFrequency":40,"FlashMode":"DIO","Features":["0809","9F9AD7DF","0015A001","B7F7BFCF","05DA9BC4","E0360DC7","480840D2","20200000","D4BC482D","810A80B1","00000004"],"Drivers":"1,2,3,!4,!5,7,!8,9,10,11,12,!14,!16,!17,!20,!21,!24,26,27,29,!34,!35,38,50,52,!59,!60,62,!63,!66,!67,!68,82,!86,!87,!88,!121","Sensors":"1,2,3,5,6,7,8,9,10,11,12,13,14,15,17,18,19,20,21,22,26,31,34,37,39,40,42,43,45,51,52,55,56,58,59,64,66,67,74,85,92,95,98,103,105,109,127","I2CDriver":"7,8,9,10,11,12,13,14,15,17,18,20,24,29,31,36,41,42,44,46,48,58,62,65,69,76,77,82"}} 20:13:19.277 MQT: stat/tasmota_48D498/STATUS5 = {"StatusNET":{"Hostname":"tasmota-48D498-5272","IPAddress":"1.1.1.44","Gateway":"1.1.1.1","Subnetmask":"255.255.255.0","DNSServer1":"1.1.1.1","DNSServer2":"0.0.0.0","Mac":"3C:61:05:48:D4:98","IP6Global":"","IP6Local":"fe80::3e61:5ff:fe48:d498%st1","Ethernet":{"Hostname":"","IPAddress":"0.0.0.0","Gateway":"0.0.0.0","Subnetmask":"0.0.0.0","DNSServer1":"1.1.1.1","DNSServer2":"0.0.0.0","Mac":"00:00:00:00:00:00","IP6Global":"","IP6Local":""},"Webserver":2,"HTTP_API":1,"WifiConfig":4,"WifiPower":16.0}} 20:13:19.327 MQT: stat/tasmota48D498/STATUS6 = {"StatusMQT":{"MqttHost":"mqtt.rypi.net","MqttPort":1883,"MqttClientMask":"DVES%06X","MqttClient":"DVES_48D498","MqttUser":"DVES_USER","MqttCount":1,"MAX_PACKET_SIZE":1200,"KEEPALIVE":30,"SOCKET_TIMEOUT":4}} 20:13:19.356 MQT: stat/tasmota_48D498/STATUS7 = {"StatusTIM":{"UTC":"2024-05-20T19:13:19Z","Local":"2024-05-20T20:13:19","StartDST":"2024-03-31T02:00:00","EndDST":"2024-10-27T03:00:00","Timezone":"+01:00","Sunrise":"05:01","Sunset":"20:31"}} 20:13:19.374 MQT: stat/tasmota_48D498/STATUS9 = {"StatusPTH":{"PowerDelta":[0,0,0,0,0,0,0,0],"PowerLow":0,"PowerHigh":0,"VoltageLow":0,"VoltageHigh":0,"CurrentLow":0,"CurrentHigh":0}} 20:13:19.403 MQT: stat/tasmota_48D498/STATUS10 = {"StatusSNS":{"Time":"2024-05-20T20:13:19","ENERGY":{"TotalStartTime":"2023-07-09T22:43:02","Total":0.852,"Yesterday":0.002,"Today":0.001,"Power":[0,0],"ApparentPower":[0,0],"ReactivePower":[0,0],"Factor":[0.00,0.00],"Voltage":0,"Current":[0.000,0.000]},"Shutter1":{"Position":100,"Direction":0,"Target":100,"Tilt":0}}} 20:13:19.446 MQT: stat/tasmota_48D498/STATUS11 = {"StatusSTS":{"Time":"2024-05-20T20:13:19","Uptime":"0T00:04:57","UptimeSec":297,"Heap":145,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":23,"MqttCount":1,"Berry":{"HeapUsed":4,"Objects":47},"POWER1":"OFF","POWER2":"OFF","Wifi":{"AP":1,"SSId":"RtIOT","BSSId":"3A:60:F9:54:03:6E","Channel":1,"Mode":"11n","RSSI":68,"Signal":-66,"LinkCount":1,"Downtime":"0T00:00:04"}}} 20:13:19.485 MQT: stat/tasmota_48D498/STATUS12 = {"StatusSTK":{"Exception":2,"Reason":"InstructionFetchError","EPC":"40091aa6","EXCVADDR":"ffffffe0","CallChain":["40091aa3","3ffb034d"]}} 20:13:19.505 MQT: stat/tasmota_48D498/STATUS13 = {"StatusSHT":{"SHT0":{"Relay1":1,"Relay2":2,"Open":360,"Close":360,"50perc":50,"Delay":2,"Opt":"0100","Calib":[300,500,700,900,1000],"Mode":"1","TiltConfig":[-90,0,19,0,-90]}}}
TO REPRODUCE
Not known.
EXPECTED BEHAVIOUR
Not hang.
SCREENSHOTS
N/A
ADDITIONAL CONTEXT
N/A
(Please, remember to close the issue when the problem has been addressed)