arendst / Tasmota

Alternative firmware for ESP8266 and ESP32 based devices with easy configuration using webUI, OTA updates, automation using timers or rules, expandability and entirely local control over MQTT, HTTP, Serial or KNX. Full documentation at
https://tasmota.github.io/docs
GNU General Public License v3.0

Tasmota on ESP32 get stuck after few days of work #15965

Closed Micha70 closed 2 years ago

Micha70 commented 2 years ago

PROBLEM DESCRIPTION

After a few days (5-6 days), Tasmota on the ESP32 seems to get stuck. The device can no longer be reached via the web interface, and no more MQTT messages arrive. After a power-on reset the device works again. I tried a workaround with a daily reset, but even with that the device got stuck after about 20 days.

REQUESTED INFORMATION

Make sure you have performed every step and checked the applicable boxes before submitting your issue. Thank you!

- [x] If using rules, provide the output of this command: `Backlog Rule1; Rule2; Rule3`:
```lua
  Rules output here:
21:30:00.525 MQT: stat/LED-Hausflur/RESULT = {"Rule1":{"State":"ON","Once":"ON","StopOnError":"OFF","Length":35,"Free":476,"Rules":"on Clock#Timer=6 do Restart 1 endon"}}
21:30:00.729 MQT: stat/LED-Hausflur/RESULT = {"Rule2":{"State":"OFF","Once":"OFF","StopOnError":"OFF","Length":0,"Free":511,"Rules":""}}
21:30:00.934 MQT: stat/LED-Hausflur/RESULT = {"Rule3":{"State":"OFF","Once":"OFF","StopOnError":"OFF","Length":0,"Free":511,"Rules":""}}
```

- [ ] Set `weblog` to 4 and then, when you experience your issue, provide the output of the Console log:

Not possible: once the problem appears, the device can no longer be accessed.

### TO REPRODUCE 
Power up and wait several days....

### EXPECTED BEHAVIOUR
Access is always possible; the device does not get stuck.

### SCREENSHOTS
_If applicable, add screenshots to help explain your problem._

### ADDITIONAL CONTEXT
_Add any other context about the problem here._

**(Please, remember to close the issue when the problem has been addressed)**
sfromis commented 2 years ago

That {"Exception":28,"Reason":"LoadProhibited","EPC":"4001a917","EXCVADDR":"0014a55e","CallChain":["4001a914"]} shows that the last restart followed a crash. The reasons for such crashes vary widely; this one is potentially related to Bluetooth, which seems to be in use on the device.

Micha70 commented 2 years ago

Yes, Bluetooth is used, I can confirm. That's the reason why I'm using an ESP32. Amazing work has been done here. I use it as a BLE-to-MQTT gateway and control the light in my entrance hall via Home Assistant.

Jason2866 commented 2 years ago

An ESP32-D0WDQ6 rev1 is not a good choice for this task. It is the oldest ESP32 revision that exists. Several improvements were made in the current rev.3. Running Wi-Fi and BT together is a stressful use case for the ESP32, and the old revision probably has trouble doing this stably. Use the current ESP32 rev.3 for this.

Micha70 commented 2 years ago

@Jason2866 thanks for the hint. Can you recommend where to buy rev.3? I checked a bit, and wherever the revision was mentioned, it was almost always rev1.

Jason2866 commented 2 years ago

Search for a board with an ESP32 Pico D4. The Pico variant is very new.

Micha70 commented 2 years ago

I have now received a board with revision 3 and will update my device soon. Just to mention one observation I made with the original board (ESP32-D0WDQ6 rev1): initially it got stuck about once per week, until I changed to version 12.0.1. With 12.0.1 I had the problem nearly every day. Last weekend I changed back to version 11.1.0, and since then it has not gotten stuck. This is just for your information. I am now following your hint and will change to the new chip ESP32D0WDQ5 (revision 3) and update to version 12.0.2. I will post my observations.

flinsc commented 2 years ago

In my ESP32 projects I noticed an increase in stability after disabling the internal temperature sensor. You could try this on your setup: `SetSensor127 0`
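
For anyone who prefers to send that command over the network instead of typing it into the console, here is a minimal sketch in Python. It only builds the URL for Tasmota's HTTP command endpoint (`/cm?cmnd=...`). The helper name `tasmota_command_url` is made up for illustration, and the address is the one from the reporter's STATUS 0 output in this thread, so substitute your own.

```python
from urllib.parse import quote


def tasmota_command_url(host: str, command: str) -> str:
    """Build the URL for Tasmota's HTTP command endpoint (/cm?cmnd=...).

    The command is percent-encoded so that spaces survive the request.
    """
    return f"http://{host}/cm?cmnd={quote(command)}"


# Hypothetical usage: the host is an assumption, replace it with your device.
url = tasmota_command_url("192.168.178.80", "SetSensor127 0")
print(url)
```

Opening the printed URL in a browser (or fetching it with any HTTP client) should have the same effect as entering the command in the web console, provided the web server is enabled and no web password is set.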

Micha70 commented 2 years ago

Feedback: after seven days the device has not gotten stuck again, so it seems the problem was solved by the different ESP chip version (ESP32-D0WD-V3 rev.3). Five days ago I also removed the daily reset programmed by rule, and everything works fine now. What I have discovered is that the device seems to reset by itself every day at about 4 am (according to the uptime): (screenshot: grafik)

Is this intended? I will check tomorrow whether the boot counter has also increased.

Micha70 commented 2 years ago

STATUS 0 output here:

07:27:58.983 CMD: status 0
07:27:59.001 MQT: stat/LED-Hausflur/STATUS = {"Status":{"Module":0,"DeviceName":"LED-Neu","FriendlyName":["LED-Hausflur-Neu"],"Topic":"LED-Hausflur","ButtonTopic":"0","Power":1,"PowerOnState":3,"LedState":1,"LedMask":"FFFF","SaveData":1,"SaveState":1,"SwitchTopic":"0","SwitchMode":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"ButtonRetain":0,"SwitchRetain":0,"SensorRetain":1,"PowerRetain":0,"InfoRetain":0,"StateRetain":0}}
07:27:59.012 MQT: stat/LED-Hausflur/STATUS1 = {"StatusPRM":{"Baudrate":115200,"SerialConfig":"8N1","GroupTopic":"tasmotas","OtaUrl":"http://ota.tasmota.com/tasmota32/release-11.1.0/tasmota32-bluetooth.bin","RestartReason":"Software reset CPU","Uptime":"0T03:35:11","StartupUTC":"2022-07-30T01:52:48","Sleep":100,"CfgHolder":4617,"BootCount":15,"BCResetTime":"2022-04-09T12:07:39","SaveCount":11247}}
07:27:59.022 MQT: stat/LED-Hausflur/STATUS2 = {"StatusFWR":{"Version":"12.0.2.4(bluetooth)","BuildDateTime":"2022-07-22T13:15:19","Core":"2_0_4","SDK":"v4.4.2","CpuFrequency":80,"Hardware":"ESP32-D0WD-V3 rev.3","CR":"442/699"}}
07:27:59.031 MQT: stat/LED-Hausflur/STATUS3 = {"StatusLOG":{"SerialLog":2,"WebLog":2,"MqttLog":0,"SysLog":0,"LogHost":"","LogPort":514,"SSId":["ollahMuz","ollahMuz_K"],"TelePeriod":300,"Resolution":"558180C0","SetOption":["00008209","2805C80001000600003C5A0A002800000000","00000280","00006000","00004002","00000000"]}}
07:27:59.053 MQT: stat/LED-Hausflur/STATUS4 = {"StatusMEM":{"ProgramSize":1532,"Free":1347,"Heap":77,"StackLowMark":3,"PsrMax":0,"PsrFree":0,"ProgramFlashSize":4096,"FlashSize":4096,"FlashChipId":"16405E","FlashFrequency":40,"FlashMode":3,"Features":["00000809","8F9AC7C7","00148001","000000CF","010013C0","C0000981","00004080","00200000","5400082C","00000000"],"Drivers":"1,2,3,4,5,7,8,9,10,12,16,20,21,24,26,27,29,35,38,50,52,59,62,79,85","Sensors":"1,2,3,5,6,52,62,127"}}
07:27:59.070 MQT: stat/LED-Hausflur/STATUS5 = {"StatusNET":{"Hostname":"LED-Hausflur-7916","IPAddress":"192.168.178.80","Gateway":"192.168.178.1","Subnetmask":"255.255.255.0","DNSServer1":"192.168.178.50","DNSServer2":"0.0.0.0","Mac":"24:D7:EB:0D:FE:EC","Webserver":2,"HTTP_API":1,"WifiConfig":4,"WifiPower":17.0}}
07:27:59.085 MQT: stat/LED-Hausflur/STATUS6 = {"StatusMQT":{"MqttHost":"192.168.178.50","MqttPort":1883,"MqttClientMask":"DVES_%06X","MqttClient":"DVES_0DFEEC","MqttUser":"Micha","MqttCount":1,"MAX_PACKET_SIZE":1200,"KEEPALIVE":30,"SOCKET_TIMEOUT":4}}
07:27:59.100 MQT: stat/LED-Hausflur/STATUS7 = {"StatusTIM":{"UTC":"2022-07-30T05:27:59","Local":"2022-07-30T07:27:59","StartDST":"2022-03-27T02:00:00","EndDST":"2022-10-30T03:00:00","Timezone":99,"Sunrise":"05:40","Sunset":"20:53"}}
07:27:59.114 MQT: stat/LED-Hausflur/STATUS10 = {"StatusSNS":{"Time":"2022-07-30T07:27:59","ESP32":{"Temperature":41.1},"TempUnit":"C"}}
07:27:59.130 MQT: stat/LED-Hausflur/STATUS11 = {"StatusSTS":{"Time":"2022-07-30T07:27:59","Uptime":"0T03:35:11","UptimeSec":12911,"Heap":80,"SleepMode":"Dynamic","Sleep":10,"LoadAvg":99,"MqttCount":1,"Berry":{"HeapUsed":3,"Objects":41},"POWER":"ON","Dimmer":71,"Color":"3A09B5","HSBColor":"257,95,71","Channel":[23,3,71],"Scheme":0,"Fade":"OFF","Speed":1,"LedTable":"ON","Wifi":{"AP":1,"SSId":"ollahMuz","BSSId":"2C:91:AB:9A:FD:8E","Channel":4,"Mode":"11n","RSSI":100,"Signal":-46,"LinkCount":1,"Downtime":"0T02:07:33"}}}'
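
The restart time can be cross-checked from the values in this output. The following is a small sketch, not part of Tasmota, that subtracts the reported `Uptime` from the `UTC` and `Local` timestamps; the helper name `parse_uptime` is made up for illustration.

```python
from datetime import datetime, timedelta


def parse_uptime(uptime: str) -> timedelta:
    """Parse a Tasmota uptime string of the form 'dTHH:MM:SS'."""
    days, clock = uptime.split("T")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return timedelta(days=int(days), hours=hours, minutes=minutes, seconds=seconds)


# Values copied from the STATUS 0 output above.
utc_now = datetime.fromisoformat("2022-07-30T05:27:59")    # StatusTIM "UTC"
local_now = datetime.fromisoformat("2022-07-30T07:27:59")  # StatusTIM "Local"
uptime = parse_uptime("0T03:35:11")                        # StatusPRM "Uptime"

boot_utc = utc_now - uptime
boot_local = local_now - uptime

print(int(uptime.total_seconds()))  # 12911, matches "UptimeSec"
print(boot_utc.isoformat())         # 2022-07-30T01:52:48, matches "StartupUTC"
print(boot_local.isoformat())       # 2022-07-30T03:52:48
```

So this particular restart happened at 03:52:48 local time, which fits the "about 4 am" observation.
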
barbudor commented 2 years ago

There should be no reason for Tasmota to restart itself every day. So if it is not your rule, it may be something external. It could be interesting to get level 4 logs from just before the restart by using the serial console, syslog, or MQTT log.
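
One way to capture those logs without a serial cable is a small UDP listener on another machine. The sketch below is only an illustration under a few assumptions: the port 5514 and the file name are arbitrary choices (5514 avoids needing root for port 514), and the device has to be pointed at the listener with the Tasmota commands `LogHost <listener-ip>`, `LogPort 5514` and `SysLog 4`. The function names are made up.

```python
import socket
from datetime import datetime


def format_entry(received_at: datetime, sender_ip: str, payload: bytes) -> str:
    """Turn one received syslog datagram into a single timestamped text line."""
    text = payload.decode("utf-8", errors="replace").rstrip()
    return f"{received_at.isoformat(timespec='seconds')} {sender_ip} {text}"


def listen(port: int = 5514, logfile: str = "tasmota-syslog.txt") -> None:
    """Append every UDP datagram received on `port` to `logfile` (runs until stopped)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", port))
        with open(logfile, "a", encoding="utf-8") as out:
            while True:
                payload, (sender_ip, _sender_port) = sock.recvfrom(4096)
                out.write(format_entry(datetime.now(), sender_ip, payload) + "\n")
                out.flush()  # keep the file current in case the listener is killed


# To start capturing, call listen() and stop it with Ctrl+C after the restart.
```

Because each line is flushed immediately, the last entries before the restart should end up in the file even though the device itself becomes unreachable.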

TD-er commented 2 years ago

Just for debugging purposes, can you prevent the reboot (which looks to me like a crash, maybe triggered by a DHCP renew?) by running a continuous ping to that node from just about any other node in your network? Continuous pings change which power modes an ESP may enter when idling. An idling node may cause the MAC tables in switches and APs to forget which port packets for that node have to be sent to, and then all kinds of things may happen. (I have no idea whether Tasmota sends Gratuitous ARP by default, but if it is an option, you may also want to test that.)

Does your router allow you to see when a DHCP lease expires or when it was renewed? I know Mikrotik does this and perhaps others too.

Jason2866 commented 2 years ago

@TD-er Gratuitous ARP is enabled by default in Tasmota. To be more precise, Gratuitous ARP is enabled by default in the Tasmota ESP32 Arduino framework (I think in the official one too). It is an IDF sdkconfig option.

TD-er commented 2 years ago

OK, I will have a look at how it is implemented.

In ESPEasy I send out those ARP packets at a dynamic interval. The interval is set quite low (and an ARP packet is sent immediately) when some kind of network error (e.g. a timeout) occurs and right after making a connection to the network. Then, on each interval, the interval duration is increased, up to a maximum of about 5 minutes.
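
The scheme described above can be sketched roughly as follows. This is not ESPEasy's actual code: the class name, the 5-second starting interval and the doubling factor are assumptions made for illustration, and only the cap of about 5 minutes comes from the description.

```python
class GratuitousArpTimer:
    """Growing send interval that is reset on connect or network error."""

    MIN_INTERVAL_S = 5    # assumption: "quite low" is not quantified in the description
    MAX_INTERVAL_S = 300  # "about 5 minutes", per the description
    GROWTH_FACTOR = 2     # assumption: the growth rule is not specified

    def __init__(self) -> None:
        self.interval_s = self.MIN_INTERVAL_S

    def reset(self) -> int:
        """Call on a network error or right after connecting.

        Returns 0 because an ARP packet should be sent immediately.
        """
        self.interval_s = self.MIN_INTERVAL_S
        return 0

    def next_delay(self) -> int:
        """Return the delay before the next ARP packet, then lengthen the interval."""
        delay = self.interval_s
        self.interval_s = min(self.interval_s * self.GROWTH_FACTOR, self.MAX_INTERVAL_S)
        return delay


timer = GratuitousArpTimer()
print([timer.next_delay() for _ in range(8)])  # [5, 10, 20, 40, 80, 160, 300, 300]
```

The point of the reset is that ARP announcements are frequent exactly when the network is most likely to have stale entries, and rare once things have been stable for a while.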

Jason2866 commented 2 years ago

Closing; the issue is not reproducible on current ESP32 revisions. My test ESP32 has been running for 10 days without a reboot or problem. Feel free to reopen if you still have the problem and can provide logs.

Micha70 commented 2 years ago

Hello @Jason2866, it is OK for me to close the ticket. Just one hint: I see the reset only if the WLAN is switched off for some time. I have it switched off by a timer from 1 am to 6 am, and the ESP always resets between 3 am and 4 am. I also see the reboot count increase. If I don't switch off the WLAN, I see no reboot during the night and no increase in the reboot counter.

h4n23s commented 10 months ago

I'm also running into this issue with a very similar use case (relaying BLE messages via MQTT). My board crashes regularly, at least once every 2 to 3 days. WiFi is disabled on boot via Tasmota rules, as I'm using an Ethernet module (LAN8720).

EspTool tells me that this is my chip: ESP32-D0WD-V3 (revision v3.0)

Edit: I'm currently running Tasmota 13.1.0