arendst / Tasmota

Alternative firmware for ESP8266 and ESP32 based devices with easy configuration using webUI, OTA updates, automation using timers or rules, expandability and entirely local control over MQTT, HTTP, Serial or KNX. Full documentation at
https://tasmota.github.io/docs
GNU General Public License v3.0
21.85k stars 4.75k forks source link

Sonoff devices reboot every ~20 minutes with "Exception":29 caused by NTP multicast/manycast #16061

Closed darkdragondraco closed 1 year ago

darkdragondraco commented 2 years ago

PROBLEM DESCRIPTION

Sonoff devices after upgrade to Tasmota 12.0.2 reboot every ~20 minutes with StatusSTK":{"Exception":29,"Reason":"Exception","EPC":["40252ca8",...

REQUESTED INFORMATION

Make sure your have performed every step and checked the applicable boxes before submitting your issue. Thank you!

- [x] If using rules, provide the output of this command: `Backlog Rule1; Rule2; Rule3`:
```lua
  Rules output here:
13:43:03.435 CMD: Backlog Rule1; Rule2; Rule3
13:43:03.437 SRC: WebConsole from 192.168.2.143
13:43:03.439 CMD: Grp 0, Cmd 'BACKLOG', Idx 1, Len 19, Pld -99, Data 'Rule1; Rule2; Rule3'
13:43:03.461 SRC: Backlog
13:43:03.464 CMD: Grp 0, Cmd 'RULE', Idx 1, Len 0, Pld -99, Data ''
13:43:03.472 MQT: stat/sf-s55-01/RESULT = {"Rule1":{"State":"OFF","Once":"OFF","StopOnError":"OFF","Length":0,"Free":511,"Rules":""}}
13:43:03.680 SRC: Backlog
13:43:03.683 CMD: Grp 0, Cmd 'RULE', Idx 2, Len 0, Pld -99, Data ''
13:43:03.689 MQT: stat/sf-s55-01/RESULT = {"Rule2":{"State":"OFF","Once":"OFF","StopOnError":"OFF","Length":0,"Free":511,"Rules":""}}
13:43:03.930 SRC: Backlog
13:43:03.933 CMD: Grp 0, Cmd 'RULE', Idx 3, Len 0, Pld -99, Data ''
13:43:03.940 MQT: stat/sf-s55-01/RESULT = {"Rule3":{"State":"OFF","Once":"OFF","StopOnError":"OFF","Length":0,"Free":511,"Rules":""}}
- [x] Set `weblog` to 4 and then, when you experience your issue, provide the output of the Console log:
```lua
  Console output here:

Console output after reboot

 00:00:00.002 HDW: ESP8285
00:00:00.051 CFG: Loaded from flash at F9, Count 121
00:00:00.056 QPC: Count 1
00:00:00.058 CFG: CR 343/699, Busy 0
00:00:00.062 ROT: Mode 1
00:00:00.064 SRC: Restart
00:00:00.066 Project tasmota - sf-s55-01 Version 12.0.2.4(tasmota)-2_7_4_9(2022-07-22T13:15:05)
00:00:00.219 WIF: Checking connection...
00:00:00.219 WIF: Attempting connection...
00:00:00.555 WIF: Connecting to AP1 Shadow Channel 1 BSSId 48:8F:5A:21:1E:27 in mode 11n as sf-s55-01-1694...
00:00:01.753 WIF: Checking connection...
00:00:01.754 WIF: Connected
00:00:02.006 HTP: Web server active on sf-s55-01-1694 with IP address 192.168.2.136
00:00:02.403 WIF: Sending Gratuitous ARP
00:00:02.405 NTP: Sync time...
00:00:02.519 RTC: UTC 2022-07-24T11:12:40, DST 2022-03-27T02:00:00, STD 2022-10-30T03:00:00
14:12:40.000 RTC: Synced by NTP
14:12:41.023 MQT: Attempting connection...
14:12:41.035 MQT: Connected
14:12:41.038 MQT: tele/sf-s55-01/LWT = Online (retained)
14:12:41.041 MQT: cmnd/sf-s55-01/POWER = 
14:12:41.043 MQT: Subscribe to cmnd/sf-s55-01/#
14:12:41.047 MQT: Subscribe to cmnd/tasmotas/#
14:12:41.049 SRC: MQTT
14:12:41.051 CMD: Grp 0, Cmd 'POWER', Idx 1, Len 0, Pld -99, Data ''
14:12:41.055 MQT: stat/sf-s55-01/RESULT = {"POWER":"OFF"}
14:12:41.058 MQT: stat/sf-s55-01/POWER = OFF
14:12:41.060 MQT: Subscribe to cmnd/DVES_10E69E_fb/#
14:12:41.066 MQT: tele/sf-s55-01/INFO1 = {"Info1":{"Module":"Sonoff S55","Version":"12.0.2.4(tasmota)","FallbackTopic":"cmnd/DVES_10E69E_fb/","GroupTopic":"cmnd/tasmotas/"}}
14:12:41.075 MQT: tele/sf-s55-01/INFO2 = {"Info2":{"WebServerMode":"Admin","Hostname":"sf-s55-01-1694","IPAddress":"192.168.2.136"}}
14:12:41.094 MQT: tele/sf-s55-01/INFO3 = {"Info3":{"RestartReason":{"Exception":29,"Reason":"Exception","EPC":["40253410","00000000","00000000"],"EXCVADDR":"00000000","DEPC":"00000000","CallChain":["402411b1","402414f1","40241576","40241650","4022975c","4022979e","402297c4","40229837","4022abcc","402344a1","40101b9f","40241a41","4024c730","40221138","4025013a","40250198","401000e1","40221170","4024c1d9","40228bf6","40101206","40228d0f","40251318","40228ec2","40101206","40228f15","4021ed66","4023db45","4023dc08","40100a7c","4024d380"]},"BootCount":88}}
14:12:41.136 MQT: stat/sf-s55-01/RESULT = {"POWER":"OFF"}
14:12:41.141 MQT: stat/sf-s55-01/POWER = OFF
14:12:44.478 QPC: Reset
14:12:45.393 MQT: tele/sf-s55-01/STATE = {"Time":"2022-07-24T14:12:45","Uptime":"0T00:00:09","UptimeSec":9,"Heap":28,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":1,"POWER":"OFF","Wifi":{"AP":1,"SSId":"Shadow","BSSId":"48:8F:5A:21:1E:27","Channel":1,"Mode":"11n","RSSI":72,"Signal":-64,"LinkCount":1,"Downtime":"0T00:00:03"}}
14:12:46.386 APP: Boot Count 88
14:12:46.626 CFG: Saved to flash at F8, Count 122, Bytes 4096
14:12:50.400 MQT: tasmota/discovery/24A16010E69E/config = {"ip":"192.168.2.136","dn":"sf-s55-01","fn":["sf-s55-01",null,null,null,null,null,null,null],"hn":"sf-s55-01-1694","mac":"24A16010E69E","md":"Sonoff S55","ty":0,"if":0,"ofln":"Offline","onln":"Online","state":["OFF","ON","TOGGLE","HOLD"],"sw":"12.0.2.4","t":"sf-s55-01","ft":"%prefix%/%topic%/","tp":["cmnd","stat","tele"],"rl":[1,0,0,0,0,0,0,0],"swc":[-1,-1,-1,-1,-1,-1,-1,-1],"swn":[null,null,null,null,null,null,null,null],"btn":[0,0,0,0,0,0,0,0],"so":{"4":0,"11":0,"13":0,"17":0,"20":0,"30":0,"68":0,"73":0,"82":0,"114":0,"117":0},"lk":0,"lt_st":0,"sho":[0,0,0,0],"sht":[[0,0,0],[0,0,0],[0,0,0],[0,0,0]],"ver":1} (retained)
14:12:50.406 MQT: tasmota/discovery/24A16010E69E/sensors = {"sn":{"Time":"2022-07-24T14:12:50"},"ver":1} (retained)
14:12:59.781 WIF: Checking connection...
14:13:19.760 WIF: Checking connection...

TO REPRODUCE

Steps to reproduce the behavior:

Reboots every ~20 minutes

EXPECTED BEHAVIOUR

A clear and concise description of what you expected to happen.

Does not reboot without clear reason

SCREENSHOTS

If applicable, add screenshots to help explain your problem.

ADDITIONAL CONTEXT

Add any other context about the problem here.

Started few days ago and only in one location. in other location sonoff devices with same firmware work ok. Problem devices use different power lines some of them are UPSed. Tried to downgrade to 11.1 without clearing config but it did not help. The only this was changed at environment were DNS and DHCP server reconfiguration.

(Please, remember to close the issue when the problem has been addressed)

rayzilt commented 2 years ago

@arendst An update on your patch request.

Before trying your patch I wanted to have the same situation. Unfortunately I'm unable to regain the same situation by restoring the old settings that I had when all devices where running out of memory. I've had it running for a day now.

The setting that I restored are:

I'm quite puzzled. I'll let this faulty configuration running for a time being.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it hasn't any activity in last few weeks. It will be closed if no further activity occurs. Thank you for your contributions.

arendst commented 1 year ago

Keep active

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it hasn't any activity in last few weeks. It will be closed if no further activity occurs. Thank you for your contributions.

NielsPiersma commented 1 year ago

That is what I am seeing as well. Not all devices in my network are affected. Some work as expected. Niels


From: Silvan Raijer @.> Sent: Thursday, August 18, 2022 7:54:46 PM To: arendst/Tasmota @.> Cc: NielsPiersma @.>; Mention @.> Subject: Re: [arendst/Tasmota] Sonoff devices reboot every ~20 minutes with "Exception":29 caused by NTP multicast/manycast (Issue #16061)

@arendsthttps://github.com/arendst An update on your patch request.

Before trying your patch I wanted to have the same situation. Unfortunately I'm unable to regain the same situation by restoring the old settings that I had when all devices where running out of memory. I've had it running for a day now.

The setting that I restored are:

I'm quite puzzled. I'll let this faulty configuration running for a time being.

— Reply to this email directly, view it on GitHubhttps://github.com/arendst/Tasmota/issues/16061#issuecomment-1219773228, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AQYDR52WKXHAXAR2O5IXIC3VZZ2GNANCNFSM54PQYJ3Q. You are receiving this because you were mentioned.Message ID: @.***>

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it hasn't any activity in last few weeks. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 1 year ago

This issue was automatically closed because of being stale. Feel free to open a new one if you still experience this problem.

kristoficko commented 7 months ago

Hello,

I have moved all my ESP devices from Tuya to Tasmota. At this moment I have 5 devices running and all of them are restarting every 20min due to the free memory being exhausted throwing Exception 29.

I was following this thread, couple weeks ago realized I had a container with NTP running (in a different subnet), disabled and removed the container and the devices stopped restarting.

Unfortunately after 10, maybe 14 days the issue re-appeared (after a power outage) and now I can't seem to figure out what the root cause is.

My wifi is provided by multiple Mikrotik APs.

Analyzing network traffic I can see the Tasmota devices are connecting to the NTP server specified in DHCP option 42...

Kindly requesting help.

Mike

NielsPiersma commented 7 months ago

Mike,

Just for eliminating the obvious, can you pls configure a static IP address and configure the ntp server address manually and retest.

If the issue persists it is not related to the DHCp issue that was resolved last year .

Kind regards Niels


From: kristoficko @.> Sent: Wednesday, January 10, 2024 10:42:15 AM To: arendst/Tasmota @.> Cc: NielsPiersma @.>; Mention @.> Subject: Re: [arendst/Tasmota] Sonoff devices reboot every ~20 minutes with "Exception":29 caused by NTP multicast/manycast (Issue #16061)

Hello,

I have moved all my ESP devices from Tuya to Tasmota. At this moment I have 5 devices running and all of them are restarting every 20min due to the free memory being exhausted throwing Exception 29.

I was following this thread, couple weeks ago realize I had a container with NTP running, disabled and removed the container and the devices stopped restarting.

Unfortunately after 10 days or so the issue re-appeared and I can't seem to figure out what the root cause is.

My wifi is provided by multiple Mikrotik APs.

Analyzing network traffic I can see the Tasmota devices are connecting to the NTP server specified in DHCP option 42...

Kindly requesting help.

Mike

— Reply to this email directly, view it on GitHubhttps://github.com/arendst/Tasmota/issues/16061#issuecomment-1884601127, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AQYDR5Z7MCZIFIJFPILUDBTYNZWAPAVCNFSM54PQYJ32U5DIOJSWCZC7NNSXTN2JONZXKZKDN5WW2ZLOOQ5TCOBYGQ3DAMJRGI3Q. You are receiving this because you were mentioned.Message ID: @.***>

kristoficko commented 7 months ago

Hi Niels,

many thanks for the suggestion, I did setup static address on all 5 devices. Although it's a bit early to tell, I can already see the following improvement:

  1. Heap is not decreasing constantly but rather oscillating between 21-23
  2. all devices have already passed an uptime longer than 20min

It doesn't seem to make a difference whether I setup my own internal NTP server or leave the 3 default ones.

One more observation, which may be relevant to this issue - there seems to be a massive difference in the frequency of NTP requests:

Kind regards

Mike

Jason2866 commented 7 months ago

DHCP option 42 for Timeserver is since a few major versions not used anymore in Tasmota. It introduced more issues than it solved. Many consumer routers do not support DHCP option 42. So the NTP Server has to be configured in Tasmota. When this is not done, Tasmota is not working correctly and can crash. Static IP address is not needed.

kristoficko commented 6 months ago

Hi Jason, I did a few tests, exporting the DHCP traffic from my Mikrotik router into wireshark and I beg to differ. This is what I captured:

  1. DHCP Discover from tasmota device contains in DHCP Option 55 (Parameter Request List) also DHCP Option 42 (Network Time Procotocol Servers)
  2. Mikrotik sends DHCP 42 Option in the DHCP Offer
  3. DHCP Request from tasmota again contains Option 42 inside option 55
  4. DHCP ACK from the router confirms previously offered NTP servers in Option 42
  5. Next Tasmota sends multiple requests to NTP servers - one to the server offered by my router in option 42, next to one of the public NTP servers that come preconfigured in the firmware.

This confirms that tasmota not only actively requests option 42 from the router, it also actively uses the NTP server offered from DHCP, 2024-01-29 15_55_25-tasmota_DHCP_NTP pcapng