Closed ashishpandey closed 6 months ago
Fortunately, I cannot confirm this bug.
I am using Z2M at two locations, in a pure Docker/managed mode, with two UZG01 on Ethernet (not Wi-Fi) and USB power (no PoE).
I am using DHCP with a long renewal time (4h30min). Everything is running on electrical backup.
The UZG01 device uptime is 3 days / 5 days (I upgraded firmware 3 days ago) and before I had uptimes of a month. Z2M uptime is one day because I upgraded yesterday.
I think UZG uptime is not the issue here. My UZG stays up, but the connection between it and zigbee2mqtt drops for me at DHCP lease renewal.
I see the Socket Uptime being capped at 1d in the screenshots above. Where is that limit coming from?
Hello ashishpandey, sorry for my late reply. Socket time is 1 day because I upgraded Z2M on both machines.
Docker shows an uptime of Z2M of several days.
I agree with you that I should look more carefully at the logs, so I sent everything to a syslog server. What should I look for? Normally, it should display "error" or something similar.
Here is my error log from the last 48 hours from syslog: it only shows that some devices cannot be pinged. This is because they are in a remote location and I did not install a secondary UZG01 there. But I don't see any sign of UZG01 failure.
Please note that the UZG01 does not run on PoE (yet) and I am using the Zigbee firmware the device shipped with. Versions are shown in my previous answer.
@ffries, how do you set up the UZG to log to a syslog server? I can go back to DHCP mode and try to reproduce the logs, but I don't know how to configure syslog.
@xyzroe can you have a look at dhcp?
What do you want me to look at? Do you know how many hundreds of devices work all over the world? There have never been problems with DHCP.
I think that @ashishpandey has some problems with his network. These problems make the UZG reconnect to the network, but not successfully every time.
@xyzroe indeed, there is some problem with either the network or the UZG, I am just trying to identify what it is. I am not assuming it is a problem with UZG firmware, hardware or something external so far, but the problem exists for sure
If you look at the linked issue from zigbee2mqtt, multiple people have reported this, and narrowed it down to being worked around by switching to static IP on UZG. So we must all have something common going on. It is not isolated to my own network
A problem does not exist until identified / acknowledged / investigated. If it turns out to be something external to UZG, we will all learn something to mitigate it. If it turns out to be with UZG, the product can improve. Both are good for end users of UZG. How can we investigate and get to the root cause?
If it helps, I am running pfSense as the DHCP server. I also have hundreds of devices on the network, and don't see anything generally unsatisfactory with the network. But I am happy to investigate why my UZG becomes unavailable to zigbee2mqtt at DHCP lease renewal time. Some of the other posters in the zigbee2mqtt issue also use pfSense; it is a very widely used product itself.
I've been using DHCP on Zigstar since release. I don't have any such problems.
But what's even more surprising is that if I simulate problems with the network (for example, by disconnecting the switch that the Zigstar is connected to), zigbee2mqtt tries to reconnect and succeeds. The same thing happens if I reboot the Zigstar while it is working.
This seems to be in the territory of a "works on my machine" type of problem. What can we do to help investigate what we are seeing? Are there any logs? @ffries mentioned syslog, and I am curious whether I can look at that.
The zigbee2mqtt reconnect behaviour you mention is also interesting. What some of us have been seeing is that zigbee2mqtt quits when the adapter disconnects (the linked issue is essentially that). I am on zigbee2mqtt 1.35.0 (the issue was reported at 1.34.0-1). The same happens if I restart the UZG.
I have switched the UZG back to DHCP for now, to capture more of the restarts at the zigbee2mqtt end, where I can see some logs. It is reproducible every 2 hours for me (or whatever I set the DHCP lease time to). Unfortunately, I only see "Adapter disconnected, stopping" in the zigbee2mqtt logs.
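One way to gather more evidence than the single zigbee2mqtt log line is to record when the adapter stops being reachable and compare those timestamps with the lease renewal times on the DHCP server. The following is a minimal sketch in Python under stated assumptions: `probe` and `watch` are hypothetical helper names, the address and port in the usage note are placeholders, and the script only detects gaps in reachability, not a reset of an already established socket.

```python
import socket
import time
from datetime import datetime, timezone


def probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(host, port, interval=5.0, max_checks=None,
          check=probe, sleep=time.sleep):
    """Poll the adapter and record every change in reachability.

    Returns a list of (UTC timestamp, reachable) tuples, one per transition.
    With max_checks=None the loop runs until interrupted.
    """
    events = []
    last = None
    checks = 0
    while max_checks is None or checks < max_checks:
        up = check(host, port)
        if up != last:
            stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
            events.append((stamp, up))
            print(f"{stamp} adapter {'reachable' if up else 'UNREACHABLE'}")
            last = up
        checks += 1
        sleep(interval)
    return events
```

Running something like `watch("192.168.1.50", 80, interval=2.0)` on the zigbee2mqtt host prints one line per transition. Port 80 (the adapter's web interface) is an assumption; probing it rather than the Zigbee socket port is a deliberate choice to avoid interfering with zigbee2mqtt's own connection.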
The system log he mentioned is just z2m, take a closer look at @ffries screenshot.
If you run z2m as an add-on in HA, this is natural behavior. If you are running plain z2m, it is your job to take care of restarting it if it crashes.
After receiving a new DHCP lease, the esp32 restarts the network interface. This is typical behavior for the libraries used in the project. Naturally, after the network restarts, all connections need to be re-established. In my case, as for most users, this is done by the add-on mechanism in HA, which simply restarts z2m if it has stopped. So you can use a static address, or change the lifetime of the DHCP lease to 10+ years, but the most correct way is to ensure that z2m is restarted in case of a crash. Because a socket connection to the adapter is used, breaks will happen sooner or later. In your case, this will stop the entire Zigbee network.
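For setups without the HA add-on, the "restart z2m when it stops" idea can be sketched as a small supervisor loop. This is only an illustration in Python, not how the add-on is implemented: `supervise` is a hypothetical helper, and the command passed to it depends on how zigbee2mqtt is started on your system.

```python
import subprocess
import time


def supervise(cmd, max_restarts=None, base_delay=1.0, max_delay=60.0,
              sleep=time.sleep):
    """Run cmd and start it again every time it exits.

    The wait between restarts doubles up to max_delay, and drops back to
    base_delay after a run that lasted longer than max_delay seconds.
    With max_restarts=None the loop never returns; otherwise the list of
    exit codes is returned after the initial run plus max_restarts restarts.
    """
    codes = []
    delay = base_delay
    while True:
        started = time.monotonic()
        code = subprocess.run(cmd).returncode  # blocks until the process exits
        codes.append(code)
        if max_restarts is not None and len(codes) > max_restarts:
            return codes
        if time.monotonic() - started > max_delay:
            delay = base_delay  # the process was stable, so reset the backoff
        sleep(delay)
        delay = min(delay * 2, max_delay)
```

A call such as `supervise(["npm", "start"])` from the zigbee2mqtt directory would keep it running, assuming that is how it is launched. Docker restart policies and systemd's `Restart=` option provide the same behavior without a custom script.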
If the DHCP renewal results in a new IP, then it makes sense to restart the network interface. If, however, the IP stays the same (which is true in all the reported cases, with a permanent/static reservation on the DHCP server), then the network interface should not restart. All my network-based equipment keeps its network up and running even after countless DHCP renewals. The UZG is the only one that drops the connections.
Automatic restart of z2m in event of a crash is something all should have configured (either with the HA addon or some other way). However, a restart takes time and during that period the zigbee network will not function properly. In my opinion, this is not an acceptable solution to this issue.
Disabling DHCP in the UZG has turned out to be the only stable solution for me, but I still feel like DHCP should work better, i.e. renewals should not restart the network interface.
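The behavior argued for here can be written down as a simple rule: reconfigure the interface only when the renewed lease differs from the current one. The sketch below is a Python illustration of that rule only; `Lease` and `needs_interface_restart` are hypothetical names, and this is not the UZG firmware's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Lease:
    """The address parameters a client receives from the DHCP server."""
    ip: str
    netmask: str
    gateway: str


def needs_interface_restart(old: Optional[Lease], new: Lease) -> bool:
    """Return True only if the interface has to be reconfigured.

    A renewal that hands back identical parameters just extends the lease,
    so established connections can stay up.
    """
    if old is None:
        return True  # first lease: the interface has no configuration yet
    return old != new  # dataclass equality compares ip, netmask and gateway
```

With a static reservation, every renewal returns the same `Lease`, so `needs_interface_restart` stays `False` and no socket has to be torn down.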
I just ran some tests on my UZG-01. What I found: the socket connection doesn't drop during a DHCP renewal. The first two screenshots were taken just after the DHCP update. The second two were taken just after z2m started.
I was wrong about how ZigStar handles DHCP renewal: it doesn't drop the connection. So I don't know how to reproduce your behavior. I'm using Mikrotik as the DHCP server.
All users reporting this are using pfSense.
I will switch to a DHCP server outside of pfSense and test with that. Will report back my findings.
For what it is worth, I have been running a UZG-01 with a static assignment via pfSense and a DHCP lease time of 2 hours for a few weeks now without any drops.
Which version of pfSense and which DHCP daemon are you running? pfSense+ 23.09 added Kea DHCP as an opt-in feature preview, which is what I am running on a Netgate SG-3100.
Hi, I have Mikrotik as the DHCP server. I had a static IP configured via DHCP and experienced the stability issues. Now I have configured the static IP on my UZG and it looks like most of my problems are gone.
I had a lot of problems with motion sensors and light automation. Lights wouldn't turn on on motion and turned on by themselves at night (super anti sleep therapy ...)
So basically Mikrotik is also causing the problem.
Are you saying that changing the IP address settings affected the stability and speed of the entire Zigbee network? I think you're wrong. Something else changed while you were changing your IP settings.
I am having the same issue with the UZG-01 even with a static IP setup. Today it broke twice.
On my end I am using Mikrotik + a Netgear PoE switch.
Any ideas how to enable remote syslog to see what's happening and what ends up killing z2m?
What Zigbee chip do you have? The P7 has another problem with the same behavior.
@xyzroe I believe it's P7 since my device was ordered 2 weeks ago.
@fliespl Please post the full list of the Zigbee devices in your network. It seems to be an issue with the P7 firmware, which will be resolved soon by Koenkk.
@mercenaruss will that help, or do you need something else?
Also... I updated the UZG-01 to version 0.2.0 two days ago and I haven't had to restart it yet... Will let you know if it happens again.
Total 54
By device type
End devices: 35
Router: 19
By power source
Battery: 35
Mains (single phase): 18
DC Source: 1
By vendor
LUMI: 11
IKEA of Sweden: 10
Danfoss: 4
_TZ3000_dowj6gyi: 3
_TZ3000_gvn91tmx: 3
HEIMAN: 2
_TZE200_81isopgh: 2
_TZ3000_mrpevh8p: 2
_TZE200_znbl8dj5: 1
_TZ3000_xabckq1v: 1
_TZ3000_gjnozsaz: 1
_TZ3000_mg4dy6z6: 1
_TZE204_t1blo2bj: 1
_TZE204_ztc6ggyl: 1
_TZ3000_ja5osu5g: 1
_TZE200_ga1maeof: 1
_TZE204_k7mfgaen: 1
_TZE204_sooucan5: 1
_TZE200_hl0ss9oa: 1
_TZE204_sbyx0lm6: 1
_TZ3000_fa9mlvja: 1
Danfoss: 1
_TZE200_9yapgbuv: 1
_TZ3000_saiqcn0y: 1
_TZ3000_bguser20: 1
By model
TS0601: 10
TS0201: 5
TS011F: 4
lumi.magnet.acn001: 4
eTRV0103: 4
TRADFRIbulbGU10WS345lm: 3
TS0041: 3
TRADFRIbulbE27WSglobeclear806lm: 2
lumi.sensor_wleak.aq1: 2
TS004F: 2
lumi.sensor_magnet.aq2: 2
SmokeSensor-EM: 2
Remote Control N2: 1
TRADFRI Driver 30W: 1
lumi.plug.maeu01: 1
TRADFRI motion sensor: 1
lumi.sensor_cube.aqgl01: 1
TS0202: 1
TRADFRI bulb E27 CWS 806lm: 1
lumi.magnet.ac01: 1
STARKVIND Air purifier: 1
TS0225: 1
TRV003: 1
I have switched to another DHCP server, and the problem seems to be fixed. There might be a problem with ISC DHCP (which was shipped with pfSense earlier). ISC DHCP has reached end-of-life and is replaced with Kea DHCP in pfSense.
Migrating to Kea DHCP is quite easy (push of a button) in pfSense, so I recommend doing that if you experience problems with DHCP.
@mercenaruss do you know if the P7 problem was resolved? I had a connection break about 3 times this week (with a static IP).
It seems UZG-01 drops client connections while renewing DHCP leases
This causes disruption to clients like zigbee2mqtt, see related issue here where some people have reproduced this reliably: https://github.com/Koenkk/zigbee2mqtt/issues/20148
Switching UZG-01 to static IP solves the issue, further confirming the diagnosis