Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔹
https://www.zigbee2mqtt.io
GNU General Public License v3.0

Zigbee2MQTT crash several times a day #15755

Closed jlpower68 closed 1 year ago

jlpower68 commented 1 year ago

What happened?

Zigbee2MQTT crashes randomly 1 to 10 times a day. After restarting the Docker container, it works fine again for a few hours.

I've attached the latest log file; the most recent crashes occurred at 2022-12-26 02:09:41 and 2022-12-26 06:44:55.

Thanks

What did you expect to happen?

No response

How to reproduce it (minimal and precise)

No response

Zigbee2MQTT version

1.28.4 commit: 52e545f (also tested with 1.28.4-dev commit: 05cb140)

Adapter firmware version

6.10.3.0 build 297

Adapter

Sonoff Zigbee 3.0 USB Dongle Plus V2 model "ZBDongle-E"

Debug log

log.txt

t112013 commented 1 year ago

I have the same issue. I really regret moving to this dongle :( None of my remotes work, and the IKEA remote drains its battery.

tjarmann commented 1 year ago

Same problem here, seems to be related to the ZBDongle-E. I had a Conbee II and used groups and touchlink with remotes, but after changing to ZBD-E this does not work.

itchannel commented 1 year ago

I have also had the same issue since moving to the ZBDongle-E. I've tried both the Docker version and the Linux version (master and dev), but both result in random crashes 2-3 times a day. I've attached a full log so you can see where it's crashing. It's nearly 24 hours' worth, but the crash appears when I start seeing the following errors:

Error while parsing to ZpiObject 'RangeError: Attempt to access memory outside buffer bounds

Shortly followed up by

  zigbee-herdsman:adapter:ezsp:uart Opening SerialPort with /dev/serial/by-id/usb-ITEAD_SONOFF_Zigbee_3.0_USB_Dongle_Plus_V2_20220714103031-if00 and {"baudRate":115200,"rtscts":false,"autoOpen":false} +6s
  zigbee-herdsman:adapter:ezsp:erro Reset error Error: Error while opening serialport 'Error: Error Resource temporarily unavailable Cannot lock port'
  zigbee-herdsman:adapter:ezsp:erro     at SerialPort.<anonymous> (/opt/zigbee2mqtt/node_modules/zigbee-herdsman/src/adapter/ezsp/driver/uart.ts:90:28)
  zigbee-herdsman:adapter:ezsp:erro     at SerialPort._error (/opt/zigbee2mqtt/node_modules/@serialport/stream/lib/index.js:198:14)
  zigbee-herdsman:adapter:ezsp:erro     at /opt/zigbee2mqtt/node_modules/@serialport/stream/lib/index.js:242:12 +6s
  zigbee-herdsman:adapter:ezsp:driv Pause 60sec before try 141 +1s
  zigbee-herdsman:adapter:ezsp:driv Reset connection. Try 141 +54s
  zigbee-herdsman:adapter:ezsp:driv Stop driver +1ms
  zigbee-herdsman:adapter:ezsp:ezsp Stop ezsp +55s
  zigbee-herdsman:adapter:ezsp:ezsp Close ezsp +0ms
  zigbee-herdsman:adapter:ezsp:driv Close driver +0ms

This is on a fresh Raspbian build with nothing else using the adapter.

herdlog.txt

Any help is greatly appreciated thanks.
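For anyone comparing their own logs against the excerpts above, here is a minimal sketch that scans a log for the two error messages quoted in this comment. The function name `find_signatures` and the labels `parse_error` and `port_locked` are made up for illustration; only the matched substrings come from the log excerpts.

```python
from typing import Iterable, List, Tuple

# Substrings copied from the log excerpts quoted in this thread.
# The dictionary keys are hypothetical labels, not Zigbee2MQTT terms.
SIGNATURES = {
    "parse_error": "Attempt to access memory outside buffer bounds",
    "port_locked": "Cannot lock port",
}


def find_signatures(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, label) for every log line containing a known signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for label, needle in SIGNATURES.items():
            if needle in line:
                hits.append((number, label))
    return hits
```

It accepts any iterable of lines, so an open file handle for a log such as herdlog.txt can be passed in directly; the line numbers it returns show where the parse errors start relative to the serial port failures.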

panhans commented 1 year ago

Same issue here. I also see some devices leave the network on disconnect. I have an automation that restarts Z2M if a device becomes unknown.

t112013 commented 1 year ago

Oh, could you share your automation, please?

jlpower68 commented 1 year ago

FYI, I replaced my ZBDongle-E with a ZBDongle-P yesterday evening (thanks to Amazon and its great return policy :-), and it was still working this morning without any automatic restart of the Docker container. Of course, this is not as intellectually satisfying as fixing the problem with the experimentally supported ZBDongle-E, but sometimes investing 30 euros to save a lot of hours is wise :-).

Koenkk commented 1 year ago

Switching to the ZBDongle-P should indeed fix these issues. As stated in the docs (https://www.zigbee2mqtt.io/guide/adapters/), support for the ZBDongle-E is still experimental.

t112013 commented 1 year ago

Is there any way to not lose the pairing, or do I have to re-pair all 60 of my devices? What would I lose? Zigbee 3.0 support? Why is the E model even considered better than the P? What other dongle could I use? What is the "best" recommended dongle?

jlpower68 commented 1 year ago

Here is the list of recommended adapters: https://www.zigbee2mqtt.io/guide/adapters/ It also explains the difference between the two families, which comes down to the chipset. I initially chose EFRxxxx because those chips are supposed to be open to Matter migration. At the beginning of that page there is an explanation of how to migrate from one adapter to another without redoing the pairing. I don't know whether it works when you change chipsets. I have fewer than 10 devices, so I re-paired them.

andy-81 commented 1 year ago

I have had this exact issue, so I purchased the ZBDongle-P to solve the problem.

While trying to switch, I managed to mess up the setup and ended up having to restore the backup I had made of Z2M and put my old dongle back in temporarily to get back to a working state. I say temporarily, but I have been running with it for the last couple of days; for some reason, after re-pairing my devices everything seems stable again. I'm not sure what has changed, but the system hasn't stopped responding since Friday, when I restored my backup and reconfigured everything.

In case it helps: I am running Z2M in HA Supervisor (hence the issue with downgrading to the earlier version) on an RPi 4, so I made a backup of the USB drive before starting the switch to the new dongle. After backing up the drive I tried to switch to the new/older ZBDongle-P, which showed up; however, I had issues getting it to pair with any of the devices. It would enter pairing mode, but the frontend never showed it connecting to the devices. So I tried ZHA in HA, which also failed even after resetting the dongle. I did the same with the original ZBDongle-E, which reset it to factory settings (hence why I had to re-pair after restoring). I am guessing this is what may have solved the issue for me: after restoring HA from the backup I had made beforehand, I put the original ZBDongle-E back into the setup and re-paired the devices to make sure it was working, and it has been stable since then. My issues started after Z2M upgraded from 1.27 to 1.28.

This is not useful to those with a lot of Zigbee devices, but hopefully it points someone to where the issue is. To me it seems related to the ZBDongle-E having been set up under an older version of Z2M before upgrading, which may put something out of line. I'm not as technical as I used to be, but hopefully that makes sense to someone.

For now I am sticking with the E version of the dongle after my failed attempt to switch to the P version; maybe I will look to switch again if my setup becomes unstable again.

panhans commented 1 year ago

The latest version (1.29.0-1) seems to run stable with the Dongle-E. The issue hasn't appeared in the two days since I updated. I think I will stick with that version until the newer dongle leaves the experimental state. I don't want to downgrade the hardware. The only other option is the SkyConnect dongle, but that has the same issues with Z2M. I only came to Z2M because of the bad TRV support in ZHA.

xm4rcell0x commented 1 year ago

Any news? Still working?

andy-81 commented 1 year ago

My crashes have come back, not as often but they do occur.

I have created a sensor to detect the Z2M going offline and restart the server for me. It isn't ideal but it at least means my network stays running.

I am sure there is a better way to do it, but I have set up a simple sensor which looks at the last_seen entity for one of my switches I always have on (collecting the power usage of our freezer) and if it hasn't checked in for 90 seconds I have a script which then restarts Z2M for me.
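The check described above comes down to comparing a device's last_seen timestamp against a 90-second limit. Here is a minimal sketch of that comparison in plain Python (not the Home Assistant sensor itself), assuming the last_seen value is available as an ISO 8601 string. The function name `is_stale` and the treatment of naive timestamps as UTC are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# The 90-second limit comes from the comment above; the constant name is illustrative.
STALE_AFTER = timedelta(seconds=90)


def is_stale(last_seen_iso: str, now: datetime, limit: timedelta = STALE_AFTER) -> bool:
    """Return True when the device has not reported within `limit`.

    `now` must be a timezone-aware datetime.
    """
    # Older Python versions do not accept a trailing "Z", so normalise it first.
    last_seen = datetime.fromisoformat(last_seen_iso.replace("Z", "+00:00"))
    if last_seen.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        last_seen = last_seen.replace(tzinfo=timezone.utc)
    return now - last_seen > limit
```

A caller would pass `datetime.now(timezone.utc)` as `now`. Picking a device that reports at least once a minute, as suggested elsewhere in this thread, keeps the 90-second limit from firing on normal gaps between reports.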

t112013 commented 1 year ago

Hello. Could you share how you made the sensor, and what your automation looks like? I have the same issue. After the last update it was fine for a few days, then it crashed yesterday; so far it has only happened a few times.

andy-81 commented 1 year ago

My sensor in the config looks like this: binary_sensor:

If you already have a binary_sensor section, place the code under it and remove the top line.

This will create an on/off entity in HA that you can use in the automation. I found someone else doing something similar but couldn't get their way to work, so I ended up doing it this way. If you have a device which is always online and reports in at least once every minute, use that device; in my case it is a power socket which I use for metering my freezer (hence the name).

I also went into the device in HA and enabled the last seen option. If you don't have the option, you'll need to enable it in Z2M as well. If you struggle, ping me back and I'll put full documentation together.

This is what the sensor looks like if working correctly: [screenshot]

Once you have that sensor set up, the automation is pretty straightforward. I created it in the GUI, but you can also use the YAML below, which is a copy of what mine created: alias: Z2M monitor description: "" trigger:

The screenshots for the GUI are: [two screenshots]

As you can see, I also have an alert which goes to my iPhone (the name is based on Cockney rhyming slang for phone) so I know the automation ran.

Hope that helps. I am sure there are people on here who could come up with a better way of doing it, as I am not a programmer, but I found the above worked for me and has kept my network running with minimal downtime.

It is worth remembering to turn the automation off when you upgrade Z2M; it doesn't break anything, but it will cause Z2M to restart a second time after the update.
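One possible way to avoid that second restart without turning the automation off is to skip restarts for a short period after the previous one. The sketch below is not part of the setup described above. The 300-second cooldown, the function names, and the Docker container name `zigbee2mqtt` are all hypothetical, and `docker restart` only applies to Docker installs such as the one in the original report, not to the HA add-on.

```python
import subprocess
from typing import Optional

# Hypothetical grace period; tune it to how long an upgrade or restart takes.
COOLDOWN_SECONDS = 300.0


def should_restart(stale: bool, last_restart: Optional[float], now: float,
                   cooldown: float = COOLDOWN_SECONDS) -> bool:
    """Restart only if the watched device is stale and no restart happened recently.

    `last_restart` and `now` are timestamps in seconds (for example from time.time()).
    """
    if not stale:
        return False
    return last_restart is None or now - last_restart >= cooldown


def restart_container(name: str = "zigbee2mqtt") -> None:
    """Restart a Docker container. The default name is a placeholder."""
    subprocess.run(["docker", "restart", name], check=True)
```

Recording the time of an upgrade as `last_restart` gives the same effect as temporarily disabling the automation.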

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days