Koenkk / Z-Stack-firmware

Compilation instructions and hex files for Z-Stack firmwares
MIT License

Z-Stack_3.x.0 coordinator 20230507 feedback #439

Closed Koenkk closed 1 year ago

Koenkk commented 1 year ago

Please provide your feedback to the Z-Stack_3.x.0 coordinator 20230507 firmware here.

I hope this solves the NWK_TABLE_FULL many people were experiencing.

Changelog

20230507

  • Enable child aging to fix issues like #13478 (but not for older Xiaomi devices as they do not implement child aging correctly which gets them kicked out of the network)
  • Increase message timeout from 7 to 8 seconds to increase message delivery success rate for devices using a 7.5-second poll interval (#13478)
  • Improve performance with larger networks
    • Optimize table sizes
    • Increase stack_size from 1024 to 8192
  • Add firmware for CC1352P7
  • SimpleLink SDK 7.10.00.98
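
The timeout change in the changelog can be illustrated with a little arithmetic. The sketch below assumes a simplified model that is not taken from the firmware: a message is queued at a uniformly random moment within the end device's poll cycle, and it is delivered only if the next poll arrives before the message timeout expires. The function name `delivery_fraction` is hypothetical.

```python
def delivery_fraction(poll_interval_s: float, timeout_s: float) -> float:
    """Fraction of queued messages picked up before the timeout.

    Simplified model (an assumption, not firmware behaviour): the wait until
    the next poll is uniform in [0, poll_interval_s], and a message survives
    only if that wait does not exceed timeout_s.
    """
    if poll_interval_s <= 0:
        raise ValueError("poll interval must be positive")
    usable = max(0.0, min(timeout_s, poll_interval_s))
    return usable / poll_interval_s


if __name__ == "__main__":
    for timeout in (7.0, 8.0):
        share = delivery_fraction(7.5, timeout)
        print(f"timeout {timeout:.0f} s -> {share:.1%} delivered")
```

Under this model a 7-second timeout loses about one message in fifteen to a device that polls every 7.5 seconds, while an 8-second timeout covers the whole poll cycle.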

Download

johnlento commented 1 year ago

So I finally bit the bullet, went to 0507, and started to re-pair all devices. A lot of them fall offline quickly, well before the 8-hour timeout. Also, it seems like my Home Assistant instance is crashing. I’m not sure whether the firmware is the cause, but I see a lot of timeouts and watchdog failures. Has anyone else seen this? 0411 was stable for a good week and a half at a time, but 0507 just seems to wreck me.

harryfine wrote:

I had the same problem. It seems that with this new version the flasher program doesn't work; the developer has posted instructions for doing it using another method. Once you learn the other method, it's even easier because you don't have to take the dongle apart to put it into boot mode. You have to make sure Python and a few other things are installed, and then it's a simple command line that updates it.

azsystem wrote:

How did you perform the update? I have tried it using FlashProgrammer 2 and it failed


aurimasniekis commented 1 year ago

I upgraded to this firmware. I thought the TubeZB PoE module had the latest version, as it came with a note stating that the latest firmware was flashed and I ordered it only a week ago, but it had version 20221226. Before the upgrade I had a lot of issues with NWK_TABLE_FULL; after the upgrade it seems better and more responsive, but I am still getting NWK_TABLE_FULL errors:

My setup is this:

Zigbee2MQTT version: [1.31.2](https://github.com/Koenkk/zigbee2mqtt/releases/tag/1.31.2) commit: [unknown](https://github.com/Koenkk/zigbee2mqtt/commit/unknown)
Coordinator type: zStack3x0
Coordinator revision: 20230507
Coordinator IEEE Address: 0x00124b002a2e354a
Frontend version: 0.6.129

## Stats
**Total**: 97

### By device type
**Router**: 97

### By power source
**Mains (single phase)**: 97

### By vendor
**IKEA of Sweden**: 77
**_TZE200_holel4dk**: 11
**_TZE200_jva8ink8**: 8
**SONOFF**: 1

### By model
**TRADFRIbulbE27WSglobeclear806lm**: 56
**TRADFRI bulb E14 WS globe 470lm**: 20
**TS0601**: 19
**TRADFRI bulb E27 WW 806lm**: 1
**DONGLE-E_R**: 1

After update I am still getting these errors:

{"level":"error","message":"Publish 'set' 'brightness' to 'lamp22' failed: 'Error: Command 0x003c84fffe17004c/1 genLevelCtrl.moveToLevelWithOnOff({\"level\":3,\"transtime\":30}, {\"sendWhen\":\"immediate\",\"timeout\":10000,\"disableResponse\":false,\"disableRecovery\":false,\"disableDefaultResponse\":false,\"direction\":0,\"srcEndpoint\":null,\"reservedBits\":0,\"manufacturerCode\":null,\"transactionSequenceNumber\":null,\"writeUndiv\":false}) failed (SREQ '--> ZDO - extRouteDisc - {\"dstAddr\":11894,\"options\":0,\"radius\":30}' failed with status '(0xc7: NWK_TABLE_FULL)' (expected '(0x00: SUCCESS)'))'"}
{"level":"error","message":"Publish 'set' 'brightness' to 'lamp74' failed: 'Error: Command 0xdc8e95fffe4a1d64/1 genLevelCtrl.moveToLevelWithOnOff({\"level\":3,\"transtime\":30}, {\"sendWhen\":\"immediate\",\"timeout\":10000,\"disableResponse\":false,\"disableRecovery\":false,\"disableDefaultResponse\":false,\"direction\":0,\"srcEndpoint\":null,\"reservedBits\":0,\"manufacturerCode\":null,\"transactionSequenceNumber\":null,\"writeUndiv\":false}) failed (SREQ '--> ZDO - extRouteDisc - {\"dstAddr\":42711,\"options\":0,\"radius\":30}' failed with status '(0xc7: NWK_TABLE_FULL)' (expected '(0x00: SUCCESS)'))'"}
{"level":"error","message":"Publish 'set' 'brightness' to 'lamp84' failed: 'Error: Command 0x003c84fffe2e0e03/1 genLevelCtrl.moveToLevelWithOnOff({\"level\":254,\"transtime\":15}, {\"sendWhen\":\"immediate\",\"timeout\":10000,\"disableResponse\":false,\"disableRecovery\":false,\"disableDefaultResponse\":false,\"direction\":0,\"srcEndpoint\":null,\"reservedBits\":0,\"manufacturerCode\":null,\"transactionSequenceNumber\":null,\"writeUndiv\":false}) failed (Timeout - 40674 - 1 - 134 - 8 - 11 after 10000ms)'"}
{"level":"error","message":"Publish 'set' 'brightness' to 'lamp72' failed: 'Error: Command 0x84b4dbfffeec28ea/1 genLevelCtrl.moveToLevelWithOnOff({\"level\":254,\"transtime\":15}, {\"sendWhen\":\"immediate\",\"timeout\":10000,\"disableResponse\":false,\"disableRecovery\":false,\"disableDefaultResponse\":false,\"direction\":0,\"srcEndpoint\":null,\"reservedBits\":0,\"manufacturerCode\":null,\"transactionSequenceNumber\":null,\"writeUndiv\":false}) failed (SREQ '--> ZDO - extRouteDisc - {\"dstAddr\":19527,\"options\":0,\"radius\":30}' failed with status '(0xc7: NWK_TABLE_FULL)' (expected '(0x00: SUCCESS)'))'"}
{"level":"error","message":"Publish 'set' 'brightness' to 'lamp10' failed: 'Error: Command 0x84b4dbfffeebfae0/1 genLevelCtrl.moveToLevelWithOnOff({\"level\":254,\"transtime\":0}, {\"sendWhen\":\"immediate\",\"timeout\":10000,\"disableResponse\":false,\"disableRecovery\":false,\"disableDefaultResponse\":false,\"direction\":0,\"srcEndpoint\":null,\"reservedBits\":0,\"manufacturerCode\":null,\"transactionSequenceNumber\":null,\"writeUndiv\":false}) failed (SREQ '--> ZDO - extRouteDisc - {\"dstAddr\":38827,\"options\":0,\"radius\":30}' failed with status '(0xc7: NWK_TABLE_FULL)' (expected '(0x00: SUCCESS)'))'"}
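
To see how the failures above split between NWK_TABLE_FULL and plain timeouts, the JSON log lines can be tallied. This is a minimal sketch that assumes one JSON object per line with `level` and `message` fields, as in the excerpt; the function names are hypothetical and the sample messages are abbreviated.

```python
import json
import re
from collections import Counter

# Matches the status text printed in the log, e.g. "(0xc7: NWK_TABLE_FULL)".
STATUS_RE = re.compile(r"failed with status '\(0x[0-9a-fA-F]+: ([A-Z_]+)\)'")


def classify(message: str) -> str:
    """Return a coarse failure label for one error message."""
    match = STATUS_RE.search(message)
    if match:
        return match.group(1)
    if "(Timeout - " in message:
        return "TIMEOUT"
    return "OTHER"


def tally(lines):
    """Count failure labels across JSON log lines whose level is 'error'."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if isinstance(record, dict) and record.get("level") == "error":
            counts[classify(str(record.get("message", "")))] += 1
    return counts


if __name__ == "__main__":
    # Abbreviated stand-ins for the log lines quoted above.
    sample = [
        json.dumps({"level": "error", "message": "... failed with status '(0xc7: NWK_TABLE_FULL)' (expected '(0x00: SUCCESS)'))'"}),
        json.dumps({"level": "error", "message": "... failed (Timeout - 40674 - 1 - 134 - 8 - 11 after 10000ms)'"}),
    ]
    for label, count in tally(sample).items():
        print(label, count)
```

Feeding a whole log file through `tally` (for example via `open(path)`) gives a quick before/after comparison when switching firmware versions.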

I still need to add like 50 more devices, and I am scared that they will not work

epower53 commented 1 year ago

I've been running 0507 for a little over a month now, and have just recently started to see my entire network drop off of HA all at once. My setup is a little different and may help @Koenkk to figure out the source of the dropout issues. I have z2m, Mosquitto, and Node-RED running on one box, and HA runs on a separate rPi. HA uses the MQTT integration to pull in devices from my Mosquitto instance.

Every so often (it's happened 3 times in the past month, and never happened before I upgraded to 0507 that I recall) all my Zigbee devices suddenly show as unavailable in HA. They are all still online in reality: if I go to the z2m web interface on the other box I can see and control the devices perfectly, and all my Node-RED automations continue to work. It's just HA that's lost its connection. 2 of the 3 instances occurred after an HA core or HAOS update; after the reboot, HA reported all Zigbee devices as unavailable. The solution was to restart z2m, after which HA showed everything online. It seems there's some odd interplay between HA, MQTT, z2m, and the 0507 firmware that's causing this. Everything is up to date with the latest release. I would postulate that people who have posted about mass dropouts of devices are actually seeing the same issue I've seen, but it's harder for them to differentiate because z2m is running as an HA add-on and they don't have as much familiarity with the z2m web interface as a direct-access tool. If they open the interface when HA reports a mass dropout, I'd wager z2m still has control of the network and it's HA that's not reading the keepalive info correctly from MQTT.

aurimasniekis commented 1 year ago

24 hours later, the coordinator has frozen several times, each time requiring a restart, with only timeout errors from z2m. The NWK_TABLE_FULL error is still present at the same rate as before the update.

dlasher commented 1 year ago

Been running 0507 for a few weeks, first lockup yesterday. Power-cycled, and we're back. Had been stable up to that point.

I am seeing a few route messages, but it seems back alive for now.

warn  2023-06-30 09:07:01: Failed to ping 'Pantry.Light' (attempt 1/1, Read 0x000d6ffffefc7260/1 genBasic(["zclVersion"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'No network route' (205)))
warn  2023-06-30 09:07:08: Failed to ping 'EW.Dad.Office.1' (attempt 1/1, Read 0x000d6f00036af3b0/1 genBasic(["zclVersion"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'MAC no ack' (233)))
warn  2023-06-30 09:07:29: Failed to ping 'Enerwave.DaughtersRoom' (attempt 1/1, Read 0x000d6f00034b1053/1 genBasic(["zclVersion"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'MAC no ack' (233)))
warn  2023-06-30 09:07:56: Failed to ping 'EW.Dad.Office.1' (attempt 1/1, Read 0x000d6f00036af3b0/1 genBasic(["zclVersion"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'No network route' (205)))
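
The ping warnings above can be grouped per device to show which nodes keep losing their route. This is a rough sketch that assumes the plain-text line layout shown in the excerpt; the regular expression and the function name are illustrative only, and the sample lines are abbreviated.

```python
import re
from collections import defaultdict

# Captures the device name, the error text, and the numeric code.
PING_RE = re.compile(
    r"Failed to ping '([^']+)'.*"
    r"Data request failed with error: '([^']+)' \((\d+)\)"
)


def failures_by_device(lines):
    """Map each device name to the list of (reason, code) pairs seen for it."""
    result = defaultdict(list)
    for line in lines:
        match = PING_RE.search(line)
        if match:
            device, reason, code = match.groups()
            result[device].append((reason, int(code)))
    return dict(result)


if __name__ == "__main__":
    # Abbreviated stand-ins for the warnings quoted above.
    sample = [
        "warn  2023-06-30 09:07:08: Failed to ping 'EW.Dad.Office.1' (attempt 1/1, ...) failed (Data request failed with error: 'MAC no ack' (233)))",
        "warn  2023-06-30 09:07:56: Failed to ping 'EW.Dad.Office.1' (attempt 1/1, ...) failed (Data request failed with error: 'No network route' (205)))",
    ]
    for device, events in failures_by_device(sample).items():
        print(device, events)
```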

The specifics

Zigbee2MQTT version
[1.31.2](https://github.com/Koenkk/zigbee2mqtt/releases/tag/1.31.2) commit: [21f5125](https://github.com/Koenkk/zigbee2mqtt/commit/21f5125)
Coordinator type: zStack3x0
Coordinator revision: 20230507
Coordinator IEEE Address: <redacted>
Frontend version: 0.6.129
Stats
Total: 72
End devices: 47
Router: 25

pdecat commented 1 year ago

Is it possible the firmware downgrade of my ZZH! adapter from CC2652R_coordinator_20230507.hex back to CC2652R_coordinator_20221226.hex was not complete?

Or that data in my ZZH! adapter is corrupt since the upgrade? Version shown is CC1352/CC2652, Z-Stack 3.30+ (build 20221226)

FWIW, I'm tracing all events in https://github.com/Koenkk/Z-Stack-firmware/issues/439#issuecomment-1591982856 since the first issue.

adampetrovic commented 1 year ago

Running tubeszb-cc2652-poe-2022
Zigbee2MQTT version 1.31.2 commit: 21f5125
Coordinator revision 20230507

Total 102 devices, 2 dedicated routers running slightly outdated firmware (20221102 and 20210128)

Still seeing this in my logs:

2023-07-01 18:10:01 Failed to read state of 'Back Bedroom 1' after reconnect
(Read 0x00158d0004775c1c/1 genOnOff(["onOff"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) 
failed (SREQ '--> ZDO - extRouteDisc - {"dstAddr":65410,"options":0,"radius":30}' 
failed with status '(0xc7: NWK_TABLE_FULL)' (expected '(0x00: SUCCESS)')))

Lots of timeouts when trying to re-pair devices:

Failed to read state of 'Garage 4' after reconnect (Read 0x00158d0004775b49/1 
genOnOff(["onOff"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) 
failed (Timeout - 25509 - 1 - 113 - 6 - 1 after 10000ms))

klada commented 1 year ago

Upgraded from 20220219. No issues so far, running for a week. :+1:

Koenkk commented 1 year ago

For those having NWK_TABLE_FULL or NETWORK_NO_ROUTE, do you have any AwoX devices in your network? (Open your data/database.db and search for AwoX; if there are any, try removing them from the network.)
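
For anyone who prefers a script for this check, the following is a minimal sketch. It assumes that `database.db` holds one JSON object per line; the field names printed in the demo (`ieeeAddr`, `manufName`) are also assumptions, so they are read with `.get` and may come back as `None`. The function name is hypothetical.

```python
import json
import sys
from pathlib import Path


def find_devices(db_path, needle):
    """Return database entries whose raw text mentions `needle` (case-insensitive)."""
    hits = []
    needle = needle.lower()
    for line in Path(db_path).read_text(encoding="utf-8").splitlines():
        if needle not in line.lower():
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            entry = None
        # Keep unparsable or non-object lines visible instead of dropping them.
        hits.append(entry if isinstance(entry, dict) else {"raw": line})
    return hits


if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "data/database.db"
    if Path(path).is_file():
        matches = find_devices(path, "awox")
        print(f"{len(matches)} matching entries")
        for entry in matches:
            print(entry.get("ieeeAddr"), entry.get("manufName"))
    else:
        print(f"{path} not found")
```

A result of zero matching entries means no line of the database mentions AwoX.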

aurimasniekis commented 1 year ago

No, mostly IKEA light bulbs and one manufacturer's presence sensor.

adampetrovic commented 1 year ago

$ grep -i awox database.db | wc -l
0

fsevilla3 commented 1 year ago

For those having NWK_TABLE_FULL or NETWORK_NO_ROUTE, do you have any AwoX devices in your network? (open your data/database.db and search for AwoX, if yes, try removing them from the network.

I don't have any AwoX devices, either:

$ grep -i awox database.db  | wc -l
0

FlyingDomotic commented 1 year ago

Same here...

On 02/07/2023 at 11:39, Federico Sevilla wrote:

$ grep -i awox database.db | wc -l

Koenkk commented 1 year ago

fsevilla3 commented 1 year ago

  • What stack are you using? (HA/Z2M)

Z2M 1.31.2 commit: 21f5125

  • How many devices in your network?

35 devices

  • What was the latest known working firmware version?

20221226

KoKolaj commented 1 year ago

Today I tried to install the 20230507 version on a USB Dongle-P. The device is again non-functional: there is a timeout. After returning to version 20220219, it is fine.

Thank you for your work.

adampetrovic commented 1 year ago

  • What stack are you using? (HA/Z2M)
  • How many devices in your network?

102 devices

  • What was the latest known working firmware version?

It did happen for me on 20220219 but far less often

johnlento commented 1 year ago

Does anyone else’s dongle just crash? I get timeouts and watchdog failures all the time. I just have no idea how to troubleshoot. 156 devices, Home Assistant on a Raspberry Pi, and a zbdongle-p on 0507. I also have two other dongles on the 1229 router firmware.


FlyingDomotic commented 1 year ago

Did you try to disconnect the stick when something goes wrong, or when upgrading firmware? I do it systematically and fixed some strange behaviors this way.

To come back to 20230507, I still have network table full errors, and significantly more "No network route" than with 20221226. Note that amazingly, these errors are often with routers.

Z2M 1.30.4, 117 devices, including 11 routers, slaesh stick.

sjorge commented 1 year ago

It has happened a few times now that part of the mesh becomes unavailable, but nothing of value is logged except:

error 2023-07-05 18:11:21: Publish 'set' 'color_temp' to 'adaptive_ligthing' failed: 'Error: Command 10100 lightingColorCtrl.moveToColorTemp({"colortemp":346,"transtime":30}) failed (SRSP - AF - dataRequestExt after 6000ms)'
debug 2023-07-05 18:11:21: Error: Command 10100 lightingColorCtrl.moveToColorTemp({"colortemp":346,"transtime":30}) failed (SRSP - AF - dataRequestExt after 6000ms)
error 2023-07-05 18:16:21: Publish 'set' 'color_temp' to 'adaptive_ligthing' failed: 'Error: Command 10100 lightingColorCtrl.moveToColorTemp({"colortemp":348,"transtime":30}) failed (SRSP - AF - dataRequestExt after 6000ms)'
debug 2023-07-05 18:16:21: Error: Command 10100 lightingColorCtrl.moveToColorTemp({"colortemp":348,"transtime":30}) failed (SRSP - AF - dataRequestExt after 6000ms)
error 2023-07-05 18:21:21: Publish 'set' 'color_temp' to 'adaptive_ligthing' failed: 'Error: Command 10100 lightingColorCtrl.moveToColorTemp({"colortemp":351,"transtime":30}) failed (SRSP - AF - dataRequestExt after 6000ms)'
debug 2023-07-05 18:21:21: Error: Command 10100 lightingColorCtrl.moveToColorTemp({"colortemp":351,"transtime":30}) failed (SRSP - AF - dataRequestExt after 6000ms)

Not seen those dataRequestExt before. I wiped and reflashed with the same version to see if it helps, it was fine for weeks before.

The only thing that fixes this is to unplug the coordinator, wait, and replug it. Other things like remotes can still talk to all devices that appear offline to z2m.

MattWestb commented 1 year ago

@sjorge I think the network has broken routing because the firmware has been patched outside the standard for broadcasts. Route discovery is done with broadcasts, and these are blocked in the IEEE 802.15.4 network layer of all Zigbee routers if more than 8 broadcasts arrive within 9 seconds. See closed issue https://github.com/Koenkk/Z-Stack-firmware/issues/443

Is unicast working for routers that are on the first level, directly connected to the coordinator?
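
The broadcast limit described in this comment can be pictured with a toy sliding-window model. The figures (8 broadcasts per 9 seconds) are taken from the comment; the class below is a hypothetical illustration, not firmware code, and a real router's broadcast handling is more involved.

```python
from collections import deque


class BroadcastBudget:
    """Toy sliding-window limiter: at most `limit` broadcasts per `window_s`."""

    def __init__(self, limit: int = 8, window_s: float = 9.0):
        self.limit = limit
        self.window_s = window_s
        self._sent = deque()  # timestamps of accepted broadcasts

    def try_send(self, now_s: float) -> bool:
        # Forget broadcasts that have aged out of the window.
        while self._sent and now_s - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) >= self.limit:
            return False  # budget exhausted, this broadcast is dropped
        self._sent.append(now_s)
        return True


if __name__ == "__main__":
    budget = BroadcastBudget()
    # Ten group commands half a second apart, then two route discoveries.
    results = [budget.try_send(i * 0.5) for i in range(10)]
    print(results.count(True), "accepted,", results.count(False), "blocked")
    print("route discovery at t=5.0 s:", budget.try_send(5.0))
    print("route discovery at t=9.0 s:", budget.try_send(9.0))
```

In this model a burst of group commands uses up the budget, so a route discovery sent shortly afterwards is dropped and only succeeds once the oldest broadcast has aged out of the window.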

sjorge commented 1 year ago

@sjorge I think the network have broken routing then the firmware have patched out of standard for broadcast and the routing diecovery is made with broadcast that is being blocked in the IEEE 802.15.4 network layer if getting more then 8 broadcast in 9 seconds in all zigbee routers.

See closed issue https://github.com/Koenkk/Z-Stack-firmware/issues/443

Is unicast working for routers that is in the first level / direct from the coordinator ?

Not much is working and over time it loses connections to more devices.

But that theory is not totally crazy as there is usually one or two messages to a group before. Mostly eliminated groups except for 6 at the moment. They are rather big though. (Came from like 20)

sjorge commented 1 year ago

Going on that theory, I wonder if the new queuing system in zh might make things worse with groups, as it will try more often now, I think... or are groups exempt from it?

I wonder if we can somehow reserve a few of those broadcast slots just for route discovery

MattWestb commented 1 year ago

Do routers that have a direct connection also stop working for broadcasts? Then it is a broadcast storm blocking the 15.4 network. Also, there are two types of broadcast acknowledgement: the old one sends the broadcast 3 times (it only counts as one address slot), and the new one (passive acking) sends it once and listens for the neighbors relaying it; if all of them do, it is OK, and if not, it is resent.

You can also sniff the traffic and see whether broadcasts are working in your network or where they are blocked. Very likely you are getting unicast problems with the same devices for which broadcasts are not working, because route discovery then fails and the routes to those routers time out.

MattWestb commented 1 year ago

By the way, some ZHA users with large networks and EZSP are disabling source routing and getting a more stable network. I think it is route discovery that runs into problems while 15.4 is blocking broadcasts.

sjorge commented 1 year ago

If it happens when I am not working, I can probably take a sniff next time.

MattWestb commented 1 year ago

Some interesting reading on how dense mesh networks work and perform, although it has no details on routing and broadcasts: https://www.silabs.com/documents/login/application-notes/an1138-zigbee-mesh-network-performance.pdf

MattWestb commented 1 year ago

You can also simulate it by making a group of one or more lights and flooding it with commands, sending some unicasts in between to see whether routing and route discovery keep working the whole time.

dlasher commented 1 year ago

Starting to see lockup/crashes every few days.. is there a new release candidate to test?

deviantintegral commented 1 year ago

I just experienced another crash. I had to pull the stick to reset it. There are no obvious errors in the logs other than timeouts. There were two seconds of quiet logs before this, so I don't think any earlier logs are relevant. Let me know if there's anything more I should look for or post.

https://gist.github.com/deviantintegral/60a2fbaa66527ceb460d87678f74a372

What stack are you using? (HA/Z2M)

zigbee2mqtt 1.32.0

How many devices in your network?

101 devices with 60 end devices and 41 routers.

What was the latest known working firmware version?

Good question. I had other more frequent issues with prior firmwares, but I don't think I had complete crashes with the 20220219 or 20221226 releases.

adampetrovic commented 1 year ago

Now starting to see MAC Channel access failure errors.

Zigbee2MQTT:error 2023-07-08 12:50:00: Publish 'set' 'state' to '0x00158d0004768fa1' failed: 'Error: Command 0x00158d0004768fa1/1 
genOnOff.on({}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed 
(Data request failed with error: 'MAC channel access failure' (225))'

aurimasniekis commented 1 year ago

This is my related issue on z2m; I have been getting almost the same errors everyone has described in the last comments: https://github.com/Koenkk/zigbee2mqtt/issues/18196

Koenkk commented 1 year ago

'MAC channel access failure' (225) means there is interference; this cannot be solved from the firmware. See https://www.zigbee2mqtt.io/guide/faq/#common-error-codes
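
The logs in this thread mix decimal codes such as 225 with hex codes such as 0xc7, which makes them harder to compare. The lookup below is a small sketch that contains only the codes quoted in this thread; it is not a complete table, and the helper name is hypothetical.

```python
# Status codes exactly as they appear in the logs quoted in this thread.
STATUS_NAMES = {
    0x00: "SUCCESS",
    0xC7: "NWK_TABLE_FULL",
    205: "No network route",
    225: "MAC channel access failure",
    233: "MAC no ack",
}


def describe(code: int) -> str:
    """Render a status code in decimal and hex with its name, if known here."""
    name = STATUS_NAMES.get(code, "unknown (not quoted in this thread)")
    return f"{code} (0x{code:02x}): {name}"


if __name__ == "__main__":
    for code in (225, 205, 233, 0xC7):
        print(describe(code))
```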

aurimasniekis commented 1 year ago

MAC channel access failure' (225) means there is interference, this cannot be solved from the firmware. See https://www.zigbee2mqtt.io/guide/faq/#common-error-codes

I was also getting this, but I had never seen it before this version. Nothing has changed, and there are almost no Wi-Fi signals at the coordinator's location, as everything is still under construction. So I feel something within the firmware triggered these errors.

rogerjak commented 1 year ago

Sorry, but this is much worse than 20221226. My end devices (Aqara/Hue PIRs) miss events all the time. Nothing changed but the firmware (using ZHA and Sonoff Plus).

pannal commented 1 year ago

Just to counter the negative: I've been running 20230507 for weeks and haven't seen any major issues (zzh!).

aleskovets commented 1 year ago

Pretty much destroyed a network of primarily Aqara + Sonoff devices. Huge delivery delays, missing routes and MAC access issues.

UPD: rollback to 20221226 fixes it, so definitely attributed to update

ahd71 commented 1 year ago

Same here as @pannal. I have 200+ devices (zzh stick and this firmware). Rock solid for me, on both 20230507 and 20221226 (I didn't try 20230410).

aurimasniekis commented 1 year ago

having 200+ devices (zzh stick

Isn't CC2652R limited to a maximum of 200 devices in the best conditions?

Koenkk commented 1 year ago

@aurimasniekis there is only a limit on the number of Zigbee 3.0 devices that can join; you can theoretically join an endless number of 1.2 devices.

MattWestb commented 1 year ago

For Zigbee 3 devices, the limit is the coordinator's storage for trust center link keys, and I don't know how that is configured in the firmware.

Older non-Zigbee 3 devices such as ZLL and HA 1.x don't use TC link keys, so the maximum is 64K devices.

EZSP also has 64K as the maximum with Zigbee 3 devices, because Z2M then uses hashed link keys: no key is stored for every device; the coordinator only stores the hash key seed in the NVM.

propi62 commented 1 year ago

I had a lot of problems with the latest updates after I changed the transmit power to 20 dBm. Then I changed the transmit power back to the standard 9 dBm, and everything is working very well again. (The problem: the stick is screaming at the devices, but the devices can't send back directly to the coordinator.) Maybe it's a solution for someone's problems.
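
To put the two transmit power settings into perspective, dBm values can be converted to milliwatts with the standard relation P(mW) = 10^(dBm/10). The sketch below only does that conversion; it says nothing about what the end devices themselves transmit at.

```python
def dbm_to_mw(dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10)


if __name__ == "__main__":
    high, default = 20.0, 9.0
    print(f"{high:.0f} dBm = {dbm_to_mw(high):.1f} mW")
    print(f"{default:.0f} dBm = {dbm_to_mw(default):.1f} mW")
    # A difference in dB corresponds to a power ratio.
    print(f"ratio: {dbm_to_mw(high - default):.1f}x")
```

The 11 dB gap means the coordinator at 20 dBm radiates roughly 12.6 times the power it does at 9 dBm, which fits the description of a stick that devices can hear but cannot answer directly.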

johnlento commented 1 year ago

So I have two zbdongles as routers at 9 dBm but don't even know if I can change the coordinator power. The routers are on an older firmware as I didn't see a 0507 router.

I do notice that most of my Sengled lights drop off after a few days to a week. I've had luck power cycling them via an automation that detects when they go offline, and it keeps them running. It's almost like some devices go into a death spiral and contest the network till they are rebooted.


Claudio1L commented 1 year ago

One day after upgrading z2m to 1.32.1, I started seeing many devices go offline with NWK_TABLE_FULL on a network of 59 devices. I need to reboot and re-pair some of them. I have never seen this before.

https://github.com/Koenkk/zigbee2mqtt/issues/18287

marc-gist commented 1 year ago

Pretty much destroyed a network of primarily Aqara + Sonoff devices. Huge delivery delays, missing routes and Mac access issues.

UPD: rollback to 20221226 fixes it, so definitely attributed to update

I'm seeing a ton of issues with my network as well: lots of "failed to configure" errors in the log, even though these are not new devices. Rolling back the firmware to 20221226 as well, to hopefully fix the slow/non-responsive network :(

l-mb commented 1 year ago

Running CC1352P2_CC2652P_launchpad_coordinator_20230507 on my Sonoff_Zigbee_3.0_USB_Dongle_Plus with zigbee2mqtt (latest as of today). 52 devices in total (not all of them always on).

It works, but I see significantly increased latency (several seconds) to the point of making the network unusable.

2023-07-11 22:15:03 Publish 'set' 'brightness' to '0x00XXXXXXXXX' failed: 'Error: Command 0x00XXXXXXX/11 genLevelCtrl.moveToLevelWithOnOff({"level":254,"transtime":10}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Timeout - 47713 - 11 - 210 - 8 - 11 after 10000ms)'

Is a typical error.

Was running 20221226 before and am considering downgrading.

marc-gist commented 1 year ago

Running CC1352P2_CC2652P_launchpad_coordinator_20230507 on my Sonoff_Zigbee_3.0_USB_Dongle_Plus with zigbee2mqtt (latest as of today). 52 device total (not all of them always on.)

It works, but I see significantly increased latency (several seconds) to the point of making the network unusable.

2023-07-11 22:15:03 Publish 'set' 'brightness' to '0x00XXXXXXXXX' failed: 'Error: Command 0x00XXXXXXX/11 genLevelCtrl.moveToLevelWithOnOff({"level":254,"transtime":10}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Timeout - 47713 - 11 - 210 - 8 - 11 after 10000ms)'

Is a typical error.

Was running 20221226 before and am considering to downgrade.

Roll back... instantly fixed my network delay/freezing issue!

cloudbr34k84 commented 1 year ago

Well, I had my first disaster on this firmware. I went to add another device, and for some reason it completely broke:

info  2023-07-13 17:34:20: Logging to console and directory: '/config/zigbee2mqtt/log/2023-07-13.17-34-20' filename: log.txt
info  2023-07-13 17:34:20: Starting Zigbee2MQTT version 1.32.0-dev (commit #08e82e4)
info  2023-07-13 17:34:20: Starting zigbee-herdsman (0.16.0)
error 2023-07-13 17:36:00: Error while starting zigbee-herdsman
error 2023-07-13 17:36:00: Failed to start zigbee
error 2023-07-13 17:36:00: Check https://www.zigbee2mqtt.io/guide/installation/20_zigbee2mqtt-fails-to-start.html for possible solutions
error 2023-07-13 17:36:00: Exiting...
error 2023-07-13 17:36:00: Error: SRSP - ZDO - startupFromApp after 40000ms
    at Timeout._onTimeout (/app/node_modules/zigbee-herdsman/src/utils/waitress.ts:64:35)
    at listOnTimeout (node:internal/timers:559:17)
    at processTimers (node:internal/timers:502:7)

I have had these in the past, and I can usually just restart my PoE ZigStar, but it was not having any of it. I'm reflashing to see if that helps. It just does not want to connect anymore.

2023-07-13T07:50:31.434Z zigbee-herdsman:adapter:zStack:unpi:parser <-- [254,1,97,29,0,125]
2023-07-13T07:50:31.434Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,1,97,29,0,125]
2023-07-13T07:50:31.434Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 1 - 3 - 1 - 29 - [0] - 125
2023-07-13T07:50:31.434Z zigbee-herdsman:adapter:zStack:znp:SRSP <-- SYS - osalNvWriteExt - {"status":0}
2023-07-13T07:50:31.434Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
2023-07-13T07:50:31.434Z zigbee-herdsman:adapter:zStack:startup:commissioning giving adapter some time to settle
2023-07-13T07:50:32.436Z zigbee-herdsman:adapter:zStack:startup adapter reset requested
2023-07-13T07:50:32.436Z zigbee-herdsman:adapter:zStack:znp:AREQ --> SYS - resetReq - {"type":1}
2023-07-13T07:50:32.436Z zigbee-herdsman:adapter:zStack:unpi:writer --> frame [254,1,65,0,1,65]
2023-07-13T07:50:34.429Z zigbee-herdsman:adapter:zStack:unpi:parser <-- [254,6,65,128,0,2,1,2,7,1,192]
2023-07-13T07:50:34.429Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,6,65,128,0,2,1,2,7,1,192]
2023-07-13T07:50:34.429Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 6 - 2 - 1 - 128 - [0,2,1,2,7,1] - 192
2023-07-13T07:50:34.430Z zigbee-herdsman:adapter:zStack:znp:AREQ <-- SYS - resetInd - {"reason":0,"transportrev":2,"productid":1,"majorrel":2,"minorrel":7,"hwrev":1}
2023-07-13T07:50:34.430Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
2023-07-13T07:50:34.430Z zigbee-herdsman:adapter:zStack:startup adapter reset successful
2023-07-13T07:50:34.431Z zigbee-herdsman:adapter:zStack:znp:SREQ --> UTIL - getDeviceInfo - {}
2023-07-13T07:50:34.431Z zigbee-herdsman:adapter:zStack:unpi:writer --> frame [254,0,39,0,39]
2023-07-13T07:50:34.453Z zigbee-herdsman:adapter:zStack:unpi:parser <-- [254,14,103,0,0,32,212,108,37,0,75,18,0,254,255,7,0,0,139]
2023-07-13T07:50:34.453Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,14,103,0,0,32,212,108,37,0,75,18,0,254,255,7,0,0,139]
2023-07-13T07:50:34.453Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 14 - 3 - 7 - 0 - [0,32,212,108,37,0,75,18,0,254,255,7,0,0] - 139
2023-07-13T07:50:34.454Z zigbee-herdsman:adapter:zStack:znp:SRSP <-- UTIL - getDeviceInfo - {"status":0,"ieeeaddr":"0x00124b00256cd420","shortaddr":65534,"devicetype":7,"devicestate":0,"numassocdevices":0,"assocdeviceslist":[]}
2023-07-13T07:50:34.455Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
2023-07-13T07:50:34.455Z zigbee-herdsman:adapter:zStack:startup starting adapter as coordinator
2023-07-13T07:50:34.455Z zigbee-herdsman:adapter:zStack:znp:SREQ --> ZDO - startupFromApp - {"startdelay":100}
2023-07-13T07:50:34.455Z zigbee-herdsman:adapter:zStack:unpi:writer --> frame [254,2,37,64,100,0,3]
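
The raw frames in this debug log can be decoded by hand. The sketch below infers the layout by matching each frame against its `parsed` line (length, type, subsystem, command, data, FCS) and treats the FCS as the XOR of every byte between the start byte and the FCS, which holds for the frames shown here. The function name is hypothetical and this is not the zigbee-herdsman parser.

```python
from functools import reduce

SOF = 0xFE  # start-of-frame byte at the head of every logged frame


def parse_unpi(frame):
    """Split one logged frame into the fields shown on the 'parsed' lines."""
    if len(frame) < 5 or frame[0] != SOF:
        raise ValueError("not a complete frame")
    length, cmd0, cmd1 = frame[1], frame[2], frame[3]
    if len(frame) != length + 5:
        raise ValueError("length byte does not match frame size")
    data, fcs = frame[4:4 + length], frame[-1]
    # Frame check sequence: XOR of every byte between the SOF and the FCS.
    expected = reduce(lambda a, b: a ^ b, frame[1:-1], 0)
    return {
        "length": length,
        "type": cmd0 >> 5,         # upper 3 bits: 3 for the SRSP lines above
        "subsystem": cmd0 & 0x1F,  # lower 5 bits: 1 on the SYS lines, 7 on UTIL
        "command": cmd1,
        "data": list(data),
        "fcs": fcs,
        "fcs_ok": fcs == expected,
    }


if __name__ == "__main__":
    # First frame of the log above; its 'parsed' line reads 1 - 3 - 1 - 29 - [0] - 125.
    print(parse_unpi([254, 1, 97, 29, 0, 125]))
    # Last frame of the log: the startupFromApp request written to the adapter.
    print(parse_unpi([254, 2, 37, 64, 100, 0, 3]))
```

Decoding frames this way makes it easier to check that the log ends with the `startupFromApp` request and that no reply frame follows it.
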
joschaschultze commented 1 year ago

Just rolled back to 20221226 because I had big problems with Aqara devices losing connection all the time. I never experienced this behaviour on 20221226, only with 20230507.

reyhard commented 1 year ago

Same as the people above: after a few days of using it in a rather moderate network (50 devices, a couple of Aqara ones too), I usually end up in a state where the coordinator no longer works and I have to replug it to get it working again. I reverted to 20221226 a few days ago and the problem seems to be gone.