Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0

Lost routes with new firmware #1536

Closed jasimancas closed 5 years ago

jasimancas commented 5 years ago

Hi! I have Z2M 1.4.0 and updated my CC2531 coordinator with the new firmware. Everything was fine for the first few hours, but as the day progressed devices started to stop working. The log tells me that there are no routes, and when I open the network map I find this: image

After reflashing the old firmware (max_stability), I had 0 problems over the same period: image

Any ideas? I reflashed two coordinators with the same result.

Bojan023 commented 5 years ago

I encounter the same problems. No mesh network can be created somehow

Imperial-Guard commented 5 years ago

Any idea where I can download the older stable version? Got the same issue here

chris-jennings commented 5 years ago

1.4 is causing me problems as well. I can't form a mesh and when I re-pair devices it knocks out other devices because only the coordinator has active routes. I will re-flash with the older firmware to see if I get back normal functionality tonight.

Koenkk commented 5 years ago

Can you post the log from when you get these 205 network errors?

Koenkk commented 5 years ago

For users wanting to downgrade, older firmwares can be found here: https://github.com/Koenkk/Z-Stack-firmware/tree/c36ba37ebb10690245ba29d1360a21b5abfa9c43/coordinator/Z-Stack_Home_1.2/max_stability

jasimancas commented 5 years ago

This is part of the log; I have everything in one txt file (good firmware and bad firmware):

5/15/2019, 10:36:50 PM - error: Failed to setup reporting for 0xd0cf5efffe1d7098 - genOnOff - 1 - (Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.)
5/15/2019, 10:36:50 PM - error: Failed to setup reporting for 0x90fd9ffffeee1239 - genLevelCtrl - 1 - (Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.)
5/15/2019, 10:36:51 PM - error: Failed to setup reporting for 0x90fd9ffffeee1239 - lightingColorCtrl - 3 - (Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.)
5/15/2019, 10:36:51 PM - error: Failed to setup reporting for 0x90fd9ffffed9bc1e - genOnOff - 1 - (Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.)
5/15/2019, 10:36:51 PM - error: Failed to setup reporting for 0xd0cf5efffe1d7098 - genLevelCtrl - 1 - (Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.)
5/15/2019, 10:36:52 PM - warn: Failed to configure Mando Salon (0xd0cf5efffe18c94b) ('Error: Timed out after 10000 ms') (attempt #1)
5/15/2019, 10:36:52 PM - warn: This can be ignored if the device is working properly
5/15/2019, 10:36:52 PM - error: Failed to setup reporting for 0xd0cf5efffe1d7098 - lightingColorCtrl - 3 - (Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.)
5/15/2019, 10:36:52 PM - error: Failed to setup reporting for 0x90fd9ffffed9bc1e - genLevelCtrl - 1 - (Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.)
5/15/2019, 10:36:53 PM - error: Failed to setup reporting for 0x90fd9ffffed9bc1e - lightingColorCtrl - 3 - (Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.)
5/15/2019, 10:36:53 PM - error: Failed to setup reporting for 0x90fd9ffffef807de - genOnOff - 1 - (Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.)
5/15/2019, 10:36:53 PM - error: Failed to setup reporting for 0x90fd9ffffef807de - genLevelCtrl - 1 - (Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.)
5/15/2019, 10:36:54 PM - error: Failed to setup reporting for 0x90fd9ffffef807de - lightingColorCtrl - 3 - (Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.)

At the moment I have 0 problems with the older firmware. Another thing I noticed with the new firmware is that responses were slower: after a button press, the bulb took almost 1 second to respond.

Imperial-Guard commented 5 years ago

205 issues here:

  zigbee2mqtt:error 5/16/2019, 6:54:30 PM Zigbee publish to device '0xd0cf5efffe2ed08a', genOnOff - on - {} - {"manufSpec":0,"disDefaultRsp":0} - null failed with error Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.
  zigbee2mqtt:error 5/16/2019, 8:21:44 PM Zigbee publish to device '0xd0cf5efffecb89ad', genLevelCtrl - moveToLevelWithOnOff - {"level":254,"transtime":0} - {"manufSpec":0,"disDefaultRsp":0} - null failed with error Error: AF data request fails, status code: 233. MAC no ack.
  zigbee2mqtt:error 5/16/2019, 9:15:02 PM Zigbee publish to device '0x90fd9ffffe010c68', genOnOff - off - {} - {"manufSpec":0,"disDefaultRsp":0} - null failed with error Error: AF data request fails, status code: 233. MAC no ack.
  zigbee2mqtt:error 5/16/2019, 9:15:02 PM Zigbee publish to device '0x00158d0001922ada', hvacThermostat - write - [{"attrId":16387,"dataType":41,"attrData":1600}] - {"manufSpec":1,"manufCode":4151} - null failed with error Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.
  zigbee2mqtt:error 5/17/2019, 4:05:30 AM Zigbee publish to device '0xd0cf5efffe2ed08a', genOnOff - off - {} - {"manufSpec":0,"disDefaultRsp":0} - null failed with error Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.
  zigbee2mqtt:error 5/17/2019, 4:53:04 AM Zigbee publish to device '0x90fd9ffffe010c68', genOnOff - on - {} - {"manufSpec":0,"disDefaultRsp":0} - null failed with error Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.
  zigbee2mqtt:error 5/17/2019, 4:53:31 AM Zigbee publish to device '0xd0cf5efffe2ed08a', genOnOff - off - {} - {"manufSpec":0,"disDefaultRsp":0} - null failed with error Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.
  zigbee2mqtt:error 5/17/2019, 5:14:33 AM Zigbee publish to device '0xd0cf5efffecb89ad', genLevelCtrl - moveToLevelWithOnOff - {"level":254,"transtime":0} - {"manufSpec":0,"disDefaultRsp":0} - null failed with error Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.
  zigbee2mqtt:error 5/17/2019, 5:20:39 AM Zigbee publish to device '0xd0cf5efffe2ed08a', genOnOff - off - {} - {"manufSpec":0,"disDefaultRsp":0} - null failed with error Error: AF data request fails, status code: 205. No network route. Please confirm that the device has (re)joined the network.
  zigbee2mqtt:error 5/17/2019, 5:27:43 AM Zigbee publish to device '0x90fd9ffffe010c68', genOnOff - off - {} - {"manufSpec":0,"disDefaultRsp":0} - null failed with error Error: AF data request fails, status code: 233. MAC no ack.
rancher@RancherOS:~$
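
To see which devices fail and with which status code, log excerpts like the two posted in this thread can be tallied with a short script. This is only a sketch: the two regular expressions are assumptions derived from the lines quoted here (with terminal line wraps already removed), not from any documented zigbee2mqtt log format, and the status descriptions are copied from the messages themselves.

```python
import re
from collections import Counter

# Descriptions taken verbatim from the error messages quoted in this thread.
STATUS_TEXT = {205: "No network route", 233: "MAC no ack"}

# Assumed line shapes, based only on the excerpts in this issue.
PATTERNS = (
    re.compile(r"Failed to setup reporting for (0x[0-9a-f]{16}) - .*?status code: (\d+)\."),
    re.compile(r"Zigbee publish to device '(0x[0-9a-f]{16})'.*?status code: (\d+)\."),
)


def tally_failures(log_text):
    """Count failures per (device address, status code) in a log excerpt."""
    tally = Counter()
    for pattern in PATTERNS:
        for device, code in pattern.findall(log_text):
            tally[(device, int(code))] += 1
    return tally


def describe(code):
    """Return the message text seen in the logs for a status code."""
    return STATUS_TEXT.get(code, "unknown status")
```

Feeding the whole log file to `tally_failures` and printing `tally.most_common()` gives a quick view of whether the 205 errors are spread over all devices or concentrated on a few.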
Koenkk commented 5 years ago

Does this happen when sending a burst of commands?

jasimancas commented 5 years ago

Press button -> Turn On light (fail)
Press button -> Turn Off light (fail)
Open door -> Turn On light (fail)
Nothing more, only this: simple, basic commands.

Imperial-Guard commented 5 years ago

Does this happen when sending a burst of commands?

No, at most 2 or 3 devices.

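Koenkk commented 5 years ago

  • So it even fails when just controlling a single bulb?
  • Did commands to the same bulb succeed before (with new firmware)?

Imperial-Guard commented 5 years ago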
  • So it even fails when just controlling a single bulb?
  • Did commands to the same bulb succeed before (with new firmware)?

Yes and Yes :)

jasimancas commented 5 years ago
  • So it even fails when just controlling a single bulb?
  • Did commands to the same bulb succeed before (with new firmware)?

Yes and yes. I didn't touch anything in Home Assistant; I only updated Z2M and the firmware. Same devices, same platform, same commands, etc.

Koenkk commented 5 years ago

Can you try to re-pair the specific bulbs?

Background information: This probably happens because source routing has been disabled in order to fix #1408. That means we are now in a deadlock situation (we have either one issue or the other). Therefore I would like to check whether we can work around this.

jasimancas commented 5 years ago

I have 15 bulbs, all with this problem, and it's random. I re-paired one of them and the problem continues. The strangest thing is the difference in the connection maps for the CC2530_2591 routers, which are the backbone of the house: with the old firmware all devices connect through them, but with the new firmware they do not (as shown in the images).

chris-jennings commented 5 years ago

I find that Hue bulbs seem to have a big impact on the mesh routing, but this always seems to have been an issue (even with the old firmware). Obviously with the new firmware and source routing disabled if the bulbs are not routing correctly then the mesh stops working. I have enough other devices (DIY router, Xiaomi plugs) to not let the bulbs act as routers. Is there any way to disable the Hue bulbs from being routers?

Smiggel commented 5 years ago

Same issue here. I upgraded to Z2M 1.4.0 tonight and started to notice that my nodes are dropping from the network. I did not flash my nodes. I already run the latest stable.

I use Domoticz. Zigbee network was really stable.

Koenkk commented 5 years ago

@Smiggel what firmware version are you running?

Smiggel commented 5 years ago

@Koenkk My coordinator is running version 20190223 and my routers are running version 2018_09 (CC2531 sticks). That is stable with Z2M 1.1.1. When I upgrade to 1.4.0, the routers drop off the network after less than 30 minutes.

Koenkk commented 5 years ago

@Smiggel what do you see in the log? How do you know they drop out of the network?

Smiggel commented 5 years ago

@Koenkk When I watch the logs, I see the link quality drop from 50+ to zero. And after a while the nodes don’t show up in the logs. Also, the sensors downstairs (the coordinator is on the second floor) are no longer visible in the logs. I need the routers to get the signal to the top floor.

glentakahashi commented 5 years ago

After upgrading to the latest firmware I'm seeing the same issue. The bulbs are still in the network: if I add them to a group and use that to power my bulbs, I can still control them, even though trying to ping the bulbs directly throws the 205 error that others see.

Koenkk commented 5 years ago

Seeing that disabling source routing does more harm than good, I've enabled it again. CC2531 firmware can be found here: https://github.com/Koenkk/Z-Stack-firmware/blob/dev/coordinator/Z-Stack_Home_1.2/bin/CC2531_20190523.zip (it still allows for 25 direct children). Can you guys confirm it's as stable as the max stability firmware?

Smiggel commented 5 years ago

Seeing that the disabling source routing does more harm than good, I've enabled it again. CC2531 firmware can be found here: https://github.com/Koenkk/Z-Stack-firmware/blob/dev/coordinator/Z-Stack_Home_1.2/bin/CC2531_20190523.zip (it still allows for 25 direct children). Can you guys confirm it's as stable as the max stability firmware?

I flashed a CC2531 stick with this firmware. At first I see my closest router show up with a link quality of 47. Then it drops to 45 and the next time it shows up in the logs it is at 0. I have a Xiaomi temperature/humidity/bar sensor about 60 cm away from my coordinator and even that sensor shows up with a link quality of 0. Matter of fact, all sensors, except for two Xiaomi PIR sensors, have a link quality of 0 with the new firmware.

With my older firmware (CC2531ZNP-Prod_20190223) the link quality of my closest router (one floor down) is mostly about 40.

I run Zigbee2mqtt version 1.1.1 (commit #e40a3ba) on a Pi 3B+ with Domoticz.

EDIT: Switched back to my older stick with the older firmware and see the link quality is back to normal. Closest router is a 52 now.
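
The pattern described here, a link quality that starts at a normal value and then falls to 0, can be checked for automatically. The following is a minimal sketch that assumes the readings have already been collected as `(device, linkquality)` pairs in arrival order; the device names and the function name `find_dropped` are hypothetical, and how the pairs are gathered (for example from the log) is left out.

```python
def find_dropped(readings):
    """Return devices whose latest link quality is 0 after a non-zero reading.

    readings: iterable of (device_name, linkquality) pairs in arrival order.
    """
    best = {}  # highest link quality seen per device
    last = {}  # most recent link quality per device
    for device, linkquality in readings:
        best[device] = max(best.get(device, 0), linkquality)
        last[device] = linkquality
    return sorted(d for d in last if last[d] == 0 and best[d] > 0)


# Example values follow the sequence reported above (47, then 45, then 0).
readings = [
    ("router_downstairs", 47),
    ("router_downstairs", 45),
    ("pir_hall", 30),
    ("router_downstairs", 0),
]
print(find_dropped(readings))
```

A device that has only ever reported 0 is deliberately not flagged, since it never showed a working link in the observed window.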

MrLight commented 5 years ago

Same for me. I currently have no log available. But after flashing the coordinator with the latest firmware (and with the one from this conversation), the network graph shows good routes (two-way + mesh) for some minutes after a restart and then drops all routes over time. It shows only one-way connections from the routers to the coordinator, but no connections from the coordinator to the routers...
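
One-way connections like these can be picked out of a map programmatically. The sketch below assumes the map has been reduced to directed `(source, target)` pairs; that representation and the node names are illustrative assumptions, not the format zigbee2mqtt actually emits for its network map.

```python
def one_way_links(links):
    """Return the directed links that have no matching reverse link."""
    link_set = set(links)
    return sorted((s, t) for s, t in link_set if (t, s) not in link_set)


links = [
    ("router_a", "coordinator"),   # router reports the coordinator...
    ("router_b", "coordinator"),
    ("coordinator", "router_b"),   # ...but only router_b is reported back
]
print(one_way_links(links))
```

If every router-to-coordinator link shows up in the result, the map matches the symptom described above: routers still see the coordinator, but the coordinator holds no route back to them.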

Koenkk commented 5 years ago

@MrLight @Smiggel besides the networkmap, is controlling devices working reliably?

Smiggel commented 5 years ago

@MrLight @Smiggel besides the networkmap, is controlling devices working reliable?

In my case not all sensors respond. My routers are not visible. That could be the reason why not all pir and contact sensors respond, because of the distance.

james-fry commented 5 years ago

I seem to be having this issue and it's not related to the latest firmware AFAIK. Pretty sure this was happening before 1.4.0, too.

This is my network when all working OK: good map

This is the network when all routers are dropped: broken map

Once this happens, any nodes beyond the number supported directly by the coordinator are dropped from the network.

Restarting Z2M seems to resolve it.

james-fry commented 5 years ago
Koenkk commented 5 years ago

@james-fry in that case you are not experiencing this issue but #652

james-fry commented 5 years ago

My issue is not just a problem with the map. Because the routers stop functioning, a number of my sensors become unavailable. I think the map is accurately reflecting that the routers are dropped from the network and some endpoints are disconnected.

Fabiancrg commented 5 years ago

I am also running Z2M v1.4.0 (commit #0074c32) with coordinator firmware version '20190425'. It's running without any issues, so why would this be linked to source routing?

Koenkk commented 5 years ago

@Fabiancrg it depends on the setup I guess

abmantis commented 5 years ago

I'm also having the same behavior. Some devices stopped working so I checked the network map and all the indirect routes to the coordinator were gone. Only direct connections to the coordinator remained. Restarting z2m restored normal behavior again. Coordinator firmware version: '20190223' Using the dev branch, commit 825f096434a0508e5a97045cca23e5864a88dc45

MrLight commented 5 years ago

Update from my side: I'm using the current CC2531 coordinator firmware and have reflashed my two routers once again (CC2530, same firmware as before; not sure if this is important). I'm using the latest master branch. Yesterday evening all my routers were connected to the mesh and communication was up and running. In the morning it looked like the mesh was broken and all devices were assigned to the coordinator. I then moved one device to a location where the coordinator was out of range but one router was reachable. After forcing some transmissions from the device, the complete mesh was rebuilt, or at least visible again in the networkmap. So I'm not 100% sure whether I really have a problem or whether it is only the map... My current feeling is that the direction from device to coordinator works, but the direction from coordinator to device does not work when the mesh is broken... But I have to double-check that.

Koenkk commented 5 years ago

Note that this issue is not about things not showing in the network map. It's about errors after flashing 20190425 firmware (like https://github.com/Koenkk/zigbee2mqtt/issues/1536#issuecomment-493335557). For issues in the networkmap please refer to https://github.com/Koenkk/zigbee2mqtt/issues/652 to avoid cluttering of this issue.

Smiggel commented 5 years ago

OK, I just gave it another try. I stopped Zigbee2mqtt and switched sticks. I then started Zigbee2mqtt and, after it started, did a restart. My nodes are now still visible and have a link quality of 50+. My routers are still on the 3-month-old version. I did upgrade Zigbee2mqtt to 1.4. That runs fine now too, whereas before it would not start.

I will keep monitoring the status of my network for the rest of the evening and night. Hopefully the restart fixed it.

[edit] After the restart it’s still stable. Looks all good now.

MrLight commented 5 years ago

Update from my side: System is stable with latest coordinator firmware and latest master branch. Thank you very much!

Koenkk commented 5 years ago

@MrLight what version number of the firmware?

MrLight commented 5 years ago

coordinator: 20190523 zigbee2mqtt 1.4.0 | commit: 927c4db

miroslavpetrov commented 5 years ago

I also updated zigbee2mqtt to 1.4 and flashed my CC2531 coordinator with 3.0 firmware 20190425. The link quality of my sensors went from around 80 to 30 and no traffic was forwarded from the CC2530 routers that I have. I went back to 1.2 firmware 20190425 and the routers started to work again, but the problem with the link quality is still here. Two of the Xiaomi temperature and humidity sensors are unable to send data to the coordinator. If I press the button on the sensor a few times to send data manually, it works after the 5th or 6th press.

Smiggel commented 5 years ago

@miroslavpetrov Did you try restarting Zigbee2mqtt? Helped most of us, including me.

miroslavpetrov commented 5 years ago

@Smiggel I also did complete reinstall and added the nodes again.

Ton1965 commented 5 years ago

@miroslavpetrov,

Could it be your zigbee channel changed after re-installation? I sometimes have intermittent problems with Xiaomi sensors; link quality just drops significantly. This can last for a few hours then everything is good again. I attribute this to some external interference. My guess is that you might have inadvertently changed the channel and got higher interference with WiFi or some other source. It's just a guess though.

UnrealKazu commented 5 years ago

After upgrading to 1.4 and firmware 20190425, I encountered the same 205 errors as jasimancas had. My sensor end devices were still operating, but the routers were not. When I flashed the 20190523 firmware from this thread, the routers still did not function, but as soon as I restarted zigbee2mqtt with pairing enabled, the routers repaired themselves and started working again.

I've been running a stable network for two days now. Not sure if my problem is directly related to jasimancas', but maybe this piece of information has some value.

enomam commented 5 years ago

Is everyone just upgrading their coordinator firmwares, or router firmware too?

Smiggel commented 5 years ago

Is everyone just upgrading their coordinator firmwares, or router firmware too?

The problem for me happened after upgrading only the coordinator firmware. The router firmware is working fine and does not need an upgrade.

UnrealKazu commented 5 years ago

Is everyone just upgrading their coordinator firmwares, or router firmware too?

I do not have custom routers, only third party routers (TRADFRI stuff). So I only upgraded the coordinator.

vogler commented 5 years ago

@Koenkk Could you please also update the CC2530 coordinator firmware? Can it only be built using Windows?

Smiggel commented 5 years ago

Perhaps useful: I paired all my Zigbee devices to a Home Assistant installation. The routers show up just fine, without a link quality of 0, and I don’t have to restart Zigbee2mqtt.