Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0

Does the zigbee network map work correctly? #652

Closed mihalski closed 4 years ago

mihalski commented 5 years ago

Does the zigbee network map work as it should? Here is mine:

digraph G {
  node[shape=record];
  "0x00124b001202328b" [label="{0x00124b001202328b|Coordinator|No model information available|online}"];
  "0x00158d000123df04" [label="{study_button|EndDevice|Xiaomi Aqara wireless switch (WXKG11LM)|online}"];
  "0x00158d000200d0c4" [label="{bedroom_motion|EndDevice|Xiaomi Aqara human body movement and illuminance sensor (RTCGQ11LM)|online}"];
  "0x00158d00017206c0" [label="{office_door|EndDevice|Xiaomi MiJia door & window contact sensor (MCCGQ01LM)|online}"];
  "0x00158d0001720704" [label="{bedroom_door|EndDevice|Xiaomi MiJia door & window contact sensor (MCCGQ01LM)|online}"];
  "0x00158d0002476ba9" [label="{bedroom_temp|EndDevice|Xiaomi Aqara temperature, humidity and pressure sensor (WSDCGQ11LM)|online}"];
  "0x00158d0002476ba9" -> "0x00124b001202328b" [label="94"]
  "0x00158d00023f5036" [label="{bathroom_temp|EndDevice|Xiaomi Aqara temperature, humidity and pressure sensor (WSDCGQ11LM)|online}"];
  "0x00158d0001872b69" [label="{living_room_button|EndDevice|Xiaomi MiJia wireless switch (WXKG01LM)|online}"];
  "0x00158d0002476b3c" [label="{outside_temp|EndDevice|Xiaomi Aqara temperature, humidity and pressure sensor (WSDCGQ11LM)|online}"];
  "0x00158d0001a668e8" [label="{master_bedroom_button|EndDevice|Xiaomi Aqara wireless switch (WXKG11LM)|online}"];
  "0x00158d0001a668e8" -> "0x00124b001202328b" [label="45"]
  "0x00158d00019dee01" [label="{spare_bedroom_button|EndDevice|Xiaomi MiJia wireless switch (WXKG01LM)|online}"];
  "0x00158d00019dee01" -> "0x00124b001202328b" [label="71"]
  "0x00158d000216085f" [label="{master_bedroom_fan|Router|Xiaomi Mi power plug ZigBee (ZNCZ02LM)|online}"];
  "0x00158d000216085f" -> "0x00124b001202328b" [label="62"]
  "0x00158d0001148b7e" [label="{office_cube|EndDevice|Xiaomi Mi smart home cube (MFKZQ01LM)|online}"];
  "0x00158d0002437899" [label="{office_temp|EndDevice|Xiaomi Aqara temperature, humidity and pressure sensor (WSDCGQ11LM)|online}"];
  "0x00158d00015da618" [label="{master_bedroom_motion|EndDevice|Xiaomi MiJia human body movement sensor (RTCGQ01LM)|online}"];
  "0x00158d00015da618" -> "0x00124b001202328b" [label="72"]
  "0x00158d0001fa6453" [label="{spare_bedroom_booklight|Router|Xiaomi Mi power plug ZigBee (ZNCZ02LM)|online}"];
  "0x00158d0001fa6453" -> "0x00124b001202328b" [label="69"]
  "0x00158d000204658a" [label="{master_bedroom_door|EndDevice|Xiaomi Aqara door & window contact sensor (MCCGQ11LM)|online}"];
  "0x00158d000200e303" [label="{office_motion|EndDevice|Xiaomi Aqara human body movement and illuminance sensor (RTCGQ11LM)|online}"];
  "0x00158d00015da4c3" [label="{spare_bedroom_motion|EndDevice|Xiaomi MiJia human body movement sensor (RTCGQ01LM)|online}"];
  "0x00158d00015da604" [label="{stairs_motion|EndDevice|Xiaomi MiJia human body movement sensor (RTCGQ01LM)|online}"];
  "0x00158d0001a2b36b" [label="{office_storage_motion|EndDevice|Xiaomi MiJia human body movement sensor (RTCGQ01LM)|online}"];
  "0x00158d00026d387e" [label="{study_backlight|Router|Xiaomi Mi power plug ZigBee (ZNCZ02LM)|online}"];
  "0x00158d00026d387e" -> "0x00124b001202328b" [label="7"]
}

I'm confused that the diagram shows no links (with signal quality) for many of the online (and working) Zigbee devices. Additionally, should a router be able to connect to another router if it is closer than the coordinator? In my scenario study_backlight would be much better off connecting via master_bedroom_fan. The link quality of the study_button in the same room as the study_backlight is 68, and I assume it's connecting to the study_backlight router even if the diagram doesn't show that.

Thanks in advance.

Regards, Michal

Koenkk commented 5 years ago

The networkmap is indeed still buggy (but it's not high on my priority list).

mihalski commented 5 years ago

Fair enough, how about router to router connections? Is that a supported feature or is that not part of the spec/implementation?

Koenkk commented 5 years ago

Some router-to-router connections are shown, but note that a device can communicate with multiple other devices (something which is not shown in the networkmap).

mihalski commented 5 years ago

Does that mean an end device can communicate with another end device to relay messages? And does this apply to low power battery powered devices? I'm slowly shifting devices from the Xiaomi gateway to zigbee2mqtt and am hoping this will make the Zigbee network more robust.

Koenkk commented 5 years ago

No, only routers can communicate with other (multiple) routers. End devices are asleep most of the time.

danpowell88 commented 5 years ago

Is there a way to view any of this information and confirm what is happening without using the graph?

Koenkk commented 5 years ago

@danpowell88 the debug version of the router firmware shows which children it has.

danpowell88 commented 5 years ago

Do you have an example of what the output looks like?

milakov commented 5 years ago

I also observe that the generated network map has a number of online end devices which are not connected to anything. At the same time 2 routers (Xiaomi smart plugs) don't have any end devices connected to them. Could it be that end device to router connections are just not shown? Having a correct network map would help to identify issues with connectivity and reliability, which is especially relevant for the CC2531 USB dongle, which seems to have lower signal strength than the Xiaomi gateway.
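One way to list the devices a map leaves unlinked is to scan the Graphviz text itself. This is a minimal sketch, assuming the map has the shape shown in the first post (quoted node ids with a `[label=...]` attribute, and `->` edges). `findUnlinkedNodes` and the sample addresses are hypothetical and not part of zigbee2mqtt.

```javascript
// Hypothetical helper: return node ids that take part in no edge.
function findUnlinkedNodes(dot) {
    const nodes = new Set();
    const linked = new Set();
    // Edge statements look like: "a" -> "b"
    const edgeRe = /"([^"]+)"\s*->\s*"([^"]+)"/g;
    // Node statements look like: "a" [label=...]
    // (edge targets also match, but they end up in `linked` anyway)
    const nodeRe = /"([^"]+)"\s*\[label=/g;
    let m;
    while ((m = edgeRe.exec(dot)) !== null) {
        linked.add(m[1]);
        linked.add(m[2]);
    }
    while ((m = nodeRe.exec(dot)) !== null) {
        nodes.add(m[1]);
    }
    return [...nodes].filter((id) => !linked.has(id));
}

// Made-up sample in the same shape as the map above.
const sampleDot = `digraph G {
  "0xC" [label="{0xC|Coordinator}"];
  "0xR" [label="{plug|Router}"];
  "0xR" -> "0xC" [label="62"]
  "0xE1" [label="{temp|EndDevice}"];
  "0xE1" -> "0xC" [label="94"]
  "0xE2" [label="{button|EndDevice}"];
}`;

console.log(findUnlinkedNodes(sampleDot)); // [ '0xE2' ]
```

An unlinked node in this output only means that no router or coordinator reported it during the scan, not that the device is offline.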

clockbrain commented 5 years ago

The missing links in the network graph are due to a bug in the lqiScan code in zigbee-shepherd. Here https://github.com/Koenkk/zigbee-shepherd/blob/aa1b9496b7aca7a14c76330c5a70a99bb8fa7918/lib/shepherd.js#L440 it looks for duplicate devices by ieee address, but this also has the side effect of limiting links to one per device, i.e. it omits many valid links. A mesh network should have many links per device. The deduplication instead needs to occur for links, using a composite key of the device and its parent. So the code should change to something like this:

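    // Deduplicate per link (device + parent) rather than per device,
    // so a device with several neighbours keeps all of its links.
    var key = ieeeAddr + '|' + parent;
    if (dev && dev.type === 'Router' && !noDuplicate[key]) {
        // Query this router's neighbour table after the earlier requests.
        chain = chain.then(function () {
            return self.lqi(ieeeAddr).then(processResponse(ieeeAddr));
        });
    }
    noDuplicate[key] = devinfo;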

Making this change I get a much richer list of links coming back from lqiScan. A matching change would also need to be done in zigbee2mqtt network map code https://github.com/Koenkk/zigbee2mqtt/blob/9380bbcadf6a361e9c9e621f8175c3c30c2eba9d/lib/extension/networkMap.js#L79 to process the list of links for each device instead of one link per device. So the code should be something like:

    // Emit one edge for every reported link of this device, not just one.
    lqiDevices.forEach((lqiDevice) => {
        if (lqiDevice && lqiDevice.ieeeAddr === device.ieeeAddr) {
            text += `  "${device.ieeeAddr}" -> "${lqiDevice.parent}" [label="${lqiDevice.lqi}"]\n`;
        }
    });

Unfortunately I couldn't actually get the code fully working (my limited Node.js experience got in the way), but this may give someone else the hint to get this working.
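The effect of the two deduplication keys can be seen on a toy list of links. This is a minimal sketch with made-up addresses and LQI values; it reuses the field names from the snippets above (`ieeeAddr`, `parent`, `lqi`), but the helper names are hypothetical and this is not the zigbee-shepherd code.

```javascript
// Made-up link records; not taken from a real scan.
const links = [
    {ieeeAddr: '0xA', parent: '0xC', lqi: 62},
    {ieeeAddr: '0xA', parent: '0xB', lqi: 80},
    {ieeeAddr: '0xA', parent: '0xC', lqi: 62}, // exact repeat of the first link
    {ieeeAddr: '0xB', parent: '0xC', lqi: 69},
];

// Keying on the device alone keeps only one link per device.
function dedupeByDevice(list) {
    const seen = {};
    return list.filter((l) => {
        if (seen[l.ieeeAddr]) return false;
        seen[l.ieeeAddr] = true;
        return true;
    });
}

// Keying on device + parent drops only exact repeats of a link.
function dedupeByLink(list) {
    const seen = {};
    return list.filter((l) => {
        const key = l.ieeeAddr + '|' + l.parent;
        if (seen[key]) return false;
        seen[key] = true;
        return true;
    });
}

// One Graphviz edge per surviving link.
function toDotEdges(list) {
    return list
        .map((l) => `  "${l.ieeeAddr}" -> "${l.parent}" [label="${l.lqi}"]`)
        .join('\n');
}

console.log(dedupeByDevice(links).length); // 2
console.log(dedupeByLink(links).length); // 3
console.log(toDotEdges(dedupeByLink(links)));
```

With the per-device key, the second link of `0xA` (via `0xB`) disappears, which is the kind of missing edge described above.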

Koenkk commented 5 years ago

@clockbrain looks good, could you make a PR? We can further polish the code there.

clockbrain commented 5 years ago

@Koenkk I've made a PR for the change to zigbee-shepherd https://github.com/Koenkk/zigbee-shepherd/pull/9.

With this change I now get 18 links in my raw network map instead of 6 links.

I'm new to GitHub etc. and I don't know how to relate my fork of zigbee-shepherd to my fork of zigbee2mqtt, so I'm stuck trying to get the zigbee2mqtt network graph to show the extra links.

Koenkk commented 5 years ago

@clockbrain thanks, I will take care of the zigbee2mqtt update.

Koenkk commented 5 years ago

@clockbrain merged and updated zigbee2mqtt.

lolorc commented 5 years ago

Here is the result, still have unlinked devices here :) [image: 20190119172445]

bizziebis commented 5 years ago

This morning my network map showed really nice like this:

[image: download]

Then I removed two devices and added them back, and after that my network map looked like this:

[image: download 1]

It has been like this for some hours now. Is there something I can do to improve the network map again? I tried restarting Zigbee2mqtt and removing the dongle. I woke up every device; still no change.

I'm on the latest DEV build.

clockbrain commented 5 years ago

@lolorc regarding the unlinked devices in the map - can you try requesting the map just after triggering those missing end devices. If end devices are asleep when the coordinator issues the lqi scan they may be missed. I also sometimes have unlinked end devices on my map.

@bizziebis does your network map still not work? Make sure you don't have zigbee2mqtt admin panel running as it spams the network with map requests. You could also try requesting a raw network map and counting the returned links but it is more likely that the network is busy during the lqi probe rather than something wrong with the graphviz generation code.
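Counting links can be automated. The format of the raw network map is not shown in this thread, so this minimal sketch counts edges in the Graphviz output instead, grouped by the node each edge points to. `countLinksByParent` and the sample data are hypothetical.

```javascript
// Hypothetical helper: count Graphviz edges in total and per target node.
function countLinksByParent(dot) {
    const counts = {};
    const edgeRe = /"([^"]+)"\s*->\s*"([^"]+)"/g;
    let total = 0;
    let m;
    while ((m = edgeRe.exec(dot)) !== null) {
        counts[m[2]] = (counts[m[2]] || 0) + 1;
        total += 1;
    }
    return {total, counts};
}

// Made-up sample: two links to the coordinator, one to a router.
const sampleMap = `digraph G {
  "0xR" -> "0xC" [label="62"]
  "0xE1" -> "0xC" [label="94"]
  "0xE2" -> "0xR" [label="70"]
}`;

const summary = countLinksByParent(sampleMap);
console.log(summary.total); // 3
console.log(summary.counts['0xC']); // 2
```

Comparing the totals of two map requests taken a few minutes apart gives a quick indication of whether a scan was cut short.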

bizziebis commented 5 years ago

It was looking great when I rebuilt the network and had 10 devices connected. Then I added 3 more, including a CC2531 router, and it didn't show the connections between routers anymore, only coordinator <-> router <-> end device.

About the busy network, I was thinking the same, as I see a lot of diag messages from my two routers. I'll try them with the non diag version so the network is more at rest.

Edit: I had to re-pair the whole network because the panId somehow got changed. But now the network is looking good: [image: 8zwave_mesh_0102_22 14]

milakov commented 5 years ago

@clockbrain At some point I had a nice map with all devices connected. Not anymore. Using dev branch. It seems like the map shows connections from end devices to the coordinator fine, but no connections from end devices to routers. Waking up end devices (pressing the reset button briefly) and re-generating the map doesn't help. [image: map]

clockbrain commented 5 years ago

@milakov I see it is only end devices that aren't showing links. Are you sure these devices are all paired and actually working ok? Are all your routers shown on the map?

The online status for end devices doesn't necessarily mean they are actually online. zigbee2mqtt only checks online for routers. End devices are always shown as online regardless.

To gather the network links in response to a map request, each router is polled in turn and their neighbour table is examined. Given that you have reasonable links showing for your coordinator and some routers I guess that either a critical router to which many end devices are paired has gone away or many of these end devices have themselves dropped off your network.
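The polling described above can be sketched as a breadth-first walk over routers. This is an illustration under stated assumptions, not the zigbee-shepherd implementation: `getNeighbours(addr)` is a hypothetical async function that resolves to one router's neighbour table as `[{ieeeAddr, type, lqi}, ...]`, and all addresses are made up.

```javascript
// Start at the coordinator, read each router's neighbour table in turn,
// and queue every router that has not been polled yet.
async function scanNetwork(coordinatorAddr, getNeighbours) {
    const links = [];
    const visited = new Set([coordinatorAddr]);
    const queue = [coordinatorAddr];
    while (queue.length > 0) {
        const parent = queue.shift();
        const neighbours = await getNeighbours(parent);
        for (const n of neighbours) {
            links.push({ieeeAddr: n.ieeeAddr, parent, lqi: n.lqi});
            // End devices are never polled; only routers are queued.
            if (n.type === 'Router' && !visited.has(n.ieeeAddr)) {
                visited.add(n.ieeeAddr);
                queue.push(n.ieeeAddr);
            }
        }
    }
    return links;
}

// Made-up neighbour tables standing in for real radio traffic.
const fakeTables = {
    '0xC': [
        {ieeeAddr: '0xR1', type: 'Router', lqi: 62},
        {ieeeAddr: '0xE1', type: 'EndDevice', lqi: 94},
    ],
    '0xR1': [
        {ieeeAddr: '0xC', type: 'Coordinator', lqi: 60},
        {ieeeAddr: '0xE2', type: 'EndDevice', lqi: 70},
    ],
};

scanNetwork('0xC', async (addr) => fakeTables[addr] || [])
    .then((found) => console.log(found.length)); // 4
```

In this model the link for `0xE2` exists only in the table of `0xR1`, so if that router cannot be reached the end device shows up unlinked even though it works.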

milakov commented 5 years ago

Yes, all those end devices are paired and report that they are alive at least every hour (I am checking it with the newly introduced last_seen option). They all function well too: temperature updates and button presses are coming to Home Assistant.

All routers are on the map.

bizziebis commented 5 years ago

I noticed that when the coordinator is pinging the routers and gets a successful reply, the network map is OK:

2019-2-2 18:21:01 - debug: Ping 0x00124b00016f2bd5
2019-2-2 18:21:01 - debug: Successfully pinged 0x00124b00016f2bd5
2019-2-2 18:21:01 - debug: Ping 0x00158d0002370efc
2019-2-2 18:21:01 - debug: Successfully pinged 0x00158d0002370efc
2019-2-2 18:21:01 - debug: Ping 0xbfb6775ffeffe79d
2019-2-2 18:21:01 - debug: Successfully pinged 0xbfb6775ffeffe79d
2019-2-2 18:21:01 - debug: Ping 0x00158d00024d8998
2019-2-2 18:21:01 - debug: Successfully pinged 0x00158d00024d8998

When the ping of a router is not successful, the network map seems incomplete:

2019-2-2 18:48:35 - error: Failed to ping 0x00158d0002370efc

It's strange that at first the coordinator was pinging 4 routers with all positive results, and after a restart of the Zigbee2MQTT service the coordinator was only pinging 1 router, with a negative result. Nothing changed in the network during the 5 seconds of the restart.

What makes the coordinator decide which devices to ping, and what devices not to ping?
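To compare ping results across restarts, the debug log can be summarised per device. This is a minimal sketch that relies only on the two message texts visible in the log lines above ("Successfully pinged" and "Failed to ping"); `summarisePings` is a hypothetical helper, not a zigbee2mqtt function.

```javascript
// Hypothetical helper: group pinged addresses by outcome.
function summarisePings(logText) {
    const ok = new Set();
    const failed = new Set();
    const re = /(Successfully pinged|Failed to ping) (0x[0-9a-f]{16})/g;
    let m;
    while ((m = re.exec(logText)) !== null) {
        (m[1] === 'Failed to ping' ? failed : ok).add(m[2]);
    }
    return {ok: [...ok], failed: [...failed]};
}

// Lines copied from the log excerpts above.
const sampleLog = [
    '2019-2-2 18:21:01 - debug: Ping 0x00124b00016f2bd5',
    '2019-2-2 18:21:01 - debug: Successfully pinged 0x00124b00016f2bd5',
    '2019-2-2 18:48:35 - error: Failed to ping 0x00158d0002370efc',
].join('\n');

console.log(summarisePings(sampleLog).failed); // [ '0x00158d0002370efc' ]
```

A router that appears in `failed` is a candidate for the kind of incomplete map described here.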

Koenkk commented 5 years ago

@bizziebis All Xiaomi routers + CC2530/CC2531 routers are pinged.

When setting availability_timeout all devices will be pinged (https://koenkk.github.io/zigbee2mqtt/configuration/configuration.html)

bizziebis commented 5 years ago

I remember that there was only one change I made before my network map became incomplete again. I removed one sensor which was marked as unknown, and re-paired it. After that the network map was incomplete. Before the removal it was very detailed. That also happened the last time I rebuilt the network. I don't know if it was a coincidence.

bizziebis commented 5 years ago

@Koenkk I just discovered something. I thought the network map got incomplete when one router could not be pinged. It turns out that router was only connected to the coordinator through another (CC2531) router. Moving the router closer to the coordinator, so a direct link was established, made the complete network map show up again!

I could replicate it with a different router. It was also out of range of the coordinator at a moment and rendered half of the network map incomplete. Moving it closer solved it again.

quarcko commented 5 years ago

I have exactly the same story going on: as my network "heals" and some end devices little by little end up behind routers that are themselves behind other routers, the network map gets incomplete. All my devices are still sending signals and operating just fine, though. I have 5 routers, all Xiaomi stuff (smart plugs and wall switches), and some 30 end devices.

Could it be that a "router behind a router" answers some things like ping (or map) differently? Here is an excerpt from the logs:

1) Direct router, ping going OK:

zigbee2mqtt:debug 2019-2-13 22:46:32 Check online 0x00158d00026eb005 0x00158d00026eb005
2019-02-13T20:46:32.646Z zigbee-shepherd:request REQ --> ZDO:nodeDescReq
2019-02-13T20:46:32.700Z zigbee-shepherd:msgHdlr IND <-- ZDO:nodeDescRsp
zigbee2mqtt:debug 2019-2-13 22:46:32 Successfully pinged 0x00158d00026eb005

2) This is a router behind a router. As you can see, there is something going on in zigbee-shepherd (I'm no expert on this library); maybe it needs to be parsed additionally?

zigbee2mqtt:debug 2019-2-13 22:46:32 Check online 0x00158d0002b701ae 0x00158d0002b701ae
2019-02-13T20:46:32.870Z zigbee-shepherd:request REQ --> ZDO:nodeDescReq
2019-02-13T20:46:36.834Z zigbee-shepherd:af dispatchIncomingMsg(): type: incomingMsg, msg: [object Object]
2019-02-13T20:46:36.844Z zigbee-shepherd:msgHdlr IND <-- AF:incomingMsg, transId: 0
2019-02-13T20:46:36.846Z zigbee-shepherd:af dispatchIncomingMsg(): type: zclIncomingMsg, msg: [object Object]
2019-02-13T20:46:37.888Z zigbee-shepherd:request REQ --> ZDO:nodeDescReq
zigbee2mqtt:debug 2019-2-13 22:46:42 Failed to ping 0x00158d0002b701ae

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

milakov commented 5 years ago

The map is still very incomplete. I have all devices working properly for me (at least most of the time), but almost all end devices are not connected to anything: [image: Capture]

Koenkk commented 5 years ago

@milakov this is expected, as during scanning these are sleeping.

milakov commented 5 years ago

@Koenkk Notice I have a couple of Aqara weather sensors directly connected to the coordinator. I am pretty sure they are sleeping most of the time as well (battery powered) but they are always shown in the map with a link to the coordinator.

ualex73 commented 5 years ago

I have updated Zigbee2MQTT and after this, all my sensors are disconnected in the map (but working fine). I have waited 6 days, but nothing has changed in my map? Is there anything I can do/trigger to fix this?

quarcko commented 5 years ago

For me it helped to power off all routers for some time and power them back on.

I did it because I was doing some maintenance at home and had to shut down power to the whole house.

After this map was complete.

Then I added 2 more routers and the map is incomplete again. I will try the same approach sometime later.

My hassio is on UPS so it is not affected by powering off whole house for an hour. Coordinator was kept online.

My routers btw are all xiaomi wall switches with neutral.

h4nc commented 5 years ago

I had the same and wrote down steps that fixed it for me somewhere here on GitHub.

If I remember correctly.

Unplugging the routers and restarting zigbee fixed it for me.

After that plug in your routers again.

Could also be that I had to move the routers nearer to the Coord at the restart.

milakov commented 5 years ago

Interestingly, when I run a map update, the coordinator sends a "Link Quality Request" to one of the routers (a Gledopto bulb). It either responds with "Status: Not Supported" or doesn't respond at all, and in both cases the coordinator doesn't ask any other device about link quality. Could it be a bug?

[image: Screenshot]

lolorc commented 5 years ago

> @lolorc regarding the unlinked devices in the map - can you try requesting the map just after triggering those missing end devices. If end devices are asleep when the coordinator issues the lqi scan they may be missed. I also sometimes have unlinked end devices on my map.

@clockbrain nope, I've tried with Aqara sensors, door sensors and 2 kinds of switches; the devices still appear as unlinked.

clockbrain commented 5 years ago

@Koenkk I've posted a PR https://github.com/Koenkk/zigbee-shepherd/pull/23 with some changes to lqiscan which make it easier to understand what is going on with map generation.

@mihalski The PR includes a change that swaps the order of map building. Previously it recursed into connected routers before collecting their local links. With this change it collects links before recursing. I don't quite understand how the error handling works with the promise style code but it is possible that previously one error from a router would abort the whole map. This PR may help with that. I don't have very many devices connected in my network so can't test at a large scale.

@lolorc sorry, now that I understand a bit better how the lqiscan builds the map, the advice I gave earlier is wrong. The lqi scan doesn't poll end devices to build the map. It gets all the info it needs from the coordinator and routers. Routers have all the info relating to their associated end devices in neighbor tables so no need to poll end devices.
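The "collect links before recursing" order, together with isolating a failing router, can be sketched with plain promise chains. This is an illustration, not the code from the PR: `lqi(addr)` is a hypothetical function returning a promise of `[{ieeeAddr, type, lqi}, ...]` that may reject, and the addresses are made up.

```javascript
// Collect a router's own links first, then recurse into its routers.
// A rejected lqi() call contributes no links instead of aborting the map.
function collectLinks(addr, lqi, visited = new Set([addr]), links = []) {
    return lqi(addr)
        .catch(() => [])
        .then((neighbours) => {
            // 1. Record the local links of this router.
            const next = [];
            neighbours.forEach((n) => {
                links.push({ieeeAddr: n.ieeeAddr, parent: addr, lqi: n.lqi});
                if (n.type === 'Router' && !visited.has(n.ieeeAddr)) {
                    visited.add(n.ieeeAddr);
                    next.push(n.ieeeAddr);
                }
            });
            // 2. Only then visit the connected routers, one after another.
            return next.reduce(
                (chain, child) => chain.then(() => collectLinks(child, lqi, visited, links)),
                Promise.resolve()
            );
        })
        .then(() => links);
}

// Made-up network: router 0xR2 never answers properly.
const fakeLqi = (addr) => {
    if (addr === '0xC') {
        return Promise.resolve([
            {ieeeAddr: '0xR1', type: 'Router', lqi: 62},
            {ieeeAddr: '0xR2', type: 'Router', lqi: 7},
        ]);
    }
    if (addr === '0xR1') {
        return Promise.resolve([{ieeeAddr: '0xE1', type: 'EndDevice', lqi: 70}]);
    }
    return Promise.reject(new Error('no response from ' + addr));
};

collectLinks('0xC', fakeLqi).then((found) => console.log(found.length)); // 3
```

The coordinator's two links and the one link behind `0xR1` survive even though `0xR2` fails. Anything attached only to `0xR2` would be missing, which matches the unlinked end devices reported in this thread.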

Koenkk commented 5 years ago

@clockbrain I will look at it ASAP

clockbrain commented 5 years ago

@Koenkk I've had another go at restructuring the network map code. https://github.com/Koenkk/zigbee2mqtt/pull/1543 Testing needed!

Koenkk commented 5 years ago

@clockbrain I've merged your PR in the dev branch.

@all: can you test whether things have improved in the dev branch?

milakov commented 5 years ago

I've just installed the latest version: The map has no connections now. Zero.

bizziebis commented 5 years ago

Was about to post the same. No connection whatsoever.

clockbrain commented 5 years ago

@Koenkk looks like it's premature to include it in dev. Can you revert that PR for the moment?

@milakov @bizziebis I'll look further into it and then ask you to test again once I add some more debug lines. Got to go to work just now.

clockbrain commented 5 years ago

@milakov @bizziebis I've updated the PR https://github.com/Koenkk/zigbee2mqtt/pull/1546 with a timeout to handle uncontactable devices which is what I think was preventing your maps from being generated. Are you in a position to test this straight from the code in the PR or do you need @Koenkk to merge it to the dev branch?
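A timeout around a per-device request can be written generically with `Promise.race`. This is a sketch of the general technique only; how the PR implements its timeout, and which value it uses, is not shown in this thread. `withTimeout` and the 50 ms value are hypothetical.

```javascript
// Resolve with `fallback` if `promise` has not settled within `ms`.
function withTimeout(promise, ms, fallback) {
    let timer;
    const timeout = new Promise((resolve) => {
        timer = setTimeout(() => resolve(fallback), ms);
    });
    // Whichever settles first wins; the timer is always cleaned up.
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// A request to an uncontactable device is modelled as a promise
// that never settles; the fallback is an empty neighbour list.
const neverAnswers = new Promise(() => {});
withTimeout(neverAnswers, 50, []).then((neighbours) => {
    console.log(neighbours.length); // 0
});
```

Falling back to an empty list lets the scan continue with the remaining routers instead of waiting forever on one device.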

milakov commented 5 years ago

@clockbrain I am using z2m as a hass.io add-on, so I think the only way for me to check new changes is to have them in the dev branch. I can also sniff zigbee traffic if needed.

clockbrain commented 5 years ago

@milakov Ok, I guess you will need to wait on dev then. No need to capture zigbee traffic. Running with debug on (DEBUG=zigbee-shepherd* npm start 2>&1 | tee debug.txt) shows the lqi traffic, e.g. this is what I see in my log.

2019-05-20T02:02:14.854Z zigbee-shepherd:request REQ --> ZDO:mgmtLqiReq
2019-05-20T02:02:14.858Z zigbee-shepherd:request REQ --> ZDO:mgmtLqiReq
2019-05-20T02:02:14.859Z zigbee-shepherd:request REQ --> ZDO:mgmtLqiReq
2019-05-20T02:02:14.860Z zigbee-shepherd:request REQ --> ZDO:mgmtLqiReq
2019-05-20T02:02:14.899Z zigbee-shepherd:msgHdlr IND <-- ZDO:mgmtLqiRsp
2019-05-20T02:02:14.911Z zigbee-shepherd:msgHdlr IND <-- ZDO:srcRtgInd
2019-05-20T02:02:14.988Z zigbee-shepherd:msgHdlr IND <-- ZDO:mgmtLqiRsp
2019-05-20T02:02:14.995Z zigbee-shepherd:msgHdlr IND <-- ZDO:mgmtLqiRsp
2019-05-20T02:02:15.002Z zigbee-shepherd:msgHdlr IND <-- ZDO:mgmtLqiRsp
2019-05-20T02:02:15.028Z zigbee-shepherd:msgHdlr IND <-- ZDO:srcRtgInd

The problem with the network map generally isn't the zigbee lqi traffic; it's trying to corral all the asynchronous lqi calls back together to build the map. The PR code works fine for me, but I only have a fairly simple network, hence asking for broader testing.
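One way to gather several asynchronous requests into a single result is `Promise.allSettled`, which waits for every call and reports each outcome separately. This is a sketch under assumptions, not the approach taken in the PR: `lqi(addr)` is a hypothetical promise-returning function, the router list is made up, and `Promise.allSettled` requires a Node.js version that provides it.

```javascript
// Query all routers at once and merge whatever comes back.
async function gatherLinks(routerAddrs, lqi) {
    const results = await Promise.allSettled(routerAddrs.map((a) => lqi(a)));
    const links = [];
    const failed = [];
    results.forEach((r, i) => {
        const parent = routerAddrs[i];
        if (r.status === 'fulfilled') {
            r.value.forEach((n) => {
                links.push({ieeeAddr: n.ieeeAddr, parent, lqi: n.lqi});
            });
        } else {
            // Keep track of routers that did not answer.
            failed.push(parent);
        }
    });
    return {links, failed};
}

// Made-up responses: 0xR2 rejects, the others answer.
const sampleLqi = (addr) => (addr === '0xR2'
    ? Promise.reject(new Error('timeout'))
    : Promise.resolve([{ieeeAddr: addr + '-child', lqi: 50}]));

gatherLinks(['0xC', '0xR1', '0xR2'], sampleLqi).then((result) => {
    console.log(result.links.length); // 2
    console.log(result.failed); // [ '0xR2' ]
});
```

Returning the list of failed routers alongside the links makes an incomplete map easier to explain than a map that is silently missing edges.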

lolorc commented 5 years ago

It is indeed better with the timeout. With your initial PR, no devices were connected on my cc2652r network map (the same PR on cc2530 was giving a proper network map). Now it's also OK on cc2652r (more devices): [image: nm]

clockbrain commented 5 years ago

@Koenkk thanks for the merge. No, I don't think increasing the timeout is needed. It should return all the links it has gathered before the timeout so at a minimum it would have the direct coordinator links.

If anyone is still having problems with the network map in dev can you add a simple debug line

console.log(result);

just before here https://github.com/Koenkk/zigbee2mqtt/blob/e4a50b662d237439bae975422206b38cfbbb868c/lib/zigbee.js#L333 and post the error message.

milakov commented 5 years ago

@clockbrain Why don't you add debugging in a "normal" way? I updated to the latest dev, zero connections yet.

clockbrain commented 5 years ago

@Koenkk I've added another PR with extra network map debugging https://github.com/Koenkk/zigbee2mqtt/pull/1559

@milakov yes, I should have included that the first time around, but JS isn't my first language (I needed to do some quick study). Also, another avenue you could perhaps try is to temporarily move your cc2531 and /data to a Windows PC and try to debug from there. zigbee2mqtt runs OK under Windows, see https://github.com/Koenkk/zigbee2mqtt/issues/648

milakov commented 5 years ago

@clockbrain That's an option! Another one is to modify the file inside the container, could probably work! Will try this evening.