Open Extrapilot1 opened 1 month ago
Hey there @dmulcahey, @adminiuga, @puddly, @thejulianjes, mind taking a look at this issue as it has been labeled with an integration (zha) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)
zha documentation zha source (message by IssueLinks)
All values are straight from the firmware, unmodified. You can read about how they're computed in an article by Silicon Labs: https://community.silabs.com/s/article/lqi-in-silicon-labs-em3xx-and-efr32-parts?language=en_US
Well, that's unfortunate. Per the doc, the SkyConnect chip doesn't actually use bit error rate for LQI; it simply estimates LQI from RSSI (an RSSI of -100 to -36 maps to an LQI of 0-255). It is no wonder people see all kinds of craziness, with bad LQI on links that perform well, or vice versa. Given how this chip is now more or less tied to Home Assistant through the whole SkyConnect/ZBT thing, do you think it might make sense to not report LQI for it, since it isn't real? Or, if that isn't something that can or should be done in the interface, it seems a note should at least be made on the SkyConnect sales page that the chip doesn't support proper LQI and emulates it via RSSI.
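For reference, the mapping described in the comment above (RSSI of -100 to -36 onto LQI of 0-255) can be sketched in a few lines. This is only an illustration: it assumes the mapping is a clamped linear interpolation between those two endpoints, which the comment does not state, and the function name `rssi_to_lqi` is made up here. The actual firmware may compute it differently.

```python
def rssi_to_lqi(rssi_dbm: float,
                rssi_min: float = -100.0,
                rssi_max: float = -36.0) -> int:
    """Estimate LQI (0-255) from RSSI in dBm.

    Assumption: a clamped linear map between the two endpoints quoted
    in the thread. This is not taken from the firmware source.
    """
    # Clamp so readings outside the quoted range saturate at 0 or 255.
    clamped = max(rssi_min, min(rssi_max, rssi_dbm))
    return round((clamped - rssi_min) * 255 / (rssi_max - rssi_min))


if __name__ == "__main__":
    for rssi in (-100, -84, -60, -40, -36):
        print(rssi, rssi_to_lqi(rssi))
```

If LQI really is derived this way, a plot of LQI against RSSI for any one link would fall on a single straight line, which matches the "LQI is just an offset of RSSI" observation in the issue.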
This kind of nitty-gritty detail is not something for the sales pages. If the info is wrong, however, I think we should just remove these sensors from the integration. Better to not provide the sensors than to provide them with wrong values.
Well, it's hardly a nitty-gritty detail. There are literally hundreds of threads on why RSSI and LQI are inconsistent indicators. You have people spending days trying to understand why one pairing with an LQI of 120 and an RSSI of -40 shows great link performance, while another with an LQI of 180 and an RSSI of -60 has problems. Since the whole premise of a mesh is intelligent route prioritization based on end-to-end LQI, and not some random measurement of RF flux for the 20 µs around receipt of the message, people look to LQI for indications of the health of the mesh and its routes.
And in looking at tens of these threads today, trying to understand what I was seeing, I did not find a single post indicating that common coordinator hardware, including SkyConnect (developed by the Nabu Casa team specifically for HA), does not actually use LQI... It reports an LQI that has nothing to do with LQI...
So I'd suggest this is an important detail. I would not have purchased SkyConnect if I had known it didn't support LQI. Now that it is clear it is not going to support Matter+Zigbee in parallel, maybe people should know that it is not at all what was promised in the marketing collateral. And people who purchased it, or hardware with the same chip, may be chasing their tails.
If RSSI is all that is available, it is better than nothing. At least it can be logged and compared on a timeline with sensor drops. I just think there should be some indication that LQI isn't valid for this chipset.
people should know that it is not at all what was promised in the marketing collateral
This calculation is done by the Silicon Labs stack and is identical across every device (not just coordinators!) using an EFR32 chip. It's also exactly the same computation performed by Texas Instruments Z-Stack. And Nordic's nRF52840. Silicon Labs chips are in probably 90% of devices you use, with Texas Instruments and Nordic filling in the remaining 10%.
Keep in mind that every router on your Zigbee network does this calculation independently to compute path costs; it doesn't particularly matter what the coordinator does in isolation.
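To make the path-cost point concrete, here is a rough sketch of how per-link costs add up along a route. In Zigbee routing each link gets a cost from 1 (best) to 7 (worst), and a path's cost is the sum of its link costs. How a stack converts LQI into a link cost is implementation-specific, so the threshold table below is entirely hypothetical, as are the function names and the example LQI values.

```python
from bisect import bisect_right

# Hypothetical LQI thresholds, for illustration only. Real stacks use
# their own (different) LQI-to-cost tables.
_LQI_THRESHOLDS = [50, 100, 150, 200, 225, 240]


def link_cost(lqi: int) -> int:
    """Map an LQI (0-255) to a link cost from 7 (worst) down to 1 (best)."""
    return 7 - bisect_right(_LQI_THRESHOLDS, lqi)


def path_cost(hop_lqis: list[int]) -> int:
    """A path's cost is the sum of the costs of its individual links."""
    return sum(link_cost(lqi) for lqi in hop_lqis)


def best_route(routes: dict[str, list[int]]) -> str:
    """Return the name of the candidate route with the lowest path cost."""
    return min(routes, key=lambda name: path_cost(routes[name]))


if __name__ == "__main__":
    # Made-up LQI values: one weak direct link vs. two strong hops.
    candidates = {"direct": [40], "via_router": [250, 250]}
    for name, hops in candidates.items():
        print(name, path_cost(hops))
    print("chosen:", best_route(candidates))
```

Each hop's LQI is measured by the receiving device on that hop, which is why the coordinator's own reading is only one input among many.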
The problem
I enabled RSSI and LQI for my Zigbee network, which runs 5 repeaters/routers and a SkyConnect USB link on an RPi 4. While RSSI/LQI varies per pair through the day, and the values for one pair do not track those of other pairs, within each pair the two values track each other exactly, on all 5 links. That is, it appears LQI is just an offset of RSSI, or vice versa. That is not how these are supposed to work: RSSI is RF flux, whereas LQI can vary due to different message paths, message corruption, etc. I don't see any such variance on any of the links between the SkyConnect and the 5 routers placed throughout my residence.
The routers are not identical: 1 is a wall switch made by eWeLink, 2 are Sengled wall power switches, and 2 are Sonoff S31 Lite ZB wall power switches. It is unlikely they all use the same chip or the same stack, so this seems to be more of a ZHA thing.
It may be fair to say these values aren't all that precise anyway, and I'd have less of a problem with the data if there were significant variation between RSSI and LQI for a given link through the day. There isn't.
My concern is that if there is a fault in how LQI is calculated (it is just the inverse of RSSI, etc.), then the SkyConnect may not be sending messages to given endpoints via the highest-LQI router for that device. That is, as best I can understand, how the mesh functions: the assumption is that the SkyConnect is at the center, not at the end of a long line of routers, and load averaging is mostly a function of location/azimuth rather than of complex route planning as would be seen with OSPF or something like that in the Ethernet world.
Or this may just be a bug, where RSSI is presented to HA externally as just some inverse of LQI? My assumption is that if RSSI isn't supported, it simply wouldn't be reported for a given link...
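One way to check the "LQI is just an offset of RSSI" observation on logged data is to fit a straight line to (RSSI, LQI) samples for a single link and look at the largest deviation from that line. The sketch below uses only the standard library. The function names are made up, and the sample readings are invented for illustration, not taken from this network. It needs at least two distinct RSSI values to work.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # fails if every RSSI sample is identical
    return slope, mean_y - slope * mean_x


def max_residual(xs: list[float], ys: list[float]) -> float:
    """Largest absolute deviation of the samples from the fitted line."""
    slope, intercept = fit_line(xs, ys)
    return max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))


if __name__ == "__main__":
    # Invented sample readings for one link, for illustration only.
    rssi = [-80.0, -70.0, -60.0, -50.0]
    lqi = [80.0, 120.0, 160.0, 200.0]
    print("slope, intercept:", fit_line(rssi, lqi))
    print("max residual:", max_residual(rssi, lqi))
```

A maximum residual near zero (allowing for rounding to integer LQI) on every link would support the idea that the reported LQI is computed from RSSI rather than measured separately.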
What version of Home Assistant Core has the issue?
2020.10.4
What was the last working version of Home Assistant Core?
2020.10.4
What type of installation are you running?
Home Assistant OS
Integration causing the issue
ZHA
Link to integration documentation on our website
https://www.home-assistant.io/integrations/zha/
Diagnostics information
NA
Example YAML snippet
Anything in the logs that might be useful for us?
Additional information
NA