zigpy / zigpy-deconz

A library which communicates with Deconz radios for zigpy
GNU General Public License v3.0

Source Routing on deconz based adapters like ConBee/RaspBee? #180

Closed Hedda closed 2 years ago

Hedda commented 2 years ago

I don't have a ConBee/RaspBee myself, but I heard that dresden elektronik recommends enabling "source routing" if you have many devices.

https://community.home-assistant.io/t/zha-conbee-ii-source-routing/270711

However, I am not sure whether zigpy-deconz supports "source routing" yet, given that it was not available earlier, so the question is: does it?

https://github.com/zigpy/zigpy-deconz

For reference, see for example the XBee documentation on source routing (many-to-one routing):

https://www.digi.com/resources/documentation/digidocs/90001537/references/r_large_zigbee_networks-source_routing.htm?TocPath=Working%20with%20Zigbee%7C_____14

PS: dresden elektronik's own deCONZ/Phoscon has automatically enabled "application-level source routing" when needed since v2.8.0:

https://github.com/dresden-elektronik/deconz-rest-plugin/wiki/Source-Routing

https://phoscon.de/en/changelog

MattWestb commented 2 years ago

de(F)CONZ added "manual" source routing through the device GUI more than a year ago, but I don't think they have it automatically the way some zigpy radio libraries do (bellows).

manup commented 2 years ago

The firmware does support and use normal source routing, the same as various other controllers apply it. The problem is that not all devices are proactive about this (sending Route Records); for example, Philips Hue lights won't do that.

So what deCONZ does additionally is figure out source routes on its own, based on information from neighbor tables and error rates. These are then transferred to the firmware within outgoing APS requests. This has the advantage that an unlimited number of source routes can be used, and it doesn't matter whether routers send Route Record commands. The largest network using this so far is around 290 nodes, but I think this can be pushed to 400.
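
To illustrate the idea of deriving routes from neighbor tables and error rates, here is a minimal Python sketch. It is not deCONZ's actual algorithm, which is not shown in this thread. It assumes a hypothetical `links` mapping of directed (sender, receiver) NWK address pairs to a cost, which an application might compute from LQI and observed error rates, and runs a plain Dijkstra search from the coordinator. The function name and data layout are made up for this example.

```python
import heapq


def find_source_route(links, destination, coordinator=0x0000):
    """Return the relays between coordinator and destination, or None.

    `links` is a hypothetical structure: {(sender, receiver): cost}, where a
    lower cost means a better link as heard by the receiver.
    The returned list is in forward order and excludes the coordinator.
    """
    # Build an adjacency list from the directed link costs.
    graph = {}
    for (sender, receiver), cost in links.items():
        graph.setdefault(sender, []).append((receiver, cost))

    best = {coordinator: 0.0}
    previous = {}
    queue = [(0.0, coordinator)]

    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, link_cost in graph.get(node, []):
            new_cost = cost + link_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))

    if destination not in previous:
        return None  # no known path to the destination

    # Walk back from the destination to the coordinator, collecting relays.
    relays = []
    node = previous[destination]
    while node != coordinator:
        relays.append(node)
        node = previous[node]
    relays.reverse()
    return relays
```

An empty list means the destination is a direct neighbor of the coordinator, so no source route is needed. A real implementation would still have to test each candidate route and keep alternatives.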

puddly commented 2 years ago

Very interesting, this seems like it could solve some startup routing issues. I will definitely check this out if you document the change to the serial format (or provide an implementation hint 😄).

manup commented 2 years ago

It will be added in the next version of the document :) Usage is relatively simple; however, the really tricky part is to actually figure out source routes which could work and test them for real, at best with alternatives. Ideally you know how the next hop sees the sending hop, not only plain forwarding.

Currently we only enable this for routers, for the end-devices we use "native" source routes based on the Route Record commands, which is done by the firmware. It's not always easy to really know the current parent of an end-device which is crucial.

Here is the deCONZ serialization code which packs an APS request into the serial protocol:

void ApsDataRequest::writeToStream(QDataStream &stream) const
{
    uint8_t flags = 0;

    stream << id(); // APS request id for confirm

    if (version() > 1)
    {
        if (nodeId() != APS_INVALID_NODE_ID)
            flags |= 0x01; // include node id

        if (d_ptr->relayCount > 0)
            flags |= 0x02; // include source route (relay list)

        stream << flags; // flags are supported since version 2
    }

    if (flags & 0x01)
    {
        stream << nodeId(); // U16 
    }

    stream << (uint8_t)dstAddressMode();
    switch (dstAddressMode())
    {
    case ApsNoAddress:
        break;
    case ApsGroupAddress:
        stream << dstAddress().group();
        break;
    case ApsNwkAddress:
        stream << dstAddress().nwk();
        stream << dstEndpoint();
        break;
    case ApsExtAddress:
        stream << (quint64)dstAddress().ext();
        stream << dstEndpoint();
        break;
    default:         // invalid address mode
        break;
    }

    stream << profileId();
    stream << clusterId();
    stream << srcEndpoint();
    stream << (uint16_t)asdu().size();
    for (int i = 0; i < asdu().size(); i++)
    {
        stream << (uint8_t)asdu()[i];
    }
    stream << (uint8_t)txOptions();
    stream << radius();

    if (flags & 0x02)
    {
        stream << d_ptr->relayCount;

        for (quint8 i = 0; i < d_ptr->relayCount; i++)
        {
            stream << d_ptr->sourceRoute.at(i);
        }
    }
}

So basically the source route is appended at the end in the form of:

U8  RelayCount
U16 NwkAddressRelay[RelayCount]   // reversed order, last hop comes first !!! 

The coordinator address 0x0000 must not be included in the relay list.

The flags byte at the beginning needs to have the 0x02 bit set in order to use a source route. There is a maximum of 9 relays, but in my tests I think 6 is already critical due to the increased delays and error possibilities. APS ACKs should be used to mitigate errors.
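
For a Python client, the trailing relay list described above could be packed roughly as follows. This is a sketch under stated assumptions, not zigpy-deconz code: the function and constant names are hypothetical, and little-endian byte order is assumed for the U16 fields, since the C++ snippet does not show the stream's byte order.

```python
import struct

MAX_RELAYS = 9            # limit stated above
FLAG_NODE_ID = 0x01       # "include node id" bit from the C++ code
FLAG_SOURCE_ROUTE = 0x02  # must be set when a relay list is appended
COORDINATOR_NWK = 0x0000


def request_flags(include_node_id, relays):
    """Compute the flags byte for a version 2 request (hypothetical helper)."""
    flags = 0
    if include_node_id:
        flags |= FLAG_NODE_ID
    if relays:
        flags |= FLAG_SOURCE_ROUTE
    return flags


def pack_source_route(relays):
    """Pack RelayCount (U8) followed by the relay addresses (U16 each).

    `relays` is given in forward order (first hop after the coordinator
    first) and is reversed here, because the last hop must come first.
    Assumption: U16 values are little-endian on the wire.
    """
    if COORDINATOR_NWK in relays:
        raise ValueError("coordinator 0x0000 must not be in the relay list")
    if not 0 < len(relays) <= MAX_RELAYS:
        raise ValueError("relay count must be between 1 and 9")
    ordered = list(reversed(relays))
    return struct.pack(f"<B{len(ordered)}H", len(ordered), *ordered)
```

The packed bytes would be appended after the radius field, and the flags byte written near the start of the request, matching the order in `writeToStream` above.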

Hedda commented 2 years ago

> Currently we only enable this for routers, for the end-devices we use "native" source routes based on the Route Record commands, which is done by the firmware. It's not always easy to really know the current parent of an end-device which is crucial.

Do you also have a blacklist of known bad routers or a whitelist with known good routers based on manufacturers, model, or ID?

That is, some brands of Zigbee routers are well known for being bad routers for devices of any brand other than their own.

Is it possible with application-level source routing to exclude specific router devices because you do not trust them as good routers?

As I understand it, quite a few Tuya routers, for example, are infamous for not routing all messages from other brands.

I also read that some Zigbee implementations (for example Jeedom) specifically try to avoid using lightbulbs as routers if possible. It is very common for people to connect Zigbee lightbulbs without removing the physical light switch, so the risk of someone flicking that switch off is too big to make them worth using as routers.

manup commented 2 years ago

No, there is no blocking of specific devices. When figuring out new source routes, we check whether previous/lower hops worked, so this adapts organically. So far I haven't seen much trouble with this. It takes a bit of time to figure out good routes (which hopefully gets a bit quicker soon), but overall I'm more in favor of test and verify rather than blocking.

The test/verify/discard needs to be done even for good routers :) I had quite some surprises in various setups with routes which I would never have found by setting them up manually. The RF distribution through the air is difficult to predict, even when looking at the LQI map.

In deCONZ it's possible to set routes manually, but I'd only recommend automatic routing as it adapts better to RF oddities. When some routers act up with high error rates they fall back in consideration for a new source route.

One goodie which is possible with application-based source routing is to rate good routers which are always powered higher than, for example, lights which are sometimes physically switched off. For that we store source routes in the database to keep track of this; after a while the devices which are always/mostly powered become the top routing points.
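
As a rough illustration of rating routers by how reliably they forward, here is a small Python sketch. How deCONZ actually scores and stores routes is not described in this thread, so the `RouterStats` class, the smoothing formula, and `rank_routers` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RouterStats:
    """Hypothetical per-router delivery counters kept by the application."""

    successes: int = 0
    failures: int = 0

    def record(self, delivered: bool) -> None:
        # Count one attempt that used this router as a relay.
        if delivered:
            self.successes += 1
        else:
            self.failures += 1

    @property
    def score(self) -> float:
        # Smoothed success ratio, so a router with no history starts at 0.5
        # instead of being ranked at either extreme.
        return (self.successes + 1) / (self.successes + self.failures + 2)


def rank_routers(stats):
    """Return router NWK addresses ordered from best to worst score."""
    return sorted(stats, key=lambda nwk: stats[nwk].score, reverse=True)
```

With counters like these, a mains-powered plug that always answers gradually outranks a bulb that is sometimes switched off at the wall, without blocking the bulb outright.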

There is still heaps to figure out and experiment with, it's a really interesting topic :)

Hedda commented 2 years ago

> There is still heaps to figure out and experiment with, it's a really interesting topic :)

hehe, sounds like an excuse to start playing with AI and Machine Learning to find the best routes? ;)

On the topic of AI/ML for application-based source routing, FYI, Google just released a "network-opt" C++ library as open-source:

https://opensource.googleblog.com/2022/02/A-New-Library-for-Network-Optimization.html

https://github.com/google/network-opt

It is a library for topological network optimization based on a paper for "Search Strategies for Topological Network Optimization":

https://www.aaai.org/AAAI22Papers/AAAI-21.MoffittM.pdf

https://research.google/pubs/pub51051/

The library states "This is not an officially supported Google product", and the Google engineer who made it, Michael D. Moffitt, describes network-opt as "a C++ library that supports the optimization of network topologies. Using sophisticated techniques for combinatorial search, this algorithm can efficiently construct instances from a family of so-called series-parallel networks that commonly arise in electrical and telecommunications applications."

PS: The Google engineer will present the library and paper this week at the "36th AAAI Conference on Artificial Intelligence (AAAI-22)":

https://aaai.org/Conferences/AAAI-22/

https://aaai-2022.virtualchair.net/poster_aaai21