nRF24 / RF24Mesh

OSI Layer 7 Mesh Networking for RF24Network & nrf24L01+ & nrf52x devices
http://nrf24.github.io/RF24Mesh
GNU General Public License v2.0

avoid multicasting to master node during handshake #201

Closed 2bndy5 closed 2 years ago

2bndy5 commented 2 years ago

There seems to be a recurring problem in which some devices can't connect to a mesh network unless another node is already connected (see #138 and possibly #200, though that is unconfirmed as of this writing). I also came across this problem when porting the network layers to pure Python (the CirPy lib).

I blame this problem on not using auto-ack to poll nodes on network level 0 (which only has 1 node - the master node) in RF24Mesh::requestAddress(). https://github.com/nRF24/RF24Mesh/blob/6f8091e1e0f63f68ea8b7b516ded4c6ebcca904f/RF24Mesh.cpp#L292-L304 As a workaround, I added the SLOW_ADDR_POLL_RESPONSE macro to compensate for devices that aren't running as efficiently as the master node. More discussion about that solution is in nRF24/CircuitPython_nRF24L01#29.

Initial Proposal

Modify RF24Mesh::requestAddress() to use normal network.write(), so the NETWORK_POLL message is auto-ack'd from master node's pipe 5:

bool RF24Mesh::requestAddress(uint8_t level)
{
    RF24NetworkHeader header(MESH_MULTICAST_ADDRESS, NETWORK_POLL);
    #define MESH_MAXPOLLS 4
    uint16_t contactNode[MESH_MAXPOLLS];
    uint8_t pollCount = 0;
    if (!level) {
        // use auto-ack on network level 0 because there's only 1 node on that level
        header.to_node = static_cast<uint16_t>(0); // we're only looking for master node
        pollCount += static_cast<uint8_t>(network.write(header, 0, 0));
        contactNode[0] = 0; // might not be needed, but this is more explicit
    }
    else {
        //Find another radio, starting with level 1 multicast
        IF_MESH_DEBUG(printf_P(PSTR("%u: MSH Poll\n"), millis()));
        network.multicast(header, 0, 0, level);
    }

    uint32_t timr = millis();

    while (level) {

I haven't traced the complete message flow for this change, but the main intention is to keep RF24Network::_write() from invoking a no-ack write when probing network level 0 with a NETWORK_POLL type message. This means we can't simply set the writeDirect parameter of RF24Network::write() to anything other than its default value.

2bndy5 commented 2 years ago

Above proposal won't work because the RF24Network::logicalToPhysicalAddress() function will direct the NETWORK_POLL message to the parent address 0444.

We need a way to write directly to a specified node address without disabling auto-ack when polling network level 0 (the master node).
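To show where 0444 comes from, here is a minimal standalone sketch of the octal tree addressing. It is not the library's code: parentOf() is a hypothetical helper, and it assumes a node without an assigned address still holds the default address 04444, so dropping its most significant octal digit gives 0444 as the parent that a routed write would target.

```cpp
#include <cstdint>

// Hypothetical helper (not part of RF24Network): return the parent of a
// logical address by dropping its most significant octal digit.
uint16_t parentOf(uint16_t node)
{
    uint16_t mask = 0;
    // Grow the mask one octal digit (3 bits) at a time until it covers `node`.
    while (node & static_cast<uint16_t>(~mask)) {
        mask = static_cast<uint16_t>((mask << 3) | 07);
    }
    // Keep every digit except the top one.
    return node & static_cast<uint16_t>(mask >> 3);
}
```

With this helper, parentOf(04444) evaluates to 0444, while a node at 01 has parent 00 (the master). That is why a routed write from an unaddressed node never reaches address 00 directly.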

TMRh20 commented 2 years ago

Using auto-ack when polling level 0 still creates a problem when multiple devices are requesting an address. Even if it worked, I’m not sure that would fix the problem since the master node can still switch onto tx mode faster than the attiny nodes can switch to rx mode.



2bndy5 commented 2 years ago

> Using auto-ack when polling level 0 still creates a problem when multiple devices are requesting an address

I didn't consider multiple nodes trying to connect simultaneously.

> I’m not sure that would fix the problem since the master node can still switch onto tx mode faster than the attiny nodes can switch to rx mode.

You're right. This only aims to address the polling stage of the handshake; it doesn't address the address request/response stage.
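The timing concern in the request/response stage can be pictured with a toy model. This is only a sketch: the struct, function names, and any numbers fed into it are hypothetical and are not RF24 APIs or measurements. It assumes the response is sent without auto-ack (so there is no retry), and it treats a workaround like SLOW_ADDR_POLL_RESPONSE as extra delay on the responding side.

```cpp
#include <cstdint>

// Toy model only; every name here is hypothetical.
struct Turnaround {
    uint32_t responderDelayUs;   // how long the responder waits before it transmits
    uint32_t requesterRxReadyUs; // how long the requester needs to start listening after its own TX
};

// Without auto-ack there is no retry, so the response is only heard if the
// requester is already listening when the responder transmits.
bool responseHeard(const Turnaround& t)
{
    return t.responderDelayUs >= t.requesterRxReadyUs;
}

// Smallest extra delay (in microseconds) the responder would need to add
// for the response to be heard; 0 if it is already slow enough.
uint32_t extraDelayNeededUs(const Turnaround& t)
{
    return responseHeard(t) ? 0 : t.requesterRxReadyUs - t.responderDelayUs;
}
```

In this model a fast master paired with a slow requester loses the response regardless of how the initial poll was delivered, which matches the point that changing the polling stage alone doesn't cover the address request/response stage.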