nRF24 / RF24Mesh

OSI Layer 7 Mesh Networking for RF24Network & nrf24L01+ & nrf52x devices
http://nrf24.github.io/RF24Mesh
GNU General Public License v2.0

Question: static address assignments in mesh topology #220

Closed by svdasein 1 year ago

svdasein commented 1 year ago

Hi - love these libraries!

I've been puzzling hard on this and I keep hitting walls. I know that you've said in the past that mesh address assignment is dynamic and there's no logic to keep that from happening. You've also said it's possible to construct a hybrid network with both statically and dynamically assigned addresses.

Would you please explain how one would do this?

I've been toying around with the notion of adding an isStatic property to the addrListStruct and somehow having that guide address assignment logic, but I'm beginning to think that I'm grossly misunderstanding how mesh address assignment works. Am I right in thinking that if a node gets to a point where it needs to request a new address, it's almost certain that it'll be asking for that address from a node other than the one it previously routed through? Which got me thinking that maybe what you're talking about is having some nodes that just set up an address when they start. But if that's the case, how can mesh and non-mesh co-exist? Isn't there a race as to which mechanism claims an address first?

2bndy5 commented 1 year ago

I detailed the connection process in the topology doc for the pyRF24 pkg (uses same C++ code under the hood); maybe that could help understanding the address assignment process (if not I'd like to know how/where I can improve the doc).

You can only set static addresses to mesh IDs from the master node using setAddress(), but I think that function was really meant to allow RF24Network nodes to be associated with a mesh ID number.

As for keeping track of which addresses are statically assigned, I think that is better served in the application code. Much like using static IP addresses in a (W)LAN, the network administrator must devise their own way of knowing which address is assigned to which device.

> Am I right in thinking that if a node gets to a point where it needs to request a new address, it's almost certain that it'll be asking for that address from a node other than the one it previously routed through

It depends on the context.

Remember how the network levels are structured; child nodes are also at the mercy of their parents' connection, not just their own connection to the parent.

> Isn't there a race as to which mechanism claims an address first?

Yes... The mesh master node doesn't impose a time limit on assigned addresses, and you can't assign all your static addresses before calling mesh.begin() since the addrList array is allocated in RF24Mesh::begin(). Although, @TMRh20 it would be possible to allocate the addrList array in the c'tor (instead of RF24Mesh::begin()) to give priority to static address assignment. EDIT: You can assign the static addresses after calling RF24Mesh::begin() and before any call to RF24Mesh::update().

If your mesh master node is running on Linux, then loadDHCP() (also invoked by RF24Mesh::begin()) can be used to load statically assigned addresses from the binary file named dhcplist.txt located in the same directory as your app's executable. Note that the binary data structure in dhcplist.txt uses four bytes per element in addrList, not three (I can explain that more if needed).

TMRh20 commented 1 year ago

> You can only set static addresses to mesh IDs from the master node using setAddress(), but I think that function was really meant to allow RF24Network nodes to be associated with a mesh ID number.

Yup, a hybrid network would consist of RF24Network and RF24Mesh nodes. You can statically assign addresses for the RF24Network nodes using this function.

> Although, @TMRh20 it would be possible to allocate the addrList array in the c'tor

Maybe that's something we would have to think about changing.

@svdasein You can also take a look at https://nrf24.github.io/RF24Mesh/md_docs_2general__usage.html

svdasein commented 1 year ago

@2bndy5 and @TMRh20 - thanks for all that. Brendan that python doc is excellent.

So @TMRh20 - I have tried using that method to wire up fixed addresses, but they don't "stick" - like it's not really a static assignment - it's more sort of an initial one. The thing I was thinking was to make it possible to do the same sort of thing you can do in https://www.rfc-editor.org/rfc/rfc2131 as regards "manual allocation" (dhcp can be configured at the server to always give out a given address (and other params) to a given mac addr).

In the case of RF24Mesh I was hoping I could force the assignment of an address to a node id regardless of which layer/node the request comes from --- so the node always attempts to use the same route. It looks like you write dhcplist.txt (on a pi) whenever a call to setAddress is made. But: the DHCP() method also calls setAddress, so DHCP may well undo those "static" assignments - that's why I was thinking about having an isStatic property -- so that the DHCP logic would know not to re-assign it to anything else and would know to always assign it to the given node.

This approach is more attractive to me cuz that keeps network configuration centralized on the pi - I don't have to re-flash anything to change the layout of the network - I just shut it down long enough for everything to time out, and after startup it's all in place.

Does that sound possible given the way mesh works?

I realize (now) that I may have a problem with the master (a raspi) replying too quickly and the client node deciding that it timed out (and btw I noticed that somewhere in there you're doing cpu speed detection and adding a delay - is that supposed to solve this problem automagically?). If however the master had a notion that it should just always associate a given node id w/ a given address - problem solved I think.

2bndy5 commented 1 year ago

Forgive me, it's starting to feel like I'm intruding on a conversation to which I wasn't invited. But I'm trying to understand the situation where you seem to describe undesirable behavior:

> I have tried using that method to wire up fixed addresses, but they don't "stick"

I'm having trouble imagining a case where this would be the result. I believe the statically assigned address should imply (via topology) that the route-to-master only consists of statically addressed nodes as well. In which case, a hybrid network can only use dynamic addresses for children of statically assigned nodes.

> In the case of RF24Mesh I was hoping I could force the assignment of an address to a node id regardless of which layer/node the request comes from --- so the node always attempts to use the same route.

This doesn't seem like a design goal of the mesh layer. Rather, this is the intent of using the RF24Network nodes in a mesh layer.

If what you suggest is meant to limit DHCP() response to address requests from a certain mesh ID, then you should be using a RF24Network node instead because there isn't any need to process address requests from a RF24Network node (effectively bypassing DHCP() entirely). TBH, it would increase the compiled size (not preferable) for the mesh layer if this functionality is implemented in DHCP().


> somewhere in there you're doing cpu speed detection and adding a delay

I'm not sure what exactly this refers to. It sounds like what RF24 does when deciding how long to wait for de-bouncing the CSN pin upon assertion. There was a dev artifact that was removed and later added back as an optional compiler define (-D): https://github.com/nRF24/RF24Mesh/blob/780fa0872de11562a0de7e5cd38a977c1731d666/RF24Mesh.cpp#L554-L557

svdasein commented 1 year ago

@2bndy5 my apologies - I absolutely did not mean to single out either of you - I meant that comment for both of you equally.

The CPU speed stuff I was referring to is the stuff in RF24 that deals with F_CPU - in particular in the startWrite method. Looking again I'm guessing that's got to do with bus timing, so my comment was nonsense. I'll try the SLOW_ADDR_POLL_RESPONSE thing & see if that helps.

I'm struggling a bit to understand why you say you have trouble imagining a case where a static assignment would change. Couldn't it change if, for instance, the client node missed a NETWORK_ADDR_RESPONSE? I thought a missed response triggers behavior in the client that is more or less "ok start from scratch". Is that right?

Also - aside from the fact that the client node should try talking through the route it was last given - if there's some interference that temporarily renders that route unusable, there's nothing about the assignment on the master side that tells it it should not re-formulate an address for that node if it requests one, is there?

It seems like maybe I'm still missing something fundamental here - if you can set me straight on why an address change is unlikely after a setAddress call I'm eager to learn.

This is probably all moot though cuz it sounds like the idea of a permanent address assignment runs counter to the idea behind the mesh. The DHCP() method had me thinking a bit too literally there maybe.

If I personally am ok with a larger binary, am I right in thinking that if I want to try to implement this static notion on my own, most of the action will be in the DHCP() method?

Thanks again

2bndy5 commented 1 year ago

> Couldn't it change if, for instance, the client node missed a NETWORK_ADDR_RESPONSE? I thought a missed response triggers behavior in the client that is more or less "ok start from scratch". Is that right?

This is true for a mesh node. My understanding is that using setAddress() to assign static addresses is really only meant for RF24Network nodes. I would expect an assigned address for a mesh node (static or not) to change as that is the nature of the mesh layer.

> if I want to try to implement this static notion on my own, most of the action will be in the DHCP() method?

It sounds like you are familiar enough with the code to implement this in your own fork. So, yes the addrListStruct would need an attribute to indicate that the address assigned is static. And yes to utilizing the new attribute in DHCP().
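
For readers following along, here is a minimal, self-contained sketch of what such an attribute could look like. It is not the upstream implementation: `AddrEntry`, its `isStatic` field, `findStaticAddress()` and `reservedForOther()` are hypothetical names, and the table is modeled with a `std::vector` rather than the library's own addrList array.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an addrListStruct element with one added flag.
struct AddrEntry {
    uint8_t  nodeID;    // mesh ID of the node
    uint16_t address;   // RF24Network logical address
    bool     isStatic;  // assumed new attribute: true means "never reassign"
};

// Hypothetical helper: if nodeID has a static entry, store its reserved
// address in `out` and return true; otherwise return false so the caller
// can fall through to the normal dynamic assignment.
bool findStaticAddress(const std::vector<AddrEntry>& table, uint8_t nodeID, uint16_t& out)
{
    for (const AddrEntry& e : table) {
        if (e.nodeID == nodeID && e.isStatic) {
            out = e.address;
            return true;
        }
    }
    return false;
}

// Hypothetical helper: true when `address` is statically reserved for a
// node other than nodeID, so dynamic assignment should skip it.
bool reservedForOther(const std::vector<AddrEntry>& table, uint8_t nodeID, uint16_t address)
{
    for (const AddrEntry& e : table) {
        if (e.isStatic && e.address == address && e.nodeID != nodeID) {
            return true;
        }
    }
    return false;
}
```

In a fork, the first helper would presumably be consulted at the top of the address-request handling and the second inside the loop that searches for a free address; where exactly is a design choice this sketch does not settle.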

I'm not sure how you'd persist the new attribute... The dhcplist.txt uses 4 bytes per element in the addrList array. This is because the binary representation is aligned to 4 bytes. So, in theory, you could fit the new attribute (assuming it is a boolean) in the second byte of the binary structure (for each element in addrList):

| byte 0 | byte 1 | bytes 2-3 |
|---|---|---|
| mesh ID | boolean attribute | logical address |

Currently byte 1 is just garbage that we don't use, so this change should still be compatible with upstream implementation of dhcplist.txt.
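
As an illustration of that 4-byte layout, the record can be packed and unpacked byte by byte. This is only a sketch: `DhcpRecord`, `packRecord()` and `unpackRecord()` are hypothetical names, and storing the address low byte first (little-endian, as on a typical Raspberry Pi) is an assumption rather than something stated in the thread.

```cpp
#include <array>
#include <cstdint>

// Hypothetical in-memory form of one dhcplist.txt element.
struct DhcpRecord {
    uint8_t  nodeID;    // byte 0: mesh ID
    bool     isStatic;  // byte 1: assumed new boolean attribute
    uint16_t address;   // bytes 2-3: logical address
};

// Serialize one record into the 4-byte layout described above.
// Assumption: the address is stored low byte first (little-endian).
std::array<uint8_t, 4> packRecord(const DhcpRecord& r)
{
    return {{
        r.nodeID,
        static_cast<uint8_t>(r.isStatic ? 1 : 0),
        static_cast<uint8_t>(r.address & 0xFF),
        static_cast<uint8_t>(r.address >> 8),
    }};
}

// Parse one 4-byte element back into a record.
// Only the exact value 1 in byte 1 is treated as "static".
DhcpRecord unpackRecord(const std::array<uint8_t, 4>& b)
{
    DhcpRecord r;
    r.nodeID = b[0];
    r.isStatic = (b[1] == 1);
    r.address = static_cast<uint16_t>(b[2] | (b[3] << 8));
    return r;
}
```

Since byte 1 is described as unused garbage today, a file written by an unmodified build could hold any value there. Accepting only the value 1 reduces, but probably does not eliminate, the chance of misreading such a file.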

It might be possible to use the address' MSBit instead of a new attribute, but that would certainly cause problems when the reserved address (including the MSBit flag) is passed to RF24Network::is_address_valid(). Nonetheless, the addresses' top 4 MSBits remain unused while we only support 6 data pipes (a slight reference to https://github.com/nRF24/RF24Network/issues/201).

svdasein commented 1 year ago

Ok thanks both of you for your time. Closing

2bndy5 commented 1 year ago

I added a suggestion in my last post about how to store the new attribute in dhcplist.txt (requires changes to save/loadDHCP()).

svdasein commented 1 year ago

I'm actually getting pretty close with this - it's mostly working. If you have a moment, can you explain what the function of from_node is in a header?

2bndy5 commented 1 year ago

It is part of the RF24NetworkHeader docs: from_node.

In address requests, I think this is the lowest level routing node that responded to the connecting node's polls (as the first step in the requestAddress() process). ~Although I'm having trouble tracking this down in the code.~ Found it in the RF24Network code: https://github.com/nRF24/RF24Network/blob/2a7c942a66e10e3b7d0b09cd58c7b1a6427ff1b1/RF24Network.cpp#L193-L196

2bndy5 commented 1 year ago

> If you have a moment, can you explain what the function of from_node is in a header?

To be clear, the from_node is used to generate the address assigned as it has to be a child of the routing node (if any).
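
A rough sketch of how a child address can be derived from a routing node's address, assuming RF24Network's octal scheme where each network level occupies 3 bits. `addressBits()` and `childAddress()` are hypothetical helper names, and this is a simplified model rather than a copy of the DHCP() code.

```cpp
#include <cstdint>

// Number of bits occupied by the octal digits of `address`
// (3 bits per network level; the master, 00, occupies none).
uint8_t addressBits(uint16_t address)
{
    uint8_t bits = 0;
    while (address) {
        address >>= 3;
        bits += 3;
    }
    return bits;
}

// Address of child number `child` under the routing node `parent`:
// the child's digit is placed just above the parent's digits.
uint16_t childAddress(uint16_t parent, uint8_t child)
{
    return static_cast<uint16_t>(parent | (child << addressBits(parent)));
}
```

For example, child 2 of routing node 04 becomes 024, and child 1 of 024 becomes 0124, so the parent's address always stays visible in the low-order digits.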

svdasein commented 1 year ago

Ah ok - so where it came from informs the mesh address generation logic as to the inner layers (if any) that the request came from - so that it (address logic) can make the assumption that this is the best RF return path not only for the node asking the question, but for the master's ability to get a reply back to it. Do I have that correct?

So - with this new static stuff I'm short-circuiting the address logic in the case of static assignments; I just want to return the same address that node has always had. So - here's the basis for my question:

[image]

In (1) I'm pretty sure what's going on there is the master is replying directly to the node. In (2) it's replying through another node. In the act of short circuiting the address logic, I'm specifically choosing to NOT use the from_node as part of the address I want the inquiring node to use -- I want to tell it that regardless of where this address response is coming back from, THIS is the address (and route) I want you to use.

So - the question: does the fact that I'm replying to from_node in (2) in any way affect the address that I ask the original inquiring node to adopt with my address reply? Or is that really just kind of a route to get back to that node?

I know it seems insane, but it's actually helping quite a bit for the few nodes that I've used it on. I think I need some clarity on that point is all.

2bndy5 commented 1 year ago

Your understanding is mostly correct. I would've checked in the addrList to make sure the mesh ID has an assigned address and responded with the static address if the static address contains the from_node address in its LSBits (maybe that's along the lines of what this isStatic() does?). If the connecting node used the wrong routing node for the static address, then the master should return an invalid address (like 060).

> So - the question: does the fact that I'm replying to from_node in (2) in any way affect the address that I ask the original inquiring node to adopt with my address reply? Or is that really just kind of a route to get back to that node?

The routing node does not adopt a new address. It is simply used to forward the request/response between the master and connecting node. I'm not sure if I'm understanding that question right.

svdasein commented 1 year ago

I have the first part covered; isStatic(nodeID) would only return true if the address already existed in addrList and was also marked as static -- in which case there's already an address there.

On the second part - to paraphrase - the routing node does not modify the address returned by the master, right? (The routing node does not care - just pass-through)?

2bndy5 commented 1 year ago

Yes to "just pass-through".

My main point was to be sure that the connecting node isn't given the static address if the route does not correspond with the static address. If you give the connecting node an address that doesn't use the actual route the request took to master, then undefined behavior may occur after the connecting node adopts the static address -- meaning the connecting node could orphan itself when using an unexpected route-to-master.
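
That route check could be sketched roughly as follows. The function names are hypothetical, the octal addressing (3 bits per level, children numbered 1 to 5) is RF24Network's, and the caller is assumed to pass 0 as `routedVia` when the request reached the master without a routing node. The fallback value 060 is the invalid address suggested earlier in this thread.

```cpp
#include <cstdint>

// True when `address` is a direct child of `parent`: its low-order digits
// equal the parent's address and exactly one valid child digit (1-5) follows.
bool isDirectChildOf(uint16_t address, uint16_t parent)
{
    uint8_t bits = 0;
    for (uint16_t p = parent; p != 0; p >>= 3) {
        bits += 3;  // 3 bits per octal digit of the parent
    }
    const uint16_t mask = static_cast<uint16_t>((1u << bits) - 1u);
    if ((address & mask) != parent) {
        return false;  // the request did not arrive via this address's parent
    }
    const uint16_t rest = static_cast<uint16_t>(address >> bits);
    return rest >= 1 && rest <= 5;
}

// Invalid address used to refuse a request, as suggested in the thread.
const uint16_t kInvalidAddress = 060;

// Hypothetical decision: hand out the static address only when the request
// was routed through the node that the static address hangs off.
uint16_t answerStaticRequest(uint16_t staticAddress, uint16_t routedVia)
{
    return isDirectChildOf(staticAddress, routedVia) ? staticAddress : kInvalidAddress;
}
```

With this check, a node reserved at 024 gets its address only when it asks through routing node 04; asking through 03 yields the invalid address, so the node should retry instead of adopting a route it cannot use.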

svdasein commented 1 year ago

Got it. Ok - thanks - closing again (oh - it's already closed)

svdasein commented 1 year ago

So - just a little follow up - you will probably chuckle.

I got that whole static addrs thing dialed in, and while it sorta helped the issues I was seeing it was not the magic bullet I was hoping for.

So then I got this other idea - maybe I'm just being too damn chatty on the network. So I re-coded all the nodes to be "mostly quiet" unless they have something important to say (I had been doing something other than that). "Oh. It's all good now".

So basically - you really wanna shoot for seconds per packet rather than packets per second.
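
The thread does not say how the nodes were re-coded, so the following is only one possible way to make a node "mostly quiet": transmit when a reading changes by a meaningful amount, or when a long heartbeat interval has passed. `ReportGate` and its parameters are hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical gate deciding whether a reading is worth transmitting.
class ReportGate {
public:
    ReportGate(float minDelta, uint32_t heartbeatMs)
        : minDelta_(minDelta), heartbeatMs_(heartbeatMs) {}

    // Returns true when the reading should be sent now: on the first call,
    // when the value moved by at least minDelta since the last send, or when
    // heartbeatMs has elapsed since the last send.
    bool shouldSend(uint32_t nowMs, float value)
    {
        const bool first = !hasSent_;
        const bool changed = std::fabs(value - lastValue_) >= minDelta_;
        // Unsigned subtraction stays correct across a millisecond-counter rollover.
        const bool stale = (nowMs - lastSentMs_) >= heartbeatMs_;
        if (first || changed || stale) {
            hasSent_ = true;
            lastValue_ = value;
            lastSentMs_ = nowMs;
            return true;
        }
        return false;
    }

private:
    float    minDelta_;
    uint32_t heartbeatMs_;
    bool     hasSent_ = false;
    float    lastValue_ = 0.0f;
    uint32_t lastSentMs_ = 0;
};
```

A node would call `shouldSend()` with its current time and reading on each loop pass and only write to the mesh when it returns true, which pushes traffic toward seconds per packet.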

2bndy5 commented 1 year ago

sounds about right 👍🏼

FWIW, you picked up the mesh mechanics pretty quickly. It took me weeks of reviewing the code from both net layers to fully understand it all. Afterward, I added a lot of explanatory comments to the source and wrote that super basic topology page, so others could gain an understanding faster than I did.

Avamander commented 1 year ago

It's certainly possible to do many many messages per second, the difficult part is coordinating it. I implemented time synchronization (and also message integrity) and started doing very rudimentary time-slotting, which turned out tremendously more reliable.
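
No details of that time-slotting scheme are given here, so this is just a generic sketch of the idea under stated assumptions: all nodes share a synchronized millisecond clock, the timeline is cut into fixed-length slots, and each node transmits only in the slot matching its ID. `isMySlot()` and its parameters are hypothetical.

```cpp
#include <cstdint>

// True when the slot containing `nowMs` belongs to `nodeId`.
// Assumes a clock synchronized across nodes, slotCount > 0 and slotMs > 0.
bool isMySlot(uint32_t nowMs, uint8_t nodeId, uint8_t slotCount, uint32_t slotMs)
{
    const uint32_t slot = (nowMs / slotMs) % slotCount;          // slot currently active
    return slot == static_cast<uint32_t>(nodeId % slotCount);    // slot owned by this node
}
```

With 4 slots of 50 ms each, node 1 would own the windows 50-99 ms, 250-299 ms, and so on. Nodes whose IDs collide modulo `slotCount` share a slot, so the slot count would need to cover the busiest part of the network.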

Generally though, great way to rediscover why other protocols (e.g. even WiFi) have been built like they have.