nRF24 / RF24Mesh

OSI Layer 7 Mesh Networking for RF24Network & nrf24L01+ & nrf52x devices
http://nrf24.github.io/RF24Mesh
GNU General Public License v2.0

Mesh needs a Node gone missing notification #186

Closed. FlailAway closed this issue 3 years ago.

FlailAway commented 3 years ago

Hi, just getting started with Mesh (been using RF24Network for years) and so far so good. But, the AddressList needs to know if a Node has gone missing.

I can handle that with ReleaseAddress() if I can control that Node. But a Node that loses power cannot report back to the Base, so the list stays populated with a mix of Nodes that still exist and Nodes that do not.

How can I have the Base (00) refresh that List periodically to keep it current for all Nodes that are actually out there?

ETA: What about a function such as bool NodeExists(uint8_t aNodeID)?

Thanks

Avamander commented 3 years ago

How can I have the Base (00) refresh that List periodically to keep it current for all Nodes that are actually out there?

You can implement the functionality in your own code. Make a liveliness check of some sort for example.
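A minimal sketch of what such a liveliness check in user code might look like, assuming the Base records a timestamp whenever it hears from a Node and treats prolonged silence as "gone". The class name LivenessTable and its methods are hypothetical and not part of RF24Mesh. It is written as plain C++ with std::map for readability; on a small AVR target a fixed-size array would likely replace the map.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical helper, not part of RF24Mesh: remembers when each node was last heard.
class LivenessTable {
public:
    explicit LivenessTable(uint32_t timeoutMs) : timeoutMs_(timeoutMs) {}

    // Call whenever any frame arrives from nodeId (nowMs would be millis() on Arduino).
    void markSeen(uint8_t nodeId, uint32_t nowMs) { lastSeen_[nodeId] = nowMs; }

    // Remove and return every node that has been silent for longer than the timeout.
    std::vector<uint8_t> expire(uint32_t nowMs) {
        std::vector<uint8_t> gone;
        for (auto it = lastSeen_.begin(); it != lastSeen_.end();) {
            // Unsigned subtraction stays correct across a 32-bit timer rollover.
            if (static_cast<uint32_t>(nowMs - it->second) > timeoutMs_) {
                gone.push_back(it->first);
                it = lastSeen_.erase(it);
            } else {
                ++it;
            }
        }
        return gone;
    }

    // True while the node is still tracked, i.e. it has not been expired.
    bool isAlive(uint8_t nodeId) const { return lastSeen_.count(nodeId) != 0; }

private:
    uint32_t timeoutMs_;
    std::map<uint8_t, uint32_t> lastSeen_;
};
```

The Base could call expire() periodically and pass each returned ID to whatever release mechanism it uses. The timeout has to be chosen per application, since Nodes that sleep for long periods would otherwise be dropped.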

FlailAway commented 3 years ago

You can implement the functionality in your own code. Make a liveliness check of some sort for example.

I know that. But it is an important Function that is missing from the Library. A Base that keeps running with an incorrect AddressList is not a good thing for a Library.

The "How can I" was rhetorical and intended to help TMRH20 understand the need for it as part of the Library. Otherwise everyone who uses the Library is, at one time or another, going to run afoul of the invalid data in AddressList. That list is useless if users have to create their own because the current list cannot be trusted to be correct.

Avamander commented 3 years ago

But it is an important Function that is missing from the Library.

It's not a trivial problem to solve; there isn't a universal solution. You can try of course, but it's unlikely to be universal. I don't see a built-in solution happening any time soon.

If you think it's trivial, ask yourself: is it solved in IP-based networks? If you said yes, think for a second about whether having an IP or a MAC address means you'll have connectivity to the machine, even on small local networks.

FlailAway commented 3 years ago

It's not a trivial problem to solve,

Ummm, Ping each Node in the list and any that do not respond within a reasonable time, delete them from the list.

Wow, that is difficult. :)

I have just done that and it is working fine.
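The ping-and-delete approach described here could be sketched roughly as follows. This is not the code referred to above, which is not shown in the thread. NodeEntry is a hypothetical stand-in for one entry of the Base's address list, and pingFn is an assumed callback that sends a probe and reports whether a reply arrived within timeoutMs. The "reasonable time" is deliberately left as a parameter.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for one entry of the Base's address list.
struct NodeEntry {
    uint8_t nodeId;
    uint16_t address;
};

// Assumed probe callback: returns true if the node replied within timeoutMs.
using PingFn = std::function<bool(const NodeEntry&, uint32_t timeoutMs)>;

// Pings every entry once and erases those that do not answer in time.
// Returns the number of entries removed.
std::size_t pruneUnresponsive(std::vector<NodeEntry>& list,
                              const PingFn& pingFn,
                              uint32_t timeoutMs) {
    std::size_t removed = 0;
    for (auto it = list.begin(); it != list.end();) {
        if (pingFn(*it, timeoutMs)) {
            ++it;                 // node answered: keep it
        } else {
            it = list.erase(it);  // no answer: drop it from the list
            ++removed;
        }
    }
    return removed;
}
```

A single missed reply removes the Node in this sketch, so a lossy link or a sleeping Node would be dropped as well. How many misses to tolerate is an application decision.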

IP-based networks?

Yup, "ipconfig" or "ifconfig" depending on the OS, nmap and arp-scan for Linux, should I go on?

2bndy5 commented 3 years ago

@Avamander I wouldn't mind this feature if it can be toggled with a macro definition, maybe something like

#if defined (MESH_CHECK_LIVELYHOOD)
// do requested feature ...
#endif

That way someone could still implement their own solution when MESH_CHECK_LIVELYHOOD is undefined.
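To illustrate how such a compile-time toggle might gate the feature, here is a small self-contained sketch. The function nodesToProbe is hypothetical and exists only to make the effect of the macro visible. The macro is defined inside the example for demonstration, whereas a real build would more likely set it with a compiler flag or a config header.

```cpp
#include <cstdint>

// Defined here only for the demonstration; remove this line (or control it
// with a compiler flag) to compile the feature out.
#define MESH_CHECK_LIVELYHOOD

// Hypothetical per-cycle hook: returns how many listed nodes would be probed.
uint8_t nodesToProbe(uint8_t listedNodes) {
#if defined(MESH_CHECK_LIVELYHOOD)
    return listedNodes;  // feature compiled in: probe every listed node
#else
    (void)listedNodes;   // silence the unused-parameter warning
    return 0;            // feature compiled out: user code handles liveliness
#endif
}
```

With the macro undefined the function always returns 0, so the built-in check costs nothing and a custom solution can take its place.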

Ummm, Ping each Node in the list and any that do not respond within a reasonable time, delete them from the list.

Wow, that is difficult. :)

@FlailAway does that mean we can expect a PR from you? And please don't be nasty. It's open-source software for a reason; we would love to get contributions from people like you with ideas. Just don't expect us to do the work for you (we're the ones that get stuck maintaining it, after all).

Avamander commented 3 years ago

I thought I had responded, but I guess not. Generally put, "does not respond within a reasonable time" is very much undefined. Many problems stem from that, and I wish a simple liveliness check solved them, but it doesn't, not really.

Yup, "ipconfig" or "ifconfig" depending on the OS, nmap and arp-scan for Linux, should I go on?

As I've mentioned before, the local state of your network stack does not mean the end device is online. Your connection might've dropped a second after you ran "ip a", and your ARP cache might contain entries for things that are offline. Should we go on?

I'm sure you're also aware that nmap etc. are not built into any OS or into the IPv4/IPv6 standards themselves? Do you constantly scan the web or your local network to know hosts are up :D before sending them anything?

In the end, reliable transport is something you can build on top, like on other networks. A separate OSI layer.
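As a rough illustration of what "building on top" could mean, the sketch below wraps an unreliable send in an acknowledge-and-retry loop at the application level. The names sendReliable and SendResult are hypothetical, and sendOnce is an assumed callback that transmits one frame and returns true only if an application-level acknowledgement came back before its own timeout. None of this is RF24Mesh API.

```cpp
#include <functional>

// Outcome of a reliable-send attempt (hypothetical type).
struct SendResult {
    bool delivered;     // true if an acknowledgement was received
    unsigned attempts;  // how many transmissions were made
};

// Retries sendOnce until it reports an acknowledgement or maxAttempts is hit.
SendResult sendReliable(const std::function<bool()>& sendOnce,
                        unsigned maxAttempts) {
    SendResult result{false, 0};
    while (result.attempts < maxAttempts && !result.delivered) {
        ++result.attempts;
        result.delivered = sendOnce();
    }
    return result;
}
```

A caller that gets delivered == false after all attempts knows only that this exchange failed, not that the Node is permanently gone, which is the distinction being drawn in this comment.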

You are free to try though.

2bndy5 commented 3 years ago

In the end, reliable transport is something you can build on top, like on other networks. A separate OSI layer.

Thank you for the more detailed explanation. I'm not a networking expert, but it sounds like a proper solution to this is not feasible at the RF24Mesh level.