bluerange-io / bluerange-mesh

BlueRange Mesh (formerly FruityMesh) - The first completely connection-based open source mesh on top of Bluetooth Low Energy (4.1/5.0 or higher)
https://bluerange.io/

How to determine weak points of mesh #159

Closed maagedmazyek closed 3 years ago

maagedmazyek commented 3 years ago

Hi there,

we set up a mesh of ~80 nodes. Some of them are quite far apart (20 meters). All in all it works satisfactorily, but the mesh breaks up too often. Since rebuilding the mesh costs a lot of time and energy (our modules run on battery), this is not ideal.

My hypothesis is that this is caused by only a few weak connections. If we knew those spots, we could install extra "bridge" nodes here and there to relax the situation.

Now, it's hard to find those weak points. Of course you could just look around and make some assumptions based on physical factors like distance, ceilings, obstacles, etc. But in my experience, intuition is not very good at estimating signal behaviour (RSSI).

So the first idea was to look into the response of the action <nodeID> status get_status command, specifically the "connectionLossCounter" attribute. But it seems that this value represents how often connections were re-established at this node, which is not the same thing as a "weak point". What also supports this is that I see a very high connectionLossCounter for nodes that are surrounded by many other nodes at very short distance. For nodes that are quite far from other nodes, the connectionLossCounter is 5x lower. So this does not make a lot of sense to me.

I hope this is somehow understandable. Would love to get your thoughts on that.

mariusheil commented 3 years ago

Hi,

the connectionLossCounter is also increased during mesh setup and whenever a connection is disconnected, including when it is disconnected on purpose. I added an internal suggestion to rename it. It is mostly used for the clustering algorithm.

To analyze problems in the mesh I would recommend that you make use of the logError or logCount methods in the Status Reporter Module and the Logger: https://www.bluerange.io/docs/fruitymesh/Logger.html#ErrorLog

For a list of errors that are already logged by us, see Logger.h. We usually prefix the error types with FATAL_, WARN_, COUNT_, etc.

Whenever a connection is disconnected, the connection knows its disconnect reason, which could be: disconnected on the local side, disconnected on the remote side, timeout, etc.

You could use logCount and log an error each time a timeout happens. Using "action 0 status get_errors", you can then query the accumulated log information of all nodes. The log will be cleared once you query it, so you might want to query it for each node individually to not cause too much traffic at once and risk dropping some of the errors.
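The per-node querying described above could be scripted roughly as follows. This is a minimal sketch, not part of FruityMesh: it only builds terminal command strings in the same form as the one quoted above. The function name, the list of node IDs, and the way the commands are actually sent to a node (for example over a serial terminal) are assumptions.

```python
from typing import Iterable, Iterator


def build_get_errors_commands(node_ids: Iterable[int]) -> Iterator[str]:
    """Yield one get_errors command per node (hypothetical helper).

    Node ID 0 addresses all nodes at once, which is exactly what we want
    to avoid here, so it is rejected.
    """
    for node_id in node_ids:
        if node_id == 0:
            raise ValueError("node ID 0 queries all nodes at once")
        yield f"action {node_id} status get_errors"


if __name__ == "__main__":
    # Example node IDs; replace with the IDs used in your own mesh.
    for command in build_get_errors_commands([1, 2, 3]):
        print(command)
```

Sending the commands one at a time, with a pause between them, keeps the traffic low. A suitable pause length depends on the mesh and is not something this sketch tries to guess.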

This might help you to analyse the issue a bit better.

Marius

maagedmazyek commented 3 years ago

Hi @mariusheil - thanks for the quick answer, fast as always!

Great clue! Now I'm trying to navigate my way through the error-log jungle. Are only the "HCI_ERROR" entries interesting, or the "CUSTOM" errors as well? Probably both. I'm now busy creating a good overview of all errors of all nodes to find the "weak nodes".
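An overview like this could be built with a small script. The following is only a sketch under assumptions: the record layout (keys `node_id`, `category`, `code`, `count`) is hypothetical and would have to be mapped from whatever the nodes actually report, and it assumes that "HCI_ERROR" entries carry the HCI disconnect reason as their code. The numeric reason codes themselves come from the Bluetooth Core Specification.

```python
from collections import Counter
from typing import Iterable, List, Mapping, Tuple

# HCI status codes from the Bluetooth Core Specification.
HCI_CONNECTION_TIMEOUT = 0x08       # link lost, e.g. weak signal
HCI_REMOTE_USER_TERMINATED = 0x13   # remote side disconnected on purpose
HCI_LOCAL_HOST_TERMINATED = 0x16    # local side disconnected on purpose


def rank_weak_nodes(entries: Iterable[Mapping]) -> List[Tuple[int, int]]:
    """Return (node_id, timeout_count) pairs, worst node first.

    Only connection timeouts are counted, because intentional
    disconnects say nothing about link quality.
    """
    timeouts: Counter = Counter()
    for entry in entries:
        is_timeout = (
            entry["category"] == "HCI_ERROR"
            and entry["code"] == HCI_CONNECTION_TIMEOUT
        )
        if is_timeout:
            timeouts[entry["node_id"]] += entry.get("count", 1)
    return timeouts.most_common()


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"node_id": 3, "category": "HCI_ERROR", "code": 0x08, "count": 1},
        {"node_id": 7, "category": "HCI_ERROR", "code": 0x08, "count": 5},
        {"node_id": 7, "category": "HCI_ERROR", "code": 0x13, "count": 9},
        {"node_id": 3, "category": "CUSTOM", "code": 42, "count": 2},
    ]
    for node_id, count in rank_weak_nodes(sample):
        print(f"node {node_id}: {count} timeouts")
```

Nodes at the top of the list are candidates for an extra "bridge" node nearby. Whether specific "CUSTOM" entries should be counted as well depends on which error types from Logger.h turn out to be relevant.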

Feel free to close the issue