bluerange-io / bluerange-mesh

BlueRange Mesh (formerly FruityMesh) - The first completely connection-based open source mesh on top of Bluetooth Low Energy (4.1/5.0 or higher)
https://bluerange.io/

Mesh cannot connect #89

Closed asoftplus closed 7 years ago

asoftplus commented 7 years ago

According to the Quick Start Guide, after flashing the program to all the nodes, they should be able to connect to each other automatically.

However, my nodes do not connect. The clusterSize always equals 1, and when I tried to send data from one node to another with pingmod, nothing happened. Also, the LEDs are always red.

I am not sure how to proceed. The closest hint I can find is the intro of the FruityMesh Algorithm page:

There is one big restriction when it comes to BLE connections. With the S130 SoftDevice from Nordic, it is possible to manage three connections as Central and one connection as Peripheral. This will lead to problems where two nodes cannot connect to each other because their one connection as a Peripheral is already taken.

But with this hint, I am still not sure how to solve my problem.

Do you have any further hints?

mariusheil commented 7 years ago

Hi, the quote refers to something different. Type "status" on every node and check if the networkid is the same on all nodes. You can also check with "bufferstat" whether the nodes receive the discovery packets of their neighbours. If you type "debug all" you can see everything that the nodes are doing, whether they are connecting, etc.

asoftplus commented 7 years ago

Thanks a lot for your info.

Using "bufferstat", I have found that each node can see the other 2 nodes. (I have 3 nodes in total.)

However, when I try to send data from one board to another, only the sending board itself receives the data.

I have tried "pingmod" and "data board". Both commands showed this behavior.

I guess I have missed some important initial settings. But I am not sure what they are.

mariusheil commented 7 years ago

Hi,

if they can see each other, that does not yet mean they are connected. You can see the connections when entering status. If there are none, the node is not connected. Have you checked the networkids as I suggested? You should read some more in the wiki about the commands and especially the debug commands. Try to find the parts in the code where a connection is made and check with the debug terminal what happens.

asoftplus commented 7 years ago

> if they can see each other, that does not yet mean they are connected. You can see the connections when entering status. If there are none, the node is not connected.

When I used "bufferstat", I got this:

bufferstat
JOIN_ME Buffer:
=> 0, clusterId:0, clusterSize:0, freeIn:0, freeOut:0, writeHandle:0, ack:0, rssi:0 ADV_IND
=> 0, clusterId:0, clusterSize:0, freeIn:0, freeOut:0, writeHandle:0, ack:0, rssi:0 ADV_IND
=> 0, clusterId:0, clusterSize:0, freeIn:0, freeOut:0, writeHandle:0, ack:0, rssi:0 ADV_IND
=> 2242, clusterId:222108c2, clusterSize:1, freeIn:1, freeOut:3, writeHandle:14, ack:0, rssi:-44 ADV_IND
=> 13347, clusterId:cde23423, clusterSize:1, freeIn:1, freeOut:3, writeHandle:14, ack:0, rssi:-57 ADV_IND
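A small sketch for reading such a "bufferstat" dump: it pulls the non-empty JOIN_ME slots out of the pasted text so the visible neighbours can be listed per node. The field layout is inferred only from the output shown in this thread, the helper name `parse_join_me_buffer` is made up for illustration, and treating all-zero slots as unused is an assumption.

```python
import re

# Field layout inferred from the bufferstat output pasted in this thread,
# not from the firmware documentation.
ENTRY = re.compile(
    r"=> (?P<sender>\d+), clusterId:(?P<cluster_id>[0-9a-f]+), "
    r"clusterSize:(?P<cluster_size>\d+), freeIn:(?P<free_in>\d+), "
    r"freeOut:(?P<free_out>\d+), writeHandle:\d+, ack:\d+, rssi:(?P<rssi>-?\d+)"
)


def parse_join_me_buffer(text):
    """Return one dict per non-empty JOIN_ME buffer slot (hypothetical helper)."""
    neighbours = []
    for m in ENTRY.finditer(text):
        if m["sender"] == "0":  # assumption: an all-zero slot is unused
            continue
        neighbours.append({
            "sender": int(m["sender"]),
            "cluster_id": m["cluster_id"],
            "cluster_size": int(m["cluster_size"]),
            "rssi": int(m["rssi"]),
        })
    return neighbours


sample = (
    "JOIN_ME Buffer: "
    "=> 0, clusterId:0, clusterSize:0, freeIn:0, freeOut:0, writeHandle:0, ack:0, rssi:0 ADV_IND "
    "=> 2242, clusterId:222108c2, clusterSize:1, freeIn:1, freeOut:3, writeHandle:14, ack:0, rssi:-44 ADV_IND "
    "=> 13347, clusterId:cde23423, clusterSize:1, freeIn:1, freeOut:3, writeHandle:14, ack:0, rssi:-57 ADV_IND"
)

for neighbour in parse_join_me_buffer(sample):
    print(neighbour)
```

Two neighbours with clusterSize 1 and different clusterIds match what was described: the node hears the others, but nobody has joined a cluster yet.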

When I used just "status", I got this:

CONNECTIONS (freeIn:1, freeOut:3, pendingPackets:0
IN 0, state:0, clId:0, clSize:0, toSink:-1, Queue:0-0(0), Buf(rel:1, unrel:0), mb:0, pend:0
OUT 0, state:0, clId:0, clSize:0, toSink:-1, Queue:0-0(0), Buf(rel:1, unrel:0), mb:0, pend:0
OUT 0, state:0, clId:0, clSize:0, toSink:-1, Queue:0-0(0), Buf(rel:1, unrel:0), mb:0, pend:0
OUT 0, state:0, clId:0, clSize:0, toSink:-1, Queue:0-0(0), Buf(rel:1, unrel:0), mb:0, pend:0

I think this means that there are no connections.
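To check this reading on several boards at once, the "status" output can be scanned for connection slots whose state is not zero. This is only a sketch: the slot format is copied from the output in this thread, the function name `active_connections` is made up, and the idea that `state:0` marks an unused slot is an assumption rather than something the firmware documentation is quoted on here.

```python
import re

# Matches the start of each connection slot as it appears in the status output above.
SLOT = re.compile(r"\b(?P<direction>IN|OUT) (?P<handle>\d+), state:(?P<state>\d+)")


def active_connections(status_text):
    """Return the slots whose state is non-zero (assumed to mean 'in use')."""
    return [
        {"direction": m["direction"], "state": int(m["state"])}
        for m in SLOT.finditer(status_text)
        if m["state"] != "0"
    ]


status = (
    "CONNECTIONS (freeIn:1, freeOut:3, pendingPackets:0 "
    "IN 0, state:0, clId:0, clSize:0, toSink:-1, Queue:0-0(0), Buf(rel:1, unrel:0), mb:0, pend:0 "
    "OUT 0, state:0, clId:0, clSize:0, toSink:-1, Queue:0-0(0), Buf(rel:1, unrel:0), mb:0, pend:0 "
    "OUT 0, state:0, clId:0, clSize:0, toSink:-1, Queue:0-0(0), Buf(rel:1, unrel:0), mb:0, pend:0 "
    "OUT 0, state:0, clId:0, clSize:0, toSink:-1, Queue:0-0(0), Buf(rel:1, unrel:0), mb:0, pend:0"
)

print(len(active_connections(status)), "active connection(s)")
```

For the output pasted here the list is empty, which agrees with freeIn:1 and freeOut:3 still being fully available.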

> Have you checked the networkids as I suggested?

Yes. They are all the same:

networkId:1

(However, their clusterIds are not the same.)

> You should read some more in the wiki about the commands and especially the debug commands. Try to find the parts in the code where a connection is made and check with the debug terminal what happens.

Using "debug all", I got this:

[Node.cpp@875 DISCOVERY]: JOIN_ME: sender:1501, clusterId:3d1a05dd, clusterSize:1, freeIn:1, freeOut:3, ack:0
[Node.cpp@914 DISCOVERY]: Updated old buffer packet
[Main.cpp@290 EVENTS]: End of event
[Main.cpp@266 EVENTS]: BLE EVENT BLE_GAP_EVT_ADV_REPORT (29)
[ScanningModule.cpp@353 SCAN]: Packet filtered, rssi:-59, dataLength:31 addr:CA:90:B8:E0:29:14
[Main.cpp@290 EVENTS]: End of event
[Main.cpp@266 EVENTS]: BLE EVENT BLE_GAP_EVT_ADV_REPORT (29)
[ScanningModule.cpp@353 SCAN]: Packet filtered, rssi:-67, dataLength:31 addr:CA:90:B8:E0:29:14
[Main.cpp@290 EVENTS]: End of event
[Main.cpp@266 EVENTS]: BLE EVENT BLE_GAP_EVT_ADV_REPORT (29)
[Node.cpp@875 DISCOVERY]: JOIN_ME: sender:1501, clusterId:3d1a05dd, clusterSize:1, freeIn:1, freeOut:3, ack:0
[Node.cpp@914 DISCOVERY]: Updated old buffer packet
[Main.cpp@290 EVENTS]: End of event
[Main.cpp@266 EVENTS]: BLE EVENT BLE_GAP_EVT_ADV_REPORT (29)
[ScanningModule.cpp@353 SCAN]: Packet filtered, rssi:-67, dataLength:31 addr:CA:90:B8:E0:29:14
[Main.cpp@290 EVENTS]: End of event
[Main.cpp@266 EVENTS]: BLE EVENT BLE_GAP_EVT_ADV_REPORT (29)
[ScanningModule.cpp@353 SCAN]: Packet filtered, rssi:-73, dataLength:31 addr:CA:90:B8:E0:29:14
[Main.cpp@290 EVENTS]: End of event
[Main.cpp@266 EVENTS]: BLE EVENT BLE_GAP_EVT_ADV_REPORT (29)
[Node.cpp@875 DISCOVERY]: JOIN_ME: sender:13347, clusterId:cde23423, clusterSize:1, freeIn:1, freeOut:3, ack:0
[Node.cpp@914 DISCOVERY]: Updated old buffer packet
[Main.cpp@290 EVENTS]: End of event
[Main.cpp@266 EVENTS]: BLE EVENT BLE_GAP_EVT_ADV_REPORT (29)
[ScanningModule.cpp@353 SCAN]: Packet filtered, rssi:-67, dataLength:31 addr:CA:90:B8:E0:29:14
[Main.cpp@290 EVENTS]: End of event

This pattern kept repeating, with no other kind of activity. I am not sure what it means, but the log did contain the IDs of other nodes, such as:

[Node.cpp@875 DISCOVERY]: JOIN_ME: sender:13347, clusterId:cde23423, clusterSize:1, freeIn:1, freeOut:3, ack:0

Any suggestions?

mariusheil commented 7 years ago

Mhh, if that's the only output, then there's something wrong. There should be state changes. Basically what the above log tells you is that the other nodes are sending discovery messages and these are received and saved correctly, but your node does not try to connect to the other nodes. Different clusterIds are correct, as the nodes are not connected. Can you type reset and immediately debug all and send the log? Also, could you send your compiled firmware? For which board did you compile? Did you use the latest master from the repository?
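For longer captures, the same reading of a "debug all" log can be done mechanically by tallying the log tags and the JOIN_ME senders. The sketch below assumes only the `[File.cpp@line TAG]: message` layout visible in the log in this thread; the helper names are made up, and no tag names beyond those that appear in the pasted log are assumed.

```python
import re
from collections import Counter

# Header layout as seen in the pasted log: [File.cpp@line TAG]: message
LOG_HEADER = re.compile(r"\[(?P<file>\w+\.cpp)@(?P<line>\d+) (?P<tag>\w+)\]: ")


def tag_counts(log_text):
    """Count how many log entries carry each tag."""
    return Counter(m["tag"] for m in LOG_HEADER.finditer(log_text))


def senders_seen(log_text):
    """Return the sorted node ids that appear as JOIN_ME senders."""
    return sorted({int(s) for s in re.findall(r"JOIN_ME: sender:(\d+)", log_text)})


log = (
    "[Node.cpp@875 DISCOVERY]: JOIN_ME: sender:1501, clusterId:3d1a05dd, clusterSize:1, freeIn:1, freeOut:3, ack:0 "
    "[Node.cpp@914 DISCOVERY]: Updated old buffer packet "
    "[Main.cpp@290 EVENTS]: End of event "
    "[Main.cpp@266 EVENTS]: BLE EVENT BLE_GAP_EVT_ADV_REPORT (29) "
    "[ScanningModule.cpp@353 SCAN]: Packet filtered, rssi:-59, dataLength:31 addr:CA:90:B8:E0:29:14 "
    "[Main.cpp@290 EVENTS]: End of event "
    "[Node.cpp@875 DISCOVERY]: JOIN_ME: sender:13347, clusterId:cde23423, clusterSize:1, freeIn:1, freeOut:3, ack:0"
)

print(dict(tag_counts(log)))
print(senders_seen(log))
```

If a capture taken right after a reset still shows nothing but these three tags, that supports the diagnosis that discovery packets arrive but the node never acts on them.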

asoftplus commented 7 years ago

I have added my own code to the program. I suspected that, although my code is not mesh-related at all, the additional timer I added might interfere with the mesh.

So I tried flashing the provided hex to the boards.

However, strange things happened:

Just after I flashed the hex onto one chip, the mesh connected immediately, even though I had not yet flashed the provided hex onto the other two chips. They were still running my own hex.

So I would like to ask: what does that mean?

Which of the following possibilities is most likely?

  1. My own code probably has problems, since the mesh started to work after I flashed the unmodified hex onto one of the chips. I should debug my code line by line.

  2. My code does not have a problem, since 2 of the 3 chips are still running it and the mesh is connected. The problem is elsewhere.

  3. My code is almost fine. All I need to do is change some minor parts, such as starting the mesh manually.

mariusheil commented 7 years ago

The one chip you flashed with the provided hex was doing all the work to build the mesh. The code you modified probably messed up the timer so that the node could not do timer-related work such as deciding to connect to others and connecting. But if a connection comes in, the logic will nevertheless run and do all the mesh work. So, you should probably start by using the provided code and check if everything is running. Then add your own changes bit by bit and see if it's still running. Have a look at some of the modules and especially how they are using the timer. See if you can use the provided timer instead of implementing your own.
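The explanation above can be illustrated with a toy model. This is not the FruityMesh clustering algorithm, only a sketch under stated assumptions: each node starts with one inbound and three outbound slots (the freeIn/freeOut values from the status output in this thread), only nodes with a working timer ever initiate a connection, and every node accepts an incoming one. All names (`Node`, `tick`, `largest_cluster`) are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    timer_ok: bool          # False models firmware whose timer-driven logic never runs
    free_in: int = 1        # slot counts taken from the status output in this thread
    free_out: int = 3
    peers: set = field(default_factory=set)


def cluster_of(start, by_name):
    """Names of all nodes reachable from start over existing connections."""
    seen, stack = {start.name}, [start]
    while stack:
        for peer in stack.pop().peers:
            if peer not in seen:
                seen.add(peer)
                stack.append(by_name[peer])
    return seen


def tick(nodes):
    """One timer round: only nodes with a working timer initiate connections."""
    by_name = {n.name: n for n in nodes}
    for a in nodes:
        if not a.timer_ok:
            continue
        for b in nodes:
            if b.name in cluster_of(a, by_name):
                continue  # already in the same cluster (this also skips a itself)
            if a.free_out > 0 and b.free_in > 0:
                a.free_out -= 1
                b.free_in -= 1
                a.peers.add(b.name)
                b.peers.add(a.name)


def largest_cluster(timer_flags):
    """Build one node per flag, run a single round, return the biggest cluster size."""
    nodes = [Node(f"n{i}", ok) for i, ok in enumerate(timer_flags)]
    tick(nodes)
    by_name = {n.name: n for n in nodes}
    return max(len(cluster_of(n, by_name)) for n in nodes)


print(largest_cluster([False, False, False]))  # three modified nodes
print(largest_cluster([True, False, False]))   # one node with the provided hex
```

In this model three nodes without a working timer stay at cluster size 1, while a single node with a working timer is enough to pull the other two into one cluster of 3, which mirrors what was observed after flashing the provided hex onto one chip.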

asoftplus commented 7 years ago

Thanks for your quick replies and detailed suggestions. I will follow them.

Here are my further experimental results:

  1. When I flashed my own code back onto the mesh-building chip, the mesh still worked, even though all 3 chips were then running my own program.

  2. When I turned off the mesh-building chip, the mesh (then with 2 nodes) still worked.