NordicPlayground / nRF51-ble-bcast-mesh

Pointers to cached data are received from other nodes? #140

Closed bayou9 closed 7 years ago

bayou9 commented 7 years ago

Hello, I hope I'm not mistaken, but I believe there are two caches in a node's RAM: a "handle" cache, whose entries store pointers into the other cache, the data cache.

This is how I think the mechanism is implemented:

  1. A node passes a message to another node. The message contains the handle name, the handle version, and the data.
  2. The receiving node looks up the received handle in a table of some sort, then immediately compares the received handle version with the one it already has (the old one).
  3. If the received data has a newer version, the old data is overwritten. Otherwise, the received data is discarded.
  4. How does the node know which data to overwrite? By looking up a data pointer stored in the handle cache. Each handle cache entry stores a data pointer to a specific address inside the data cache, which holds the corresponding data.
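The four steps above can be sketched in C roughly as follows. This is only an illustration of the idea, not the repository's code: every name and size here (`handle_entry_t`, `data_entry_t`, `on_receive`, `CACHE_ENTRIES`, `DATA_MAX_LEN`) is hypothetical, and the plain `>` version comparison ignores wrap-around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_ENTRIES 4u   /* hypothetical cache size */
#define DATA_MAX_LEN  23u  /* hypothetical maximum payload length */

/* One slot in the data cache. */
typedef struct {
    uint8_t len;
    uint8_t bytes[DATA_MAX_LEN];
} data_entry_t;

/* One slot in the handle cache; p_data points into the local data cache. */
typedef struct {
    bool          used;
    uint16_t      handle;
    uint16_t      version;
    data_entry_t *p_data;
} handle_entry_t;

static data_entry_t   m_data_cache[CACHE_ENTRIES];
static handle_entry_t m_handle_cache[CACHE_ENTRIES];

/* Step 2: look up a handle in the handle cache. */
static handle_entry_t *handle_find(uint16_t handle)
{
    for (size_t i = 0; i < CACHE_ENTRIES; i++) {
        if (m_handle_cache[i].used && m_handle_cache[i].handle == handle) {
            return &m_handle_cache[i];
        }
    }
    return NULL;
}

/* Steps 2-4: returns true if the received value was stored. */
static bool on_receive(uint16_t handle, uint16_t version,
                       const uint8_t *p_data, uint8_t len)
{
    assert(p_data != NULL);
    if (len > DATA_MAX_LEN) {
        return false;
    }

    handle_entry_t *p_entry = handle_find(handle);
    if (p_entry == NULL) {
        /* Unknown handle: claim a free slot and bind it to a data slot. */
        for (size_t i = 0; i < CACHE_ENTRIES && p_entry == NULL; i++) {
            if (!m_handle_cache[i].used) {
                p_entry = &m_handle_cache[i];
                p_entry->used = true;
                p_entry->handle = handle;
                p_entry->p_data = &m_data_cache[i];
            }
        }
        if (p_entry == NULL) {
            return false; /* cache full */
        }
    } else if (version <= p_entry->version) {
        return false; /* step 3: not newer, discard */
    }

    /* Step 4: overwrite the data the handle entry points to. */
    p_entry->version = version;
    p_entry->p_data->len = len;
    memcpy(p_entry->p_data->bytes, p_data, len);
    return true;
}
```

The point to notice is that the pointer in the handle cache is created locally by the receiving node; only the handle, version, and data bytes travel over the air.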

This is where things get really, really confusing:

if (p_ble_evt->evt.gatts_evt.params.write.handle == m_mesh_service.ble_val_char_handles.value_handle)

If I'm not mistaken, this compares the received handle (sent by another node) with the existing handle (on the receiving node, of course), which in the end decides what action to perform; here, it will be a write/update operation.

So far, things are quite understandable. But then we have this line:

mesh_gatt_evt_t* p_gatt_evt = (mesh_gatt_evt_t*) p_ble_evt->evt.gatts_evt.params.write.data;

I looked up "p_ble_evt->evt.gatts_evt.params.write.data": it stores data received over GATT, which means it was received from somewhere else, and it is converted into a pointer of some sort to actively seek out data locally! This is simply bizarre, because every node could have its own unique memory layout, and you can't possibly receive a pointer from somewhere else and expect it to work locally?

I'm pretty sure I've got it wrong, but I can't tell what that line of code (the cast above) does. Please help?

Also, what if the version exceeds 65535? Will version "0" be considered newer than version 65535?

trond-snekvik commented 7 years ago

Hi, your four points are entirely correct, and quite a good summary of the mesh functionality.

The confusion comes from the mesh_gatt module, which inherits some naming from the Softdevice GATT server that conflicts with the naming in the mesh. The ble_evt_t* parameter passed into the mesh_gatt_sd_ble_event_handle() function comes from the Softdevice and uses the Softdevice naming. The handle and data coming through this parameter's gatts_evt refer to the Bluetooth GATT server's handle and data. The data referred to by the gatts_evt.params.write.data pointer is already stored inside the Softdevice, which is why we can use the pointer so nonchalantly. By the time we see it, the Softdevice has already handled the transfer and stored the value. We're simply being notified that the value has changed in our GATT server. We cast it to the mesh_gatt_evt_t struct, which implements the GATT interface format described here. From this structure, we can derive the incoming mesh data and its mesh handle.
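To make the cast less mysterious, here is a small sketch of the same pattern: the bytes written over GATT already sit in a local buffer, and the cast only reinterprets that buffer as a structured message. The layout below (`example_gatt_evt_t`, a 2-byte little-endian handle, a length byte, then data) is made up for illustration and is not the real mesh_gatt_evt_t format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_DATA_MAX 20u /* hypothetical maximum data length */

/* Hypothetical wire format. All members are uint8_t, so the struct has
 * no padding and no alignment requirement beyond one byte. */
typedef struct {
    uint8_t handle_le[2];           /* mesh handle, little-endian */
    uint8_t data_len;               /* number of valid bytes in data[] */
    uint8_t data[EXAMPLE_DATA_MAX]; /* mesh value */
} example_gatt_evt_t;

#define EXAMPLE_HEADER_LEN offsetof(example_gatt_evt_t, data)

/* Reinterpret a locally stored write buffer as a message.
 * Returns NULL if the buffer is too short for what it claims to hold.
 * Callers must only read data[0 .. data_len-1]. */
static const example_gatt_evt_t *example_evt_from_write(
    const uint8_t *p_write_data, size_t write_len)
{
    assert(p_write_data != NULL);
    if (write_len < EXAMPLE_HEADER_LEN) {
        return NULL;
    }
    /* The pointer is local; only its contents came from the peer. */
    const example_gatt_evt_t *p_evt =
        (const example_gatt_evt_t *)p_write_data;
    if (p_evt->data_len > EXAMPLE_DATA_MAX ||
        p_evt->data_len > write_len - EXAMPLE_HEADER_LEN) {
        return NULL;
    }
    return p_evt;
}

/* Assemble the 16-bit handle from its two bytes. */
static uint16_t example_evt_handle(const example_gatt_evt_t *p_evt)
{
    return (uint16_t)(p_evt->handle_le[0] |
                      ((uint16_t)p_evt->handle_le[1] << 8));
}
```

No address ever crosses the air in this pattern: the peer sends plain bytes, and the pointer that gets cast is the address of the receiver's own copy of those bytes.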

The wrap-around for the handle versions follows a lollipop scheme, as described here. When the version number wraps, it wraps to MESH_VALUE_LOLLIPOP_LIMIT, instead of 0.
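A lollipop-style counter can be sketched as below. This is a generic illustration under assumptions, not the code from version_handler: the limit value of 200 and the function names are placeholders, so check MESH_VALUE_LOLLIPOP_LIMIT in the source for the real value. Versions below the limit form the linear "stick" and are always older than anything in the circular part; versions at or above the limit are compared circularly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration only. */
#define EXAMPLE_LOLLIPOP_LIMIT 200u

/* Increment a version; after 65535 it wraps to the limit, not to 0. */
static uint16_t example_version_next(uint16_t version)
{
    return (version == UINT16_MAX) ? (uint16_t)EXAMPLE_LOLLIPOP_LIMIT
                                   : (uint16_t)(version + 1u);
}

/* Returns true if candidate should replace current. */
static bool example_version_is_newer(uint16_t current, uint16_t candidate)
{
    if (current == candidate) {
        return false;
    }
    /* Linear part: if either version is on the stick, compare plainly. */
    if (current < EXAMPLE_LOLLIPOP_LIMIT ||
        candidate < EXAMPLE_LOLLIPOP_LIMIT) {
        return candidate > current;
    }
    /* Circular part: candidate is newer if it is less than half a
     * ring ahead of current. */
    const uint32_t ring = (uint32_t)UINT16_MAX - EXAMPLE_LOLLIPOP_LIMIT + 1u;
    const uint32_t ahead =
        ((uint32_t)candidate + ring - (uint32_t)current) % ring;
    assert(ahead != 0u); /* equal versions were handled above */
    return ahead < ring / 2u;
}
```

Under this sketch, version 0 is never newer than version 65535, while the wrapped value (the limit) is. A node that restarts at version 0 is therefore treated as out of date and picks up the network's current value.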

bayou9 commented 7 years ago

Hello, thank you very much for your reply; I'm still digesting it. Just another quick question: the "flooding" (communication between gateway nodes) is done with GAP, whereas the communication between an arbitrary gateway node (which is the point of a mesh network) and an external device (phone, tablet, etc.) is done with GATT, right?

Or is it more complicated; for example, could every two nodes be bonded to get a more "stable" connection? (That seems unlikely, since I didn't find any code that implements it, but I could be wrong.)

trond-snekvik commented 7 years ago

That is correct. Any node can be a gateway, and all mesh nodes get all the flooded values (via advertising packets). If you want to target specific nodes (say, a specific gateway on the other side of the network, or maybe a specific lightbulb), you have to implement this address filtering in the application. The recommended way of doing addressing is to use each handle as a source address (one per node), and put the target address at the beginning of your payload. The target address field could of course be the "source address" of the target node. Then check every incoming packet for your own address, and only process packets that match.
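An application-level filter along those lines might look like the sketch below. It assumes a 2-byte little-endian target address at the start of the payload; that field width, the byte order, and the function name are choices made for this example, not something the mesh defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_ADDR_LEN 2u /* assumed width of the target address field */

/* If the payload is addressed to own_addr, return a pointer to the
 * bytes after the address and store their count in *p_body_len.
 * Otherwise return NULL so the caller can ignore the packet. */
static const uint8_t *example_payload_for_me(uint16_t own_addr,
                                             const uint8_t *p_payload,
                                             size_t len,
                                             size_t *p_body_len)
{
    assert(p_payload != NULL && p_body_len != NULL);
    if (len < EXAMPLE_ADDR_LEN) {
        return NULL; /* too short to carry an address */
    }
    const uint16_t target =
        (uint16_t)(p_payload[0] | ((uint16_t)p_payload[1] << 8));
    if (target != own_addr) {
        return NULL; /* addressed to another node */
    }
    *p_body_len = len - EXAMPLE_ADDR_LEN;
    return p_payload + EXAMPLE_ADDR_LEN;
}
```

Here own_addr would be the handle the node uses as its own source address, and the function would be called on the data of every incoming mesh update.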

bayou9 commented 7 years ago

Hello, thank you for your timely reply! About "to use each handle as a source address": I was under the impression (as my understanding of the whole "handle" thing is fuzzy) that handles are only for internal reference (within one node, for that node only), and may sometimes be only one byte. So how does it work, since you are going to have tons of identical handles (based on my understanding)? I'm sure I've got it wrong; maybe there are UUIDs involved? Sorry to bother you again.

trond-snekvik commented 7 years ago

The mesh handles are mesh-global. If a device in your network writes "hello" to handle 44, all other devices will get an event saying that handle 44's value now is "hello". The goal is always to have the same version/data of all handles in all devices.

The tricky part of using this for addressing is to distribute addresses, and avoid the problem of colliding addresses, as you're describing. This problem either requires a proactive solution (a central controller node distributing all addresses, for instance) or a reactive solution (just "take" an address when joining the network, and check that you don't have any collisions).

Happy to help :)

bayou9 commented 7 years ago

Thank you, you have been incredibly helpful. I don't want to sound like I'm taking shortcuts, but where is the key code for the GAP/advertising implementation of the trickle flood mechanism? It looks to me like there is a lot of code dealing with GATT, as seen in

rbc_mesh_ble_evt_handler(p_ble_evt),

but not nearly as much in

nrf_adv_conn_evt_handler(p_ble_evt).

Maybe I've got it completely wrong, and the flooding uses the GATT part of the Softdevice?

I swear this is the last question. Thank you again in advance.

trond-snekvik commented 7 years ago

The mesh communication is all implemented directly on the radio, as the Softdevice is not able to run the transmission patterns we want. The controller module is transport.c, which commandeers the radio.c module. The version_handler is responsible for managing transmission timers, as well as parsing mesh packets.

The direct radio operation is made possible by the Softdevice Timeslot API, documented in ch. 9 of the Softdevice Specification (.pdf warning!).

In other words, we don't really use the Softdevice GAP events at all. The nrf_adv_conn.c module is part of the example, to demonstrate the mesh running with BLE connections, but is not a required piece of the mesh. In fact, with some minor adjustments, the mesh could just as well run without the Softdevice.

bayou9 commented 7 years ago

Thank you, I'll have to look right into it and try to have a better grasp of the situation.