NordicPlayground / nRF51-ble-bcast-mesh


How does the DFU work? #154

Open bayou9 opened 7 years ago

bayou9 commented 7 years ago

Hello, I hope I'm not asking a question that is already well documented in this project. I couldn't find anything on it, and I've spent most of my time digging through the application part of the project.

So how does it work? Can someone explain it to me?

Right now I'm thinking about 2 scenarios:

After I flash the bootloader into the bootloader section of the MCU, I can call a function while the application is RUNNING, and the MCU will then enter this "bootloader mode", in which:

A. The new firmware will be split into round(size_of_firmware/max_payload_size_of_mesh_packet)+1 pieces, and each of these pieces will be broadcast throughout the entire network. Nodes will have to put all these pieces together, combine them into a bin file, perform the upgrade themselves, and then reboot.

B. The firmware will be copied from one node to another; that is, A will give the new firmware (the bin file in its entirety) to B, then A to C and B to D, then C to E and D to F and B to G... in a viral fashion.

I haven't gone through the bootloader program and skipped all those parts, so can someone please tell me how it works?

trond-snekvik commented 7 years ago

Your scenario A is correct. The bootloader will receive and relay the image at the same time, flooding the entire firmware across the network in 16-byte pieces. Each device will decide whether it needs the given firmware for itself, and if so, store it in a "bank" (some available space, decided by the application) while forwarding it to its neighbors. At the end of the transfer, the firmware is read from the bank and replaces the application.
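To make the segmenting concrete, here is a minimal sketch in C of how an image could be counted in 16-byte pieces and how a received piece could be placed into a bank buffer. This is not the project's bootloader code: the names `SEGMENT_SIZE`, `segment_count` and `bank_store_segment` are hypothetical, the bank is modeled as a plain RAM buffer rather than flash, and only the 16-byte piece size comes from the description above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Piece size mentioned in the thread; the name is hypothetical. */
#define SEGMENT_SIZE 16u

/* Number of segments needed to carry an image of image_size bytes.
 * This is a ceiling division, so a partial last segment counts as one. */
static uint32_t segment_count(uint32_t image_size)
{
    return image_size / SEGMENT_SIZE + ((image_size % SEGMENT_SIZE) != 0u ? 1u : 0u);
}

/* Copy one received segment to its slot in the bank (assumption: the bank
 * is a RAM buffer here; a real device would write to flash instead).
 * Returns 0 on success, -1 if the segment does not fit in the bank. */
static int bank_store_segment(uint8_t *bank, uint32_t bank_size,
                              uint32_t segment_index,
                              const uint8_t *data, uint32_t len)
{
    assert(bank != NULL && data != NULL);

    /* Reject indices whose offset would lie beyond the bank. This check
     * also keeps the multiplication below from overflowing. */
    if (len > SEGMENT_SIZE || segment_index > bank_size / SEGMENT_SIZE) {
        return -1;
    }

    uint32_t offset = segment_index * SEGMENT_SIZE;
    if (len > bank_size - offset) {
        return -1;
    }

    memcpy(bank + offset, data, len);
    return 0;
}
```

Because each segment has a fixed offset (`segment_index * SEGMENT_SIZE`), pieces can arrive in any order and still land in the right place, which suits a flooding transfer where ordering is not guaranteed.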

The bootloader can work alone (in case the application is broken or not present), or it can be called from the application (while it's running regular mesh operation), so that the firmware transfer happens in the background. The rbc_mesh API provides a set of events that will notify the application of any upcoming transfers, and the application can start participating by calling a set of functions.

The BLE Gateway example interacts with the DFU API, and should cover most normal use-cases.

Note that the DFU feature is still in active development, and performance is still being improved upon.

bayou9 commented 7 years ago

Trond, once again thank you very, very much, I'll dig in deeper.

twvd commented 7 years ago

What happens if a node misses one chunk of firmware? Integrity checks aside, is there a mechanism that makes the node actively request the missing chunks, for instance?

Does all this happen in the same register space as the mesh, or is it something separate?

trond-snekvik commented 7 years ago

Yes, once a device misses an entry, it'll get marked in the missing_segments bitmap. This bitmap is polled every time we receive a new packet, and we will start beaconing a data request packet for the oldest missing segment at regular intervals. If any of our neighbors get that beacon, they'll either send a data response packet containing the data for the given segment, or they'll be missing that segment too, which means that they'll eventually be beaconing a request for the same segment. The data response will jump from device to device as a group of devices that missed a segment recovers.
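The bookkeeping described above could look roughly like the following C sketch. It is an illustration only, not the bootloader's actual implementation: apart from the `missing_segments` bitmap name, everything here (`transfer_state_t`, `MAX_SEGMENTS`, `on_segment_received`, `oldest_missing`) is hypothetical, and "oldest" is assumed to mean the lowest segment index still missing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical upper bound on segments tracked in this sketch. */
#define MAX_SEGMENTS 1024u

typedef struct {
    uint8_t  missing_segments[MAX_SEGMENTS / 8u]; /* 1 bit per segment, 1 = missing */
    uint32_t next_expected;                       /* lowest index not yet seen or skipped */
} transfer_state_t;

static void transfer_init(transfer_state_t *t)
{
    memset(t, 0, sizeof *t);
}

/* Update the bitmap when a segment arrives. Any segment skipped between
 * the previous high-water mark and this index is marked as missing. */
static void on_segment_received(transfer_state_t *t, uint32_t index)
{
    assert(index < MAX_SEGMENTS);

    for (uint32_t i = t->next_expected; i < index; i++) {
        t->missing_segments[i / 8u] |= (uint8_t)(1u << (i % 8u));
    }

    /* The segment we just got is no longer missing (covers late recovery). */
    t->missing_segments[index / 8u] &= (uint8_t)~(1u << (index % 8u));

    if (index >= t->next_expected) {
        t->next_expected = index + 1u;
    }
}

/* Poll the bitmap: returns the lowest missing segment index, or -1 if
 * nothing is missing. A device would beacon a data request for it. */
static int32_t oldest_missing(const transfer_state_t *t)
{
    for (uint32_t i = 0; i < t->next_expected; i++) {
        if (t->missing_segments[i / 8u] & (1u << (i % 8u))) {
            return (int32_t)i;
        }
    }
    return -1;
}
```

In this sketch a device would call `oldest_missing` after each received packet and, when the result is not -1, schedule a data request for that index. A data response then goes through `on_segment_received` like any other segment, which clears the bit.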

You can see the packet format and types in this diagram.

twvd commented 7 years ago

Thanks for the clarification, this is very useful info.