hardwario / lora-modem

Open LoRaWAN modem for Murata Type ABZ
BSD 3-Clause "New" or "Revised" License

Seamlessly handle physical reconfiguration (LoRa module swap) in Tower #72

Open janakj opened 2 years ago

janakj commented 2 years ago

In a physically reconfigurable device such as those based on the Hardwario Tower kit, it would be advantageous to be able to replace the LoRa modem module without losing the connection to the LoRaWAN network. In other words, the replacement LoRa modem module should be able to continue where the previous one left off.

In order to support such reconfigurability, we would need to provide some mechanism through which the state stored in the LoRa modem's NVM could be exported to the Flash/EEPROM of the Core module. One way to accomplish this might be to implement a new +EVENT=... notification that could be enabled from the Core module and which would be emitted each time the LoRa modem updates its NVM. The data to be saved by the Core module could be directly included in the event.

The application on the Core module would then only need to subscribe to such notifications and save the data in Flash/EEPROM each time such notification is received. On boot, the Core module application would send any previously saved state to the LoRa module to re-initialize it. This would preserve the DevEUI, DevAddr, various encryption/authentication keys, and all counters across LoRa module replacements.

hubmartin commented 2 years ago

Hello Jan, the goal of moving all the LoRa Module configuration to the Core Module has been discussed a few times internally. Our CHESTER device works like that, and every application starts by configuring the Murata module. I guess this might be an easier solution than checking for changed parameters. What do you think?

janakj commented 2 years ago

I don't see how this kind of re-configuration could be done using the existing AT command set, unless you factory-reset and re-Join the LoRa modem each time. How else would you reset uplink/downlink counters, for example? There is an AT command to get the current values, but they cannot be set.

Reconfiguring the modem might make little sense if the device has the modem hardwired, which I suppose is the case in Chester. But I often reconfigure my Tower devices while I run experiments and I never know which LoRa module belongs to which Core module. Many a time I ended up with the wrong payload in TTN for a particular device because I did not match the modules correctly.

Physical reconfigurability is one of my all-time favorite Tower features. With this enhancement, I was hoping to get it to the point where you never have to worry about which LoRa module is mated to which Core module. It would always continue working correctly, even if the device has already been joined to a LoRaWAN network.

hubmartin commented 2 years ago

On CHESTER a new JOIN is made every time the device boots. You are also right that on CHESTER the modem is hardwired and there is some assurance about which parameters the firmware changes and which it does not.

I'm not familiar with all the low-level LoRaWAN stuff and which keys and counters need to be backed up. Does it mean that after every transmission the modem emits +EVENT with some internal variables/counters that the Core Module saves to EEPROM/FLASH? And after reboot, the Core Module recovers those keys and counters every time? It seems quite complicated but maybe I don't understand the whole picture now.

Wouldn't it be enough if, for example, the keys and configuration were sent after the Core Module boots? Then some new "connection manager" in the TOWER SDK would check the link with LNCHECK and, if that fails, do the JOIN, after which you can send messages. The SDK driver was first programmed for ABP, that's why the JOIN and other stuff is left to the user instead of some automated logic.
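A rough sketch of what such a "connection manager" could look like, written in Python only to show the control flow. The `Modem` interface, its method names, and `max_joins` are assumptions made up for illustration; they are not part of the TOWER SDK or of the modem's AT command set.

```python
from typing import Protocol


class Modem(Protocol):
    """Hypothetical modem driver interface; not part of the TOWER SDK."""

    def configure(self, settings: dict) -> None: ...
    def link_check(self) -> bool: ...
    def join(self) -> bool: ...


def ensure_connected(modem: Modem, settings: dict, max_joins: int = 3) -> bool:
    """Push configuration, then Join only if the link check fails."""
    modem.configure(settings)        # keys and configuration sent after boot
    if modem.link_check():           # would be backed by LNCHECK
        return True
    for _ in range(max_joins):       # link is down: fall back to JOIN
        if modem.join():
            return True
    return False


class FakeModem:
    """In-memory stand-in used only to exercise the logic above."""

    def __init__(self, linked: bool, joins_needed: int = 1):
        self.linked = linked
        self.joins_needed = joins_needed
        self.join_attempts = 0
        self.settings = None

    def configure(self, settings):
        self.settings = dict(settings)

    def link_check(self):
        return self.linked

    def join(self):
        self.join_attempts += 1
        if self.join_attempts >= self.joins_needed:
            self.linked = True
        return self.linked
```

With a modem that already has a working link, `ensure_connected` returns without attempting a Join; otherwise it retries the Join up to `max_joins` times before giving up.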

Doing a factory reset on every Core Module boot seems too complex to me. Wouldn't it be enough to check/reconfigure to defaults just the parameters the TOWER SDK uses? If the user changes some special parameters, they would need to call AT$FACNEW manually. However, every BAND could have different default parameters, and the Core Module must account for all of this.

This Core Module - LoRa Module hardware reconfiguration leads me to think that the most elegant solution would be for the Murata Module to have only low-level RX/TX LoRa capability (some low-level radio timing stuff), while LoRaWAN with all keys, logic and counters would run inside the Core Module and the UART would be only a transport interface. Not suggesting doing that, just my thoughts :)

janakj commented 2 years ago

> On CHESTER a new JOIN is made every time the device boots. You are also right that on CHESTER the modem is hardwired and there is some assurance about which parameters the firmware changes and which it does not.
>
> I'm not familiar with all the low-level LoRaWAN stuff and which keys and counters need to be backed up. Does it mean that after every transmission the modem emits +EVENT with some internal variables/counters that the Core Module saves to EEPROM/FLASH? And after reboot, the Core Module recovers those keys and counters every time? It seems quite complicated but maybe I don't understand the whole picture now.

Yes, that was my original idea. I originally thought the LoRa module would just send a binary blob for the application to save, something like: +EVENT=10,1 dGhlIHF1aWNrIGJyb3duIGZveA==. Upon reboot, the application would simply send the same binary blob back to the LoRa modem to initialize it. No application-side processing.
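To make the blob idea concrete, here is a minimal sketch of the application side, in Python for brevity. It assumes the notification looks exactly like the example above (event type 10, subtype 1, base64 payload); that format is only a proposal, and `parse_state_event` and `StateStore` are hypothetical names. The command used to send the blob back to the modem is not defined anywhere yet, so the sketch stops at producing the text to send.

```python
import base64
import re

# Matches the hypothetical notification from the comment above:
#   +EVENT=<type>,<subtype> <base64 payload>
EVENT_RE = re.compile(r"^\+EVENT=(\d+),(\d+) ([A-Za-z0-9+/]+={0,2})$")


def parse_state_event(line: str):
    """Return (type, subtype, raw_bytes), or None if the line does not match."""
    m = EVENT_RE.match(line.strip())
    if m is None:
        return None
    try:
        blob = base64.b64decode(m.group(3), validate=True)
    except ValueError:               # binascii.Error is a ValueError subclass
        return None
    return int(m.group(1)), int(m.group(2)), blob


class StateStore:
    """Stand-in for the Core module's Flash/EEPROM; keeps only the latest blob."""

    def __init__(self):
        self.blob = None

    def on_line(self, line: str) -> None:
        event = parse_state_event(line)
        if event is not None and event[:2] == (10, 1):
            self.blob = event[2]     # stored opaquely, no application-side processing

    def restore_payload(self) -> str:
        """Base64 text to send back to the modem on boot."""
        if self.blob is None:
            return ""
        return base64.b64encode(self.blob).decode("ascii")
```

The application never interprets the blob; it only stores the most recent one and hands the same bytes back after a reboot.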

> Wouldn't it be enough if, for example, the keys and configuration were sent after the Core Module boots? Then some new "connection manager" in the TOWER SDK would check the link with LNCHECK and, if that fails, do the JOIN, after which you can send messages. The SDK driver was first programmed for ABP, that's why the JOIN and other stuff is left to the user instead of some automated logic.

Actually, what you are proposing sounds like a better idea to me. We could remember just DevEUI, JoinEUI, AppKey, NwkKey, and DevNonce in the Core module. Upon boot, the SDK could detect whether the LoRa module mated to the Core module has changed. If it has, the SDK would re-initialize the modem with these remembered parameters (including the DevEUI) and perform an OTAA Join. After the Join, we would need to save the new DevNonce value in the Core module again, but that's a small price to pay.

The DevEUI and DevNonce values would be tied to the Core module, not to the LoRa module, so after a successful re-Join the device would still be able to communicate within the LoRa network. Everything, including existing payload formatters, would still work.

This approach won't work in the ABP mode, but that's IMHO not a problem. If the application uses the ABP mode, we can assume it knows what it is doing and should be able to handle module replacements on its own.

> Doing a factory reset on every Core Module boot seems too complex to me. Wouldn't it be enough to check/reconfigure to defaults just the parameters the TOWER SDK uses? If the user changes some special parameters, they would need to call AT$FACNEW manually. However, every BAND could have different default parameters, and the Core Module must account for all of this.

Doing an OTAA Join gets very close to doing a factory reset. A significant number of internal parameters, including the channel mask, are reset before the Join. Doing a band change together with an OTAA Join resets pretty much everything except platform-specific settings such as the serial port baud rate.

> This Core Module - LoRa Module hardware reconfiguration leads me to think that the most elegant solution would be for the Murata Module to have only low-level RX/TX LoRa capability (some low-level radio timing stuff), while LoRaWAN with all keys, logic and counters would run inside the Core Module and the UART would be only a transport interface. Not suggesting doing that, just my thoughts :)

A full LoRaWAN MAC implementation is pretty big and complex. If you move this to the main MCU, there wouldn't be much space left for the application. That is, by the way, how many Arduino- or ESP-based projects implement it. I think the Tower approach, where you have a separate MCU just for the LoRaWAN MAC, is much better. It provides a clean separation between the two applications, which tends to improve reliability, in my experience. Furthermore, having the source code for both parts is a win-win scenario.

janakj commented 2 years ago

(work in progress)

Here is a new (simpler) algorithm to handle physical system reconfiguration in Tower. The goal is to make it possible to mate any LoRa module to the Core module without disrupting the operation of the device. To the LoRaWAN network, the device should still appear as the original device, i.e., the identity of the device should be tied to the Core module and not to the LoRa module. Upon detecting a new LoRa module, the Core module will perform an OTAA Join to create a new network session. Thus, the Core module needs to save just enough information to be able to perform the Join with a new (previously unseen or unconfigured) LoRa module.

The Core module stores DevEUI, JoinEUI, AppKey, NwkKey, and DevNonce. Everything except DevNonce is static. DevNonce must be updated after each Join. The Core module also stores the serial number of the most recently mated LoRa module (will be made available via a new AT command).

Upon boot, the Core module checks if the serial number of the mated LoRa module matches the stored value.

If different:

  1. Reset the LoRa module to factory defaults
  2. Switch band to the desired band
  3. Set DevEUI, JoinEUI, AppKey, NwkKey, and DevNonce in the LoRa module
  4. Perform OTAA Join
  5. Update DevNonce stored in the Core module

else:

  1. If the LoRa module's DevNonce is higher than the DevNonce in the Core module, store the higher value in the Core module.

After performing the above initialization sequence, the Tower device should be able to resume LoRaWAN communication under its original identity (DevEUI). The full initialization sequence is only necessary if the serial number of the mated LoRa module has changed. If it has not changed, we still need to check whether, by any chance, the DevNonce in the LoRa module has been updated as a result of, e.g., the user manually provisioning the LoRa module via the ATCI.
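The initialization sequence can be sketched as follows, in Python only to pin down the control flow. Everything here is illustrative: `Identity`, `FakeModem`, and all method names are hypothetical, the fake Join always succeeds and simply increments DevNonce, and the point at which the stored serial number gets updated (after a successful Join) is my assumption, since the steps above do not say when that happens.

```python
from dataclasses import dataclass


@dataclass
class Identity:
    """What the Core module persists. Field names are illustrative."""
    dev_eui: str
    join_eui: str
    app_key: str
    nwk_key: str
    dev_nonce: int
    modem_serial: str = ""   # serial number of the last mated LoRa module


class FakeModem:
    """In-memory stand-in for a LoRa module; every method here is hypothetical."""

    def __init__(self, serial: str, dev_nonce: int = 0):
        self.serial = serial
        self.dev_nonce = dev_nonce
        self.band = None
        self.params = {}
        self.log = []

    def factory_reset(self):
        self.params, self.dev_nonce, self.band = {}, 0, None
        self.log.append("reset")

    def set_band(self, band):
        self.band = band
        self.log.append("band")

    def provision(self, ident: Identity):
        self.params = {"DevEUI": ident.dev_eui, "JoinEUI": ident.join_eui,
                       "AppKey": ident.app_key, "NwkKey": ident.nwk_key}
        self.dev_nonce = ident.dev_nonce
        self.log.append("provision")

    def join(self) -> bool:
        self.dev_nonce += 1   # assumed: each Join consumes one DevNonce
        self.log.append("join")
        return True


def on_boot(ident: Identity, modem: FakeModem, band: str) -> None:
    if modem.serial != ident.modem_serial:
        modem.factory_reset()                    # 1. reset to factory defaults
        modem.set_band(band)                     # 2. switch to the desired band
        modem.provision(ident)                   # 3. set DevEUI, JoinEUI, keys, DevNonce
        if modem.join():                         # 4. perform OTAA Join
            ident.dev_nonce = modem.dev_nonce    # 5. persist the updated DevNonce
            ident.modem_serial = modem.serial    # assumption: remember the module now
    elif modem.dev_nonce > ident.dev_nonce:
        ident.dev_nonce = modem.dev_nonce        # modem was provisioned manually
```

A real implementation would also have to decide what to store when the Join fails; the sketch leaves the Core module's state untouched in that case, so the full sequence runs again on the next boot.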

The above sequence should work under one condition: if the original LoRa module is connected to some other device, it must be wiped before it performs another Join. Otherwise, one could end up with two duelling LoRa devices repeatedly performing Joins under the same DevEUI and destroying each other's session. This should not be a problem if the two devices are Tower devices and they both follow the above-mentioned initialization procedure. However, if the original LoRa module is connected to some other device which relies on the modem's internal settings, the duelling scenario could happen.

To deal with the duelling scenario, we could encrypt AppKey and NwkKey and store the two values in the LoRa module encrypted. The Core module will then provide the decryption key as a parameter to AT+JOIN. With this extension, only the original Core module would be able to perform a Join under these root keys.