libremesh / lime-packages

LibreMesh packages configuring OpenWrt for wireless mesh networking
https://libremesh.org/
GNU Affero General Public License v3.0

Mesh wide firmware upgrade #1031

Open selankon opened 1 year ago

selankon commented 1 year ago

While designing a mesh-wide firmware upgrade system, we ran into some doubts and problems that I'll try to lay out here:

Context

The goal is to develop a mesh-wide firmware upgrade system. It will rely on an interface where the user can manage the firmware upgrade for the whole mesh and trigger it at the same time on every node in the network.

Dilemma: to sync or not to sync

The implementation approach depends on how much responsibility is given to shared-state: the whole process becomes more or less synchronous depending on the role shared-state plays in it. It can span from:

Some of the states a node can be in during the process are: UPDATED, UPGRADE_AVAILABLE, DOWNLOADING, UPGRADE_READY, UPGRADE_SCHEDULED. An error handling system has to be implemented as well. Some possible error states are: DOWNLOAD_ERROR, UPGRADE_CHECK_FAILED, INTERNAL_ERROR (when an action cannot be performed, for example because of an unexpected permissions error, or because a node is too outdated and lacks the needed methods).

An example of information shared between nodes could be:

{
    "state": "UPGRADE_AVAILABLE",
    "new_version_info": "LibreRouterOs_1.5",
    "safe_upgrade": true,
    "downloaded": false, // Whether the firmware is already downloaded
    "downloading": false, // Whether the firmware is being downloaded
    "firmware_check": false, // Once downloaded, whether the image was checked as upgradable (checked with sysupgrade)
    "scheduled": 30000 // Ms until the upgrade is performed. -1 means not set (?)
}
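As a rough illustration only, the states and the shared record above could be modelled as below. The field names and state names come from this issue; the class names `UpgradeState` and `NodeUpgradeInfo`, the default values, and the helper methods are hypothetical and not part of any existing LibreMesh code.

```python
import json
from dataclasses import dataclass, asdict
from enum import Enum


class UpgradeState(str, Enum):
    # Regular states listed in this issue
    UPDATED = "UPDATED"
    UPGRADE_AVAILABLE = "UPGRADE_AVAILABLE"
    DOWNLOADING = "DOWNLOADING"
    UPGRADE_READY = "UPGRADE_READY"
    UPGRADE_SCHEDULED = "UPGRADE_SCHEDULED"
    # Error states listed in this issue
    DOWNLOAD_ERROR = "DOWNLOAD_ERROR"
    UPGRADE_CHECK_FAILED = "UPGRADE_CHECK_FAILED"
    INTERNAL_ERROR = "INTERNAL_ERROR"


ERROR_STATES = {
    UpgradeState.DOWNLOAD_ERROR,
    UpgradeState.UPGRADE_CHECK_FAILED,
    UpgradeState.INTERNAL_ERROR,
}


@dataclass
class NodeUpgradeInfo:
    """Hypothetical per-node record mirroring the example message."""
    state: UpgradeState
    new_version_info: str
    safe_upgrade: bool
    downloaded: bool = False
    downloading: bool = False
    firmware_check: bool = False
    scheduled: int = -1  # ms until the upgrade; -1 is assumed to mean "not set"

    def is_error(self) -> bool:
        return self.state in ERROR_STATES

    def to_json(self) -> str:
        data = asdict(self)
        data["state"] = self.state.value  # store the plain string
        return json.dumps(data)

    @classmethod
    def from_json(cls, raw: str) -> "NodeUpgradeInfo":
        data = json.loads(raw)
        # Raises ValueError if the state string is not a known state
        data["state"] = UpgradeState(data["state"])
        return cls(**data)


info = NodeUpgradeInfo(
    state=UpgradeState.UPGRADE_AVAILABLE,
    new_version_info="LibreRouterOs_1.5",
    safe_upgrade=True,
)
restored = NodeUpgradeInfo.from_json(info.to_json())
```

Parsing the state through the enum means an outdated or misbehaving node publishing an unknown state is rejected at the boundary instead of propagating through the upgrade logic.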

Some thoughts

Relying on a fully asynchronous system for something as critical as a firmware upgrade can be problematic. However, shared-state could take the role of sharing information between the nodes during the upgrade process, even boosting the shared-state timer so that information syncs faster than usual while a flag is set in shared-state itself. Something like mesh-state: firmware_upgrade would increase the refresh rate of shared-state. It's just an idea to explore.
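A minimal sketch of the timer-boost idea, assuming the shared-state content is available as a dictionary keyed by node name. The function name `sync_interval`, the two interval values, and the dictionary layout are assumptions made for illustration; only the `mesh-state: firmware_upgrade` flag comes from the text above.

```python
from typing import Any, Dict

NORMAL_INTERVAL_S = 30   # assumed default sync period, in seconds
BOOSTED_INTERVAL_S = 5   # assumed faster period while an upgrade is in progress


def sync_interval(shared_state: Dict[str, Dict[str, Any]]) -> int:
    """Return the sync period to use, given the current shared-state content.

    If any node has published the upgrade flag, every node that sees it
    switches to the shorter period until the flag disappears.
    """
    for entry in shared_state.values():
        if entry.get("mesh-state") == "firmware_upgrade":
            return BOOSTED_INTERVAL_S
    return NORMAL_INTERVAL_S


idle_mesh = {"node-a": {}, "node-b": {"mesh-state": "idle"}}
upgrading_mesh = {"node-a": {}, "node-b": {"mesh-state": "firmware_upgrade"}}
```

Because the flag travels through shared-state itself, the first propagation still happens at the normal rate; only later rounds benefit from the shorter interval.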

An advantage of using shared-state instead of a for loop to retrieve the info or perform actions is, IMHO, that it is easy to implement and consistent. An RPC for loop jumping from node to node seems buggy, fragile and slow, which would degrade the user experience.

A mixed system could be an elegant and consistent solution for this implementation.

ilario commented 1 year ago

I still have to read all of this properly, but surely we should use https://openwrt.org/docs/guide-user/installation/attended.sysupgrade by @aparcar and also set up a https://firmware-selector.openwrt.org/ server for LibreMesh, both for downloading images and for generating the updates.

selankon commented 1 year ago

A proposal for a mocked UI for the lime-app can be found here, while we decide how to proceed:

https://github.com/selankon/lime-app/tree/f/mesh-upgrade


G10h4ck commented 1 year ago

We have talked about notifying the other routers of the UPGRADE REQUESTED status via shared-state (TODO: decide whether multiwriter or not). The URL from which to download the new firmware will be shared in the same message. Then eupgrade should be used to download, check and flash the firmware once every node in the reference status confirms it is ready to upgrade and the user confirms the upgrade.

This is of course a tentative strategy. We need to implement it and see how it behaves, then improve on it or work out another one.
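The gating condition in this strategy (flash only when every node reports ready and the user has confirmed) could be sketched as follows. This is an illustration under assumptions: the function names and the `node name -> state string` mapping are hypothetical, the state names are taken from the first comment, and nothing here calls eupgrade or shared-state.

```python
from typing import Dict, List

READY_STATE = "UPGRADE_READY"


def nodes_not_ready(node_states: Dict[str, str]) -> List[str]:
    """Names of the nodes whose last known state is not UPGRADE_READY."""
    return sorted(
        name for name, state in node_states.items() if state != READY_STATE
    )


def can_start_upgrade(node_states: Dict[str, str], user_confirmed: bool) -> bool:
    """True only if the user confirmed and every known node is ready.

    An empty mesh view is treated as "not ready", since it most likely
    means the state has not been synced yet.
    """
    if not node_states:
        return False
    return user_confirmed and not nodes_not_ready(node_states)


mesh = {
    "node-a": "UPGRADE_READY",
    "node-b": "DOWNLOADING",
    "node-c": "UPGRADE_READY",
}
pending = nodes_not_ready(mesh)
```

Returning the list of pending nodes, rather than just a boolean, lets the interface show the user which routers are still downloading or have failed.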