Open ajcraig opened 8 hours ago
@stormc reply This is about device update, i.e., updating a device's "firmware", right? If so, we as margo are now in the "app domain" and we shouldn't do device management -- but we should interface with it. That is, from the "margo" domain we should be able to trigger certain device management functions, e.g., a firmware update. This requires a bi-directional communication channel between the app/margo domain and the device's management implementation. For firmware update this means, for example:
(1) the margo domain can query the device management domain about the last status of firmware notification polling (to show this in dashboards, ...)
(2) the margo domain can trigger a check for new firmware
(3) the margo domain can -- if new firmware is to be installed -- defer the installation (to a maintenance window)
...
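A minimal sketch of the three interactions listed above, written as an interface the margo domain could call. Every name here (`DeviceManagementBridge`, `FirmwarePollStatus`, `FakeBridge`, the version string) is hypothetical and not taken from any Margo specification; the fake implementation only stands in for a vendor's device management agent.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Protocol


@dataclass
class FirmwarePollStatus:
    last_checked: Optional[datetime]  # None until the device has polled once
    update_available: bool
    pending_version: Optional[str]


class DeviceManagementBridge(Protocol):
    """Hypothetical interface; the vendor's agent would implement it."""

    def last_poll_status(self) -> FirmwarePollStatus: ...      # (1) query
    def check_for_firmware(self) -> FirmwarePollStatus: ...    # (2) trigger
    def defer_installation(self, until: datetime) -> bool: ... # (3) defer


class FakeBridge:
    """In-memory stand-in for a vendor's device management agent."""

    def __init__(self) -> None:
        self._status = FirmwarePollStatus(None, False, None)
        self.deferred_until: Optional[datetime] = None

    def last_poll_status(self) -> FirmwarePollStatus:
        return self._status

    def check_for_firmware(self) -> FirmwarePollStatus:
        # A real agent would contact the vendor's repository here.
        self._status = FirmwarePollStatus(datetime(2024, 1, 1, 4, 0), True, "2.0.0")
        return self._status

    def defer_installation(self, until: datetime) -> bool:
        if not self._status.update_available:
            return False  # nothing pending, so nothing to defer
        self.deferred_until = until
        return True
```

The point of the sketch is only that the margo side sees three calls, while everything behind them stays in the vendor's domain.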
@ajcraig reply Correct, I see this as any configuration change to a device including firmware updates, network configurations, container runtime changes, BIOS updates, etc. The Device orchestration service would enable the user to kick off the update, by defining the new desired state, and then rely on private implementations to complete the deployment. Private implementations include:
(1) a Device Owner/Manufacturer notification service to inform the 3rd party orchestrator that there is an update available for a device the End-User owns
(2) a Device Owner/Manufacturer firmware/file repository that the device can pull from
(3) a Device Owner/Manufacturer "deployment service" residing on the device to apply the update
I also envision a trust establishment process that is required between the End User, 3rd party device orchestrator, and Device Owner/Manufacturer services. Below is a crude drawing depicting what I described above.
I think where we differ in opinion is whether the Device Orchestration "domain" should be in scope for Margo. I was under the impression we would have both Workload and Device orchestration services that could become Margo compliant.
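To make the "define the new desired state" idea in the reply above a little more concrete, here is a small sketch. It assumes device state can be described as a flat mapping of component names to versions, which is a simplification; the keys and the function name `pending_changes` are invented for illustration.

```python
from typing import Dict


def pending_changes(desired: Dict[str, str], reported: Dict[str, str]) -> Dict[str, str]:
    """Return the desired entries that the device does not yet report."""
    return {key: value for key, value in desired.items() if reported.get(key) != value}


# Hypothetical states: the user asks for new firmware, the runtime is already current.
desired_state = {"firmware": "2.0.0", "container_runtime": "1.7"}
reported_state = {"firmware": "1.9.3", "container_runtime": "1.7"}
changes = pending_changes(desired_state, reported_state)
```

In this picture the orchestrator would only hand `changes` to the private implementation, which then decides how to apply them.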
@stormc reply I think we're quite on the same page. I'd like to see an interface specified by margo that is implemented such that it calls out to existing (probably proprietary) device management functionality on the device -- this is not to be coded by margo, it's there, we "just" need to "bridge" to it. So I'd rather not see margo implementing, e.g., the application of a firmware update, but calling out to an existing firmware update agent on the device. This interface (in your terms: device orchestration service / DOS) is part of the Margo specification and defines all device management functionality we want to trigger/consume. The implementation(s) of this interface vary and call out to the Rockwell or Siemens or ... implementation.
Just to illustrate this a bit, this is the "margo domain" with a User initiating an "Update FW" action:
User: "Update FW" -> WOS -> DOS |IPC:send|
This is the "device management domain" that receives this call and takes the corresponding actions, with different implementations that all react to the action called above:
|IPC:receive| SIEMENS Implementation -> notify
|IPC:receive| Rockwell Implementation -> notify
|IPC:receive| ENOTIMPLEMENTED -> notify
That said, it's not just IPC but also notification.
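The send/receive/notify flow illustrated above could be sketched roughly as follows. This is only an illustration under assumptions: the class `DeviceOrchestrationService`, the `register`/`send` methods and the lambda handlers are all made up here, the handlers are placeholders rather than real Siemens or Rockwell APIs, and a plain function call stands in for the IPC channel.

```python
from typing import Callable, Dict, List, Tuple

ENOTIMPLEMENTED = "ENOTIMPLEMENTED"
Handler = Callable[[str], str]  # takes a device id, returns a result string


class DeviceOrchestrationService:
    """Hypothetical DOS: routes an action to a vendor implementation, then notifies."""

    def __init__(self, notify: Callable[[str], None]) -> None:
        self._handlers: Dict[Tuple[str, str], Handler] = {}
        self._notify = notify

    def register(self, vendor: str, action: str, handler: Handler) -> None:
        self._handlers[(vendor, action)] = handler

    def send(self, vendor: str, action: str, device_id: str) -> str:
        handler = self._handlers.get((vendor, action))
        # A vendor without an implementation still produces a notification.
        result = handler(device_id) if handler else ENOTIMPLEMENTED
        self._notify(f"{vendor}:{action}:{device_id} -> {result}")
        return result


notifications: List[str] = []
dos = DeviceOrchestrationService(notifications.append)
# Placeholder handlers; a real one would call the vendor's update agent.
dos.register("siemens", "update_fw", lambda device_id: "accepted")
dos.register("rockwell", "update_fw", lambda device_id: "accepted")
```

Calling `dos.send("siemens", "update_fw", "dev-1")` reaches the registered handler, while an unregistered vendor yields `ENOTIMPLEMENTED`; both paths end in a notification, which matches the remark that it is not just IPC but also notification.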
@tomcounihan reply It's an interesting system arch conversation here. My 2 cents: IMHO, WOS should not have any API that knows about or interacts with FW lifecycle management. As a microservice, it should only deal with App Lifecycle. I do think there is a 'missing' microservice - the orchestrator of orchestrators - which, using the FW example, would coordinate what needs to happen. So in this instance, it might need to quiesce apps on targeted devices (perhaps migrate them if that makes sense). Once it is happy that the apps are taken care of, it then goes on to the infrastructure manager (I think Device Manager is the same thing), which does its magic on the FW. That may be in-band or out-of-band, but it likely requires a reboot. After the reboot it re-establishes the apps (which might also need to be aware that the App Framework might have moved on a version, k8s 1.27 to 1.28 etc.). I guess I see it in layers, Orch^n (where n = number of layers):
App Orch (WOS)
App Framework Orch (think upgrading K8s/Docker)
Infrastructure Orch (think distro upgrade, but also FW)
Security Orch (keys, certs, etc.) - albeit this may be a pillar sitting beside the above.
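The sequence described above (quiesce apps, update firmware, re-establish apps) might be coordinated along these lines. This is a sketch under assumptions: the two "orchestrator" classes only record what they were asked to do, all names are hypothetical, and the reboot and any app migration are left out.

```python
from typing import List


class RecordingAppOrchestrator:
    """Stand-in for the App Orch (WOS) layer; only logs calls."""

    def __init__(self, log: List[str], can_quiesce: bool = True) -> None:
        self._log = log
        self._can_quiesce = can_quiesce

    def quiesce(self, device_id: str) -> bool:
        self._log.append(f"quiesce {device_id}")
        return self._can_quiesce

    def restore(self, device_id: str) -> None:
        self._log.append(f"restore {device_id}")


class RecordingInfraOrchestrator:
    """Stand-in for the Infrastructure Orch / Device Manager layer."""

    def __init__(self, log: List[str]) -> None:
        self._log = log

    def update_firmware(self, device_id: str) -> bool:
        self._log.append(f"update_firmware {device_id}")
        return True


def coordinate_firmware_update(device_id, apps, infra) -> bool:
    """Hypothetical 'orchestrator of orchestrators' for a single device."""
    if not apps.quiesce(device_id):
        return False  # apps are not ready, so the firmware is left alone
    try:
        return infra.update_firmware(device_id)
    finally:
        # Apps are re-established whether or not the update succeeded.
        apps.restore(device_id)
```

Note that in this arrangement the app orchestrator never touches firmware; only the coordinator knows about both layers.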
@stormc reply Sure, there is some intertwining between device management (system's domain) and application management (margo's domain). As you pointed out, margo probably should be able to defer a pending firmware update (think: maintenance window). For this, we need to have a mutual information exchange and some kind of (prohibiting) control exercised over the system domain, i.e., wait until the app is done manufacturing this workpiece. From a customer's perspective the device is functioning if apps are running, so apps' wishes have to be respected by the device management, at least to some degree.
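One way to picture the "prohibiting control" mentioned above is a check that device management would make before installing. This is a sketch under assumptions: `MaintenanceWindow`, `may_install_now`, and the idea that apps report themselves as busy by name are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Tuple


@dataclass(frozen=True)
class MaintenanceWindow:
    start: datetime
    end: datetime

    def contains(self, moment: datetime) -> bool:
        return self.start <= moment < self.end


def may_install_now(
    now: datetime, window: MaintenanceWindow, busy_apps: Iterable[str]
) -> Tuple[bool, str]:
    """Return (allowed, reason); busy apps take precedence over the window."""
    busy = sorted(busy_apps)
    if busy:
        return False, "deferred: busy apps: " + ", ".join(busy)
    if not window.contains(now):
        return False, "deferred: outside maintenance window"
    return True, "ok"
```

Here an app that is still manufacturing a workpiece blocks the installation even inside the window, which is one possible reading of "apps' wishes have to be respected, at least to some degree".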
@pauldbrooks reply Avoiding technical input, but I want to draw a distinction between in-scope and mandatory. My understanding is that device orchestration should be in scope for Margo to the extent that a device vendor and a tool vendor know what they need to do to work together. But device and application orchestration are different domains - as a tool vendor I may not see device management as my concern, and as a device vendor I may wish to keep device management proprietary while fully embracing open workload orchestration. This separation of domains should not get in the way of the potential to solve them both the same way; it feels (to my non-technical senses) like the workflows are the same and the differences are in the vendor-specific space rather than in the open interfaces.
These discussions were previously underway in the Discord channel but are being moved here for further discussion.
This post is being created to finalize the scope/responsibility of this proposed focus group. Once the scope of the focus group is established, we will look to assign a lead to manage the focus group meetings and lead the content creation within the specification. Note: I plan to come back and update this main post following discussions and feedback provided below.
Proposed Focus Group: Device Update Mechanism
Proposed Scope: