Device Agent updates at scale

Steve-Mcl commented 5 months ago

Epic

No response

Description

As a FlowFuse user: I want a means of updating the device agent at scale.

So that we get the latest features and fixes.

First raised on NR forum: https://discourse.nodered.org/t/how-to-update-device-agent-version-in-scale/87260

Acceptance Criteria

[ ] TBD

Customer Requests

ZJvandeWeg commented 5 months ago

What if the user is running in a Docker container? I guess we run npm install in the container?

I think doing an npm install of a newer version followed by the exec(1) syscall might be enough on the device agent side. The problem is that the agent doesn't just continue the current workload; it will reinstall everything, so there's downtime. In a first iteration this might be acceptable?
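A minimal sketch of the install-then-exec idea, written in Python purely for illustration (the Device Agent itself is a Node.js package). The package name `@flowfuse/device-agent`, the `flowfuse-device-agent` command, and both helper names are assumptions made for this example, not a confirmed design. Nothing is installed or executed unless `self_update` is called explicitly.

```python
import os
import subprocess
from typing import List


def build_install_command(version: str, package: str = "@flowfuse/device-agent") -> List[str]:
    """Return the npm command that installs the requested version globally.

    The default package name is an assumption for this sketch.
    """
    if not version or any(ch.isspace() for ch in version):
        raise ValueError("version must be a non-empty string without whitespace")
    return ["npm", "install", "-g", f"{package}@{version}"]


def self_update(version: str, agent_command: str = "flowfuse-device-agent") -> None:
    """Install the new version, then replace this process with the new agent.

    os.execvp does not return on success, so whatever this process was doing
    stops here. That is the downtime mentioned in the comment above.
    """
    subprocess.run(build_install_command(version), check=True)
    os.execvp(agent_command, [agent_command])
```

Keeping the command construction separate from the side effects means the command can be inspected or logged before anything runs on the device.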

hardillb commented 5 months ago

The problem with running npm install in the container is that the updates would be lost when the container is restarted.

The only way it would persist would be if the device-agent was on an external volume, which kind of defeats the point of having the device agent in a container.

robmarcer commented 5 months ago

Requested by - https://app-eu1.hubspot.com/contacts/26586079/record/0-1/11047851

Anmirazik commented 5 months ago

I'm not sure if my suggestions are good enough, but after some googling I found that I can update the device agent at scale using Ansible. I have no experience with Ansible yet, but it is the best solution I can think of for now if the FlowFuse team has no plans to develop this specific feature. In addition, I believe it would be helpful if the FlowFuse team could provide a working Ansible playbook and a tutorial for updating the device agent, whether it is running in Docker or natively. I believe this might help FlowFuse grow as well. What are your thoughts? Thanks

joepavitt commented 4 months ago

Fundamentally, this boils down to the device-agent needing to be able to self-update. The key challenge here is that we aren't going to be aware of how the agent is installed. As Ben mentioned above, there are complexities with Docker, for example.

We would need to break this down into common ways that the Device Agent is run, and work out if there is a consistent approach for each, and then establish where there are blockers.
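One way to picture that breakdown is a lookup from install method to update approach. The sketch below is hypothetical: the `InstallMethod` names and the strategy descriptions are assumptions drawn from the cases raised in this thread (a native npm install and Docker), not an agreed design.

```python
from enum import Enum


class InstallMethod(Enum):
    """Ways the Device Agent might be run (illustrative, not exhaustive)."""
    NATIVE_NPM = "native-npm"
    DOCKER = "docker"
    UNKNOWN = "unknown"


# Hypothetical strategies, one per install method.
STRATEGIES = {
    InstallMethod.NATIVE_NPM: "npm install the newer version, then restart the agent",
    InstallMethod.DOCKER: "recreate the container from a newer image; in-container installs are lost on restart",
    InstallMethod.UNKNOWN: "blocker: no consistent approach identified yet",
}


def update_strategy(method: InstallMethod) -> str:
    """Return the update approach for an install method."""
    return STRATEGIES[method]
```

Any install method that ends up mapped to a "blocker" entry is where the design work would need to focus.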

joepavitt commented 4 months ago

Putting it into "Next" as this will require Engineering Design resource

Steve-Mcl commented 3 weeks ago

> We would need to break this down into common ways that the Device Agent is run, and work out if there is a consistent approach for each, and then establish where there are blockers.

I have a pretty hot take on this....

If the device agent (or a sibling version of it) were in fact a Node-RED plugin (like nr-assistant) that could be installed on any device that is already running Node-RED, we would open the doors to a much larger audience.

For example, there are some (very popular) industrial "off the shelf" hardware devices that run Node-RED that will likely never install our device agent due to current installation requirements. However, if they could get 80% of the functionality (deploy flows, visibility/inclusion on a centralised platform (FF), auto snapshots, nr-assistant, nr-project-nodes, pipelines, etc.) this could be a real win-win for all.

Updating is no longer an npm install task but a palette node update. One day we may even incorporate a means of instructing the plugin to perform an update (aka "at scale")!

robmarcer commented 3 weeks ago

@Steve-Mcl's idea would be very useful because I come across lots of locked-down PLCs which run Node-RED as part of their default image, but trying to get the device agent on there is very hard. This is often because the PLC manufacturer is concerned about the risk to their customers' security of adding the device agent; they feel it's their responsibility to vet our code. This approach totally sidesteps the need to talk to the PLC manufacturer and opens up loads of places we can run and manage Node-RED.

knolleary commented 3 weeks ago

We did look at some sort of NR plugin originally. I think we would need to be very clear on what it could/couldn't achieve.

Running as a plugin means it cannot modify the Node-RED settings file. This means:

no integrated security (we can't modify adminAuth or add the required auth middlewares)
no remote editing of flows
no way to manage/update Node-RED itself

That makes for quite a different experience and capability set than the Device Agent we have currently.

I do agree that having a standard plugin does open up some potential - we already have nr-tools-plugin, which hasn't really had any attention since it was originally published.

ZJvandeWeg commented 4 days ago

Customer requested this today as they're scaling to 200 edge devices. If we have a technical solution here that would be grand.
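For a fleet of that size, whatever update mechanism is chosen would probably be rolled out in batches rather than to every device at once. A minimal sketch of the batching step, assuming hypothetical device names and an arbitrary batch size of 25:

```python
from typing import Iterator, List, Sequence


def rollout_batches(devices: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of devices so an update can be applied in stages."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(devices), batch_size):
        yield list(devices[start:start + batch_size])


# Hypothetical fleet of 200 edge devices, updated 25 at a time.
fleet = [f"device-{n:03d}" for n in range(1, 201)]
batches = list(rollout_batches(fleet, 25))
print(len(batches))  # 8
```

Staging the rollout this way limits how many devices are down at once and leaves room to stop if an early batch fails.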
