gateway software lifecycle management using Balena hub and open fleet

kgiori commented 3 years ago

During the 29 April 2021 community WebThings meeting, we had the opportunity to learn about Balena hub and "open fleet".

Balena hub -- offers user-friendly "deploy with balena" (DWB) buttons as a means to install applications (in a Balena container environment, which seems analogous to a lightweight Docker). Marc (from Balena) participated in the meeting. He also created an example DWB to install the WebThings Gateway, and it worked (with some network/localization caveats that might still need tweaking). You can even enter your local Wi-Fi LAN credentials into the DWB build config web page, which allows the RPi to connect directly to your network when it first boots (no need for the AP-to-client switchover of the traditional RPi install).

Balena open fleet -- not yet released as a service, but the goal is to enable open-source non-profit projects to be managed under one cloud umbrella, for free. In the case of the WebThings Gateway project, the idea we discussed was primarily to use open fleet for software lifecycle management -- keeping the deployed gateway software up to date.

So what are the requirements for privacy, security, and long-term availability that we should request from Balena to optimize the open fleet service for our needs? Some ideas:

- ability of WebThings Gateway device to be configured for when and how often to connect/disconnect with its open fleet management umbrella. (e.g., maybe it connects every 24 hours just to check for the availability of a sw upgrade, or maybe it remains connected all the time, or ...)
- ability for the open fleet WebThings instance to not allow ssh access to all the connected gateways, or alternatively allow the gateways to block ssh access by default, and only enable it proactively (e.g., if there was a support service the user wanted that needed it temporarily)
- open fleet administrator rotations. to keep the community administration oversight more fair, have the community periodically vote in new open fleet WebThings project administrators (5?), so that transparency and fairness to the community are built into the sw lifecycle management process. (much like a non-profit board is made up of members who volunteer or accept a nomination, and then the community votes in their choice of candidates). "period" tbd -- maybe 2-yr terms, with overlapping (for consistency) rotation blocks?
- dashboard ideas? and perhaps a web services API to pull some stats from the dashboard to the public-facing website. it would be nice to see the number of managed nodes, the number of DWB downloads, the growth of those numbers over time... all publicly visible from the webthings.io site, or a public facing open fleet site that the project site links to.
- protect user privacy. collect minimal PII, implement privacy by design (something @flatsiedatsie is good at).

Add to this list! (ideally in brainstorm fashion, adding more requests or enhancements before debating what others come up with)

mpous commented 3 years ago

Thank you @kgiori for setting up this issue. Let me answer point by point and create an open discussion.

> ability of WebThings Gateway device to be configured for when and how often to connect/disconnect with its open fleet management umbrella.

Sure! This should be possible to manage using the fleet configuration variables, e.g. RESIN_SUPERVISOR_POLL_INTERVAL (you can read more here).
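As a rough illustration of the "connect every 24 hours" idea, the sketch below builds the name/value pair for that variable. It assumes the value is expressed in milliseconds (worth checking against the balena documentation); the helper name `poll_interval_config` is hypothetical, and how the variable is actually set on the fleet is left out.

```python
# Sketch only: builds the name/value pair for a fleet configuration variable.
# Assumption: the supervisor poll interval is expressed in milliseconds.

POLL_VARIABLE = "RESIN_SUPERVISOR_POLL_INTERVAL"  # name taken from the comment above


def poll_interval_config(hours: float) -> dict:
    """Return the config variable for checking in every `hours` hours."""
    if hours <= 0:
        raise ValueError("poll interval must be positive")
    milliseconds = int(hours * 60 * 60 * 1000)
    # The value is returned as a string, since config variables are text.
    return {POLL_VARIABLE: str(milliseconds)}


if __name__ == "__main__":
    # One check per day, as suggested in the original request.
    print(poll_interval_config(24))
```

The platform may enforce a minimum interval, so very small values might be ignored or rejected.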

> ability for the open fleet WebThings instance to not allow ssh access to all the connected gateways, or alternatively allow the gateways to block ssh access by default, and only enable it proactively

At balena, we are focusing on trust and security, but we still don't understand the core problem that is being solved by not having SSH access. What we don't want to do is give the illusion that not having SSH access to the fleet is somehow more secure; the fleet owner still has ultimate control over what software is running on the device so having SSH access (or not) makes little difference. Feel free to share more of your thinking here.

> dashboard ideas? and perhaps a web services API to pull some stats from the dashboard to the public-facing website. it would be nice to see the number of managed nodes, the number of DWB downloads, the growth of those numbers over time... all publicly visible from the webthings.io site, or a public facing open fleet site that the project site links to.

Some of these KPIs will be displayed on the open fleet dashboards generated for the apps on balenaHub. On Thursday 13th of May at 12pm UTC we will stream a Release Party where we are going to implement this (or part of it).

> protect user privacy. collect minimal PII, implement privacy by design

It would be interesting to understand more of your requirements here. From balena's point of view, we want to be very transparent by default. Anything collected from the device would be what's presented in the dashboard. Having said that, it would be really interesting to better understand your point of view.

flatsiedatsie commented 3 years ago

> the fleet owner still has ultimate control over what software is running on the device so having SSH access (or not) makes little difference

That doesn't sound good. I would personally like the end-user to have the last say. It's hard to promote a piece of software as ethical and privacy-respecting if it can be modified by a cloud-based actor at any point. Not having SSH enabled is part of this larger desire to have the end-user retain ownership and control of their data.

mpous commented 3 years ago

We do agree, @flatsiedatsie, but whether SSH is enabled or not is not relevant when the developer of the application can install new services on the gateway.

Let's imagine that the developer can't access SSH but she/he is able to install a container that grants SSH access, installs a bitcoin miner, or does something worse. Then enabling access to SSH is "not relevant", right?

flatsiedatsie commented 3 years ago

Yes, I did understand your point, sorry if I wasn't clear.

I was trying to make another one. Any kind of cloud-based control (remote ssh enabling, update push, etc.) might scare end users and subvert trust/harm the narrative around the gateway's USPs - specifically that everything runs locally and all data is stored locally. Ideally, in my opinion, the end-user would have final say. Updates would ideally be 'pull'.
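To make the 'pull' idea concrete, here is a minimal sketch of the decision a gateway could make locally: it learns which version is available, and installs only if that version is newer and the end-user has approved it. The function names and the dotted-integer version format are assumptions for illustration, not how the gateway or balena actually handle updates.

```python
# Sketch of a pull-style update decision made on the device itself.
# Assumption: versions are dotted integers such as "1.0.0".


def parse_version(text: str) -> tuple:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def should_install(current: str, available: str, user_approved: bool) -> bool:
    """Install only when the offered version is newer AND the user agreed."""
    is_newer = parse_version(available) > parse_version(current)
    return is_newer and user_approved


if __name__ == "__main__":
    # A newer version exists, but nothing happens until the user approves.
    print(should_install("1.0.0", "1.1.0", user_approved=False))
    print(should_install("1.0.0", "1.1.0", user_approved=True))
```

The point of the design is that the cloud side only publishes what is available; the device and its owner decide whether and when to fetch it.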

> It would be interesting to understand more of your requirements here. From balena's point of view, we want to be very transparent by default. Anything collected from the device would be what's presented in the dashboard. Having said that, it would be really interesting to better understand your point of view.

I'd be happy to dive deeper into this. As privacy designers go, I'm pretty strict in these matters. The things I design usually don't collect any personal data (which includes IP addresses). Telemetry would always be explicit opt-in. In my work I generally don't rely on third-party services and code, instead running things locally or on the first-party server as much as possible.

If end users never see the balena online UI, then they wouldn't know what data is being collected.
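A small sketch of what explicit opt-in telemetry with data minimisation could look like: nothing is produced unless the user opted in, and only fields on an allow-list survive. The field names and the `build_telemetry` helper are hypothetical and do not describe anything the gateway or balena currently implement.

```python
from typing import Optional

# Hypothetical field names; anything not listed here is never sent.
ALLOWED_FIELDS = {"gateway_version", "addon_count"}


def build_telemetry(opted_in: bool, stats: dict) -> Optional[dict]:
    """Return the payload to send, or None when the user has not opted in."""
    if not opted_in:
        return None
    # Allow-list rather than block-list: unknown fields (e.g. an IP address)
    # are dropped by default instead of leaking through.
    return {key: value for key, value in stats.items() if key in ALLOWED_FIELDS}


if __name__ == "__main__":
    stats = {"gateway_version": "1.0.0", "addon_count": 4, "ip_address": "192.0.2.1"}
    print(build_telemetry(False, stats))  # no opt-in, so nothing to send
    print(build_telemetry(True, stats))   # ip_address is filtered out
```

Note that a server still sees the sender's IP address at the network level, so a payload filter like this covers only part of the problem.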

WebThingsIO / gateway

gateway software lifecycle management using Balena hub and open fleet #2822