WebThings Gateway
http://webthings.io/gateway
Mozilla Public License 2.0

Epic: Production quality base OS #2801

benfrancis opened this issue 3 years ago

benfrancis commented 3 years ago

Currently the default WebThings Gateway software image is based on Raspbian/Raspberry Pi OS.

This has served us very well up until now, but it has some limitations:

  1. OTA updates - Whilst the gateway application receives automatic over-the-air updates, the underlying Raspberry Pi OS does not. This means that unless users manually update the OS on their gateway via the command line, they stay on whatever version of the OS was current when they first flashed the software image. This becomes a problem when new versions of the gateway application start to rely on features of the latest version of the OS. Upgrades to old versions may not be possible.
  2. Security - No updates means no security patches to the underlying OS, which may expose security vulnerabilities.
  3. Maintenance - Maintaining a fork of Raspberry Pi OS as we essentially do today is a lot of work. Without Mozilla's resources going forward we may not be able to continue to do this. It would be better if we could just maintain the gateway application and let someone else maintain the OS.
  4. Software footprint - Raspberry Pi OS includes a lot of software packages which we don't actually use and has quite high resource requirements for an IoT gateway.
  5. Hardware compatibility - Raspberry Pi OS is designed specifically for the Raspberry Pi single board computer and cannot run on production consumer or enterprise hardware.

I'd therefore like to explore an alternative base OS for WebThings Gateway which:

  1. Supports automatic OTA updates
  2. Has better security through automatic software updates and containerisation
  3. Is ideally maintained by someone else, so we can focus on the gateway application
  4. Has a smaller footprint, reducing minimum system requirements
  5. Supports a wider range of hardware for consumer and enterprise use cases

It's important that we continue to support Raspberry Pi as a hardware target for existing and new hobbyist/educational users, who form the backbone of our community. It's also likely we may need to support the Raspberry Pi OS based image in some form for an extended period of time, as upgrading to a new base OS will probably require a manual re-flash.

Some potential candidates I am aware of are:

  1. Balena OS - runs docker containers¹
  2. Ubuntu Core - runs snaps²
  3. Fedora IoT - runs OCI images

If we do choose a containerised OS like one of the above, we may also need to re-consider the architecture of the add-ons system. It's likely that some existing add-ons would break if the gateway application was containerised, so we may want to consider making each add-on its own container for example.

Ideas welcome.


Footnotes:

  1. We have an existing docker image of the gateway, which makes Balena OS an attractive target, but unfortunately the pricing model of Balena Cloud is not a good fit for our application.
  2. We have a work-in-progress snap package of the gateway which could run on Ubuntu Core, but some add-ons may break as noted above.
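For context, the existing docker image from footnote 1 can be run on any docker host with a minimal compose file along these lines (a sketch only; the `webthingsio/gateway` image name and the `/home/node/.webthings` data path are my assumptions based on the project's Docker Hub listing, so check them before use):

```yaml
# docker-compose.yml (sketch, untested assumptions noted inline)
version: "2"
services:
  gateway:
    image: webthingsio/gateway:latest   # assumed image name on Docker Hub
    container_name: webthings-gateway
    network_mode: host                  # host networking for mDNS and adapter discovery
    volumes:
      - ./gateway-data:/home/node/.webthings   # assumed data path inside the container
    restart: unless-stopped
```

Host networking is the significant choice here: the gateway's discovery features (mDNS, some adapters) generally don't work behind docker's default bridge network.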
jadonk commented 3 years ago

Balena does have a free tier for evaluation and there is OpenBalena if you want to run without their cloud service (100% open source). I really like the idea of doing a generic armhf/armv7hf Dockerfile-based setup that will run across Pi and BeagleBone on Balena. I'm happy to help some on this.

benfrancis commented 3 years ago

@jadonk Yes I have downloaded and installed balenaOS and agree openbalena looks interesting.

balenaCloud is cool, but the per-device pricing model is designed for "fleet owners" who own all their own devices and want to be able to SSH into them etc., which isn't the case with the WebThings Gateway, where the hardware is owned by individual home users and we just want to push software updates to them. The project wouldn't be able to pay the per-device fee for managing other people's hardware, and shouldn't have that level of access to end users' smart home hubs. So we would need to run our own instance of openbalena and ensure that it allows us to push software updates, but without having SSH access to individual gateways.

This is in contrast with Ubuntu Core where there is no per-device fee and automatic software updates for the core of the OS are provided by Canonical, for up to ten years. We could create a custom OS image based on the vanilla core snap, plus our gateway snap, and push updates to end users of our application via the public Snap Store for free. SSH access is disabled on individual devices by default and has to be enabled on-device by registering it with an Ubuntu SSO account. The flip-side of that is that we'd be dependent on Canonical's centralised snap store.
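For reference, the gateway snap mentioned above would be defined by a `snapcraft.yaml` roughly like the following (purely illustrative; the launcher command, plugin, and interface plugs are my assumptions, not the actual work-in-progress package):

```yaml
# snapcraft.yaml (illustrative sketch, not the real WIP package)
name: webthings-gateway
base: core18
version: git
summary: WebThings Gateway
description: A self-hosted Web of Things gateway.
grade: stable
confinement: strict        # strict confinement is what may break some add-ons

apps:
  gateway:
    command: run-gateway   # assumed launcher script name
    daemon: simple
    plugs:
      - network
      - network-bind
      - bluez              # Bluetooth add-ons would need this interface
      - serial-port        # e.g. Zigbee/Z-Wave USB dongles

parts:
  gateway:
    plugin: nodejs
    source: https://github.com/WebThingsIO/gateway.git
```

Each add-on that reaches outside the sandbox (serial devices, Bluetooth, raw sockets) would need a corresponding interface plug declared and connected, which is exactly the confinement concern noted in footnote 2.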

I think next steps with Balena OS would be to test our existing docker image on the OS and evaluate openbalena to see what is missing as compared with the balenaCloud service, and whether or not it would be a good fit for our needs, those being:

  1. Push automatic OS updates
  2. Push automatic updates to the gateway application
  3. Not have SSH access to individual gateways

Using balenaOS with openbalena sounds more hands-on than what is offered by Ubuntu Core and I'd like to better understand what work would be involved in maintaining our own balenaOS-based images, managing an instance of openbalena and pushing out OS updates for devices ourselves.

If you are able to help with any of the above I would be most grateful!

madb1lly commented 3 years ago

Hi @benfrancis,

Good comparison of Ubuntu Core and BalenaOS, thanks for that.

Would being reliant on the Snap store be a big issue? We'd somehow be reliant on someone else's central repos anyway, unless we did all that ourselves too, which is what the objective of this issue is to get away from. Given Canonical's approach to LTS these days I think it very unlikely that we wouldn't get a long notice period if they decided to close the Snap store.

OpenBalena sounds like an unknown amount of effort, and therefore probably a lot!

I imagine we want to stick with Debian/Apt based distros, but I think that Fedora IoT (or Core, I'm not exactly sure which) provides a very similar basis to Ubuntu Core, with the additional benefit that we could distribute the Gateway ourselves as a Flatpak (I think... I've not investigated this fully).

Cheers 🙂

jadonk commented 3 years ago

My experience is that Canonical wants to charge for everything and is very cagey about what is free and what is paid. Things I haven't seen in the Ubuntu Core solution include handling of provisioning, redundant OS images for OTA updates and VPN plus public URL support.

The use case most interesting to me is one where I have a Balena managed device and want to quickly try out applications like WebThings.

I will have to try Ubuntu Core some more, but I found the process to create Snaps more burdensome than making a Dockerfile for Balena apps.

Anyway, I can look into reproducing elements of the cloud service so it can be done all open source.

benfrancis commented 3 years ago

@madb1lly wrote:

Would being reliant on the Snap store be a big issue? We'd somehow be reliant on someone else's central repos anyway, unless we did all that ourselves too, which is what the objective of this issue is to get away from.

Agreed, I think overall that's a positive thing.

Given Canonical's approach to LTS these days I think it very unlikely that we wouldn't get a long notice period if they decided to close the Snap store.

Agreed.

OpenBalena sounds like an unknown amount of effort

Yes, I think it's worth prototyping with both to better understand the work involved.

I imagine we want to stick with Debian/Apt based distros, but I think that Fedora IoT (or Core, I'm not exactly sure which) provides a very similar basis to Ubuntu Core, with the additional benefit that we could distribute the Gateway ourselves as a Flatpak (I think... I've not investigated this fully).

None of these options use apt, Ubuntu Core is built entirely on snaps. I haven't tried Fedora IoT but the website says it has "built-in Open Container Initiative (OCI) image support using podman or deploy containerized applications from popular public registries." Do they mean Docker Hub? I'm less familiar with Fedora IoT, but if it supports docker images (or similar) but with a business model more similar to Ubuntu Core then that could be worth exploring.

@jadonk wrote:

My experience is that Canonical wants to charge for everything and is very cagey about what is free and what is paid.

That's not my experience from working with Canonical on an Ubuntu Core based project over the last year. Distributing snaps via the public Snap Store is free, as are images for certified devices like the Raspberry Pi. The business model is that they charge for private snap stores (which we probably wouldn't need) and provide the SMART START consulting service to port the OS to new hardware. Pricing is here.

Things I haven't seen in the Ubuntu Core solution include handling of provisioning, redundant OS images for OTA updates and VPN plus public URL support.

I'm not exactly sure what you mean by provisioning in relation to WebThings Gateway. OTA updates are atomic/transactional/reversible with automatic rollback. The core snap and application snaps are updated separately. See here. We wouldn't need to pay for balenaCloud's VPN/public URL service because the WebThings Cloud tunnelling service already provides that. (Edit: Assuming openbalena can be untangled from their VPN system, I'm not sure whether that's feasible).

The process of creating snaps is definitely not fun, but I don't have experience with docker to compare it to. I am concerned that the containerisation of the gateway is likely to break a lot of add-ons due to tighter security restrictions, but that's probably true of any containerised OS. We may need to re-think the packaging system for add-ons with any of these options. I like the idea of docker images in principle because they seem more widely adopted than snaps. Some things seem more complex with docker (e.g. networking), but other things may be simpler. I'm not sure whether the rate limits on free Docker Hub accounts might be an issue? (Edit: Actually I think balenaCloud/openbalena may act as its own docker registry...).

Anyway, I can look into reproducing elements of the cloud service so it can be done all open source.

That would definitely be interesting to prototype, thank you.

madb1lly commented 3 years ago

None of these options use apt, Ubuntu Core is built entirely on snaps. I haven't tried Fedora IoT but the website says it has "built-in Open Container Initiative (OCI) image support using podman or deploy containerized applications from popular public registries." Do they mean Docker Hub? I'm less familiar with Fedora IoT, but if it supports docker images (or similar) but with a business model more similar to Ubuntu Core then that could be worth exploring.

Thanks for explaining - I'm not familiar at all with any of the options proposed or discussed; I may have got it wrong that Fedora IoT uses Flatpaks too. I don't know if they mean Docker Hub. If I get a chance I'll look into it more, but please don't depend on me.

tim-hellhake commented 3 years ago

I'm a bit confused. When I read this:

  3. Maintenance - Maintaining a fork of Raspberry Pi OS as we essentially do today is a lot of work. Without Mozilla's resources going forward we may not be able to continue to do this. It would be better if we could just maintain the gateway application and let someone else maintain the OS.

I thought you wanted to stop providing a custom image. But then you mentioned creating an Ubuntu Core based custom image here:

We could create a custom OS image based on the vanilla core snap, plus our gateway snap, and push updates to end users of our application via the public Snap Store for free.

So I assume you meant that the image creation process for Raspberry Pi OS is far more complicated than for Ubuntu Core.

I'm also very confused about the idea to use BalenaCloud/OpenBalena. If I understood it correctly, the main selling point of BalenaCloud/OpenBalena is that a single person can manage a large number of devices efficiently. In the case of the WebThings Gateway it's the other way around. There are a large number of users who are usually managing a single device. I don't think that most of the users want to manage a second device just to manage the first one.

I have a lot of thoughts on this but I think it makes sense to focus on the problems we want to solve with this. I'm going to assume that the users won't host their own OpenBalena instance and won't pay for balenaCloud.

  1. OTA updates - Whilst the gateway application receives automatic over-the-air updates, the underlying Raspberry Pi OS does not. This means that unless users manually update the OS on their gateway via the command line, they stay on whatever version of the OS was current when they first flashed the software image. This becomes a problem when new versions of the gateway application start to rely on features of the latest version of the OS. Upgrades to old versions may not be possible.

I checked the Fedora IoT update page and found no mention of automatic updates. You have to SSH into your machine and start the update on the CLI, which is exactly what you would do on Raspberry Pi OS.

If you check the BalenaOS manual they refer to balenaCloud. There is a CLI command, but it "Requires balenaCloud and will not work with openBalena or standalone balenaOS".

What would be the expectation here anyway? How would the user trigger the update of their device?

According to the docs Ubuntu Core updates 4 times a day and according to the snap overview even the kernel is a snap. This sounds like a real solution to the problem.

  2. Security - No updates means no security patches to the underlying OS, which may expose security vulnerabilities.

Solved by 1.

  3. Maintenance - Maintaining a fork of Raspberry Pi OS as we essentially do today is a lot of work. Without Mozilla's resources going forward we may not be able to continue to do this. It would be better if we could just maintain the gateway application and let someone else maintain the OS.

The build process for a custom BalenaOS image seems to be similar to the Raspberry Pi OS. You can build and modify your image from source but that's not really an improvement over the Raspberry Pi OS build process.

I found no instructions on how to build a custom image in the Fedora IoT docs. I assume it's the same here.

The custom image build process for Ubuntu Core looks very clean and simple. That would definitely be an improvement.

  4. Software footprint - Raspberry Pi OS includes a lot of software packages which we don't actually use and has quite high resource requirements for an IoT gateway.

I think this point only makes sense if the goal is to provide a custom image.

The footprint of Fedora IoT and BalenaOS is also very small, but the creation of a custom image doesn't look very appealing.

There are other distros which also provide docker support and are also very lightweight.

Ubuntu Core on the other hand looks like a good candidate as you have full control over the image content.

  5. Hardware compatibility - Raspberry Pi OS is designed specifically for the Raspberry Pi single board computer and cannot run on production consumer or enterprise hardware.

Same as 4. If you don't provide a custom image then there is no hardware limitation. You can install the .deb package on most distros and hardware.

According to the image list, Ubuntu Core supports the Raspberry Pi (32-bit + 64-bit) and x86 64-bit.

According to the docs, Fedora IoT supports the Raspberry Pi (32-bit + 64-bit), generic ARM64 boards, and x86 64-bit.

According to the docs, BalenaOS supports the Raspberry Pi, x86 32/64-bit, and a few other 32-bit ARM SBCs like the BeagleBone and ODROID.

Conclusion

From what I see Ubuntu Core brings the most benefits. It provides the tooling to build custom images in a simple and elegant way. The system keeps itself up-to-date with regular atomic updates. If the hardware limitation is ok I would go with that.

The actual benefits of BalenaOS and Fedora are that they have a really small footprint and provide an OCI runtime. But that's also the case for other distros like Arch Linux. And if it's the user's task to install the Gateway on the OS anyway, you might as well let the user choose which OS to install the .deb package or run the docker container on.

benfrancis commented 3 years ago

Thanks @tim-hellhake for this detailed analysis, it's very helpful.

I thought you wanted to stop providing a custom image. But then you mentioned creating an Ubuntu Core based custom image. So I assume you meant that the image creation process for Raspberry Pi OS is far more complicated than for Ubuntu Core.

Yes exactly, it's not that I want to stop providing an OS image as such, it's that I want to reduce the work needed to create and maintain one, and not have our users get stuck on an outdated OS which never gets updated. Ubuntu Core makes it easy to build custom images from just the kernel and gadget snaps + the core snap (OS) + our own application snap, and then provides automatic OTA updates for the underlying OS so we don't have to worry about it getting out of date.
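To illustrate that build step: an Ubuntu Core image is defined by a model assertion, which is then signed with `snap sign` and turned into a flashable image with `ubuntu-image`. A core18/Raspberry Pi sketch might look like this (the account IDs, model name, and snap names are placeholders, not a working model):

```json
{
  "type": "model",
  "series": "16",
  "authority-id": "<developer-account-id>",
  "brand-id": "<developer-account-id>",
  "model": "webthings-gateway-pi",
  "architecture": "armhf",
  "base": "core18",
  "kernel": "pi-kernel",
  "gadget": "pi",
  "required-snaps": ["webthings-gateway"],
  "timestamp": "2021-01-01T00:00:00+00:00"
}
```

After signing, `ubuntu-image snap model.signed` produces an image containing the kernel, gadget, core, and gateway snaps, each of which then refreshes automatically from the store.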

If I understood it correctly the main selling point of BalenaCloud/OpenBalena is that a single person can manage a large number of devices efficiently. In the case of the WebThings Gateway it's the other way around.

Yes exactly, this is what I mean about the business model being a bad fit. It would mean we need to maintain our own openbalena instance instead of using balenaCloud, which negates many of the advantages, and it may still be a bad fit.

I'm going to assume that the users won't host their own OpenBalena instance and won't pay for balenaCloud.

That's definitely not my intention. That would be a huge barrier to entry.

OTA Updates & Security

I checked the Fedora IoT update page and found no mention of automatic updates. You have to SSH into your machine and start the update on the CLI, which is exactly what you would do on Raspberry Pi OS. If you check the BalenaOS manual they refer to balenaCloud. There is a CLI command, but it "Requires balenaCloud and will not work with openBalena or standalone balenaOS". What would be the expectation here anyway? How would the user trigger the update of their device?

OK, if that's the case then it might rule out Fedora IoT.

My understanding is that in theory we could push updates to gateways from a WebThings instance of openbalena. It just doesn't support some of the features of balenaCloud like "the web-based dashboard and updates with binary container deltas". This is something I'd like to understand better though, because I haven't tried it yet.

Maintenance

The build process for a custom BalenaOS image seems to be similar to the Raspberry Pi OS. You can build and modify your image from source but that's not really an improvement over the Raspberry Pi OS build process.

I found no instructions on how to build a custom image in the FedoraIoT docs. I assume it's the same here.

The [custom image build process](https://ubuntu.com/core/smartstart/guide/custom-image-creation) for Ubuntu Core looks very clean and simple. That would definitely be an improvement.

OK, if that's the case then I agree Ubuntu Core wins here.

Software Footprint

I think this point only makes sense if the goal is to provide a custom image.

The footprint of Fedora IoT and BalenaOS is also very small, but the creation of a custom image doesn't look very appealing.

There are other distros which also provide docker support and are also very lightweight.

Ubuntu Core on the other hand looks like a good candidate as you have full control over the image content.

I would like to continue to provide a custom image if possible, and I agree that Ubuntu Core is a very strong candidate as the base for that. As I understand it, that mainly just requires creating and maintaining a snap package + a simple build step.

If we had to pick just one distribution format for WebThings Gateway going forward then I think there would be a lot of arguments for the docker image given it's the most versatile. A lightweight distro which bundles that docker image would be a good complement to that, but if the OS doesn't auto-update itself then that wouldn't be much better than the current Raspbian image.

Hardware compatibility

If you don't provide a custom image then there is no hardware limitation. You can install the .deb package on most distros and hardware.

If we want to continue to offer the various .deb and .rpm packages then we may need to find new maintainers for them. Currently I understand the packaging is all automated, but if one of those automations breaks then I don't know how to fix it.

It is definitely true that maintaining a .deb package hugely increases the potential hardware support. If we find we can't support the Raspberry Pi OS based image any more it might be less work to provide a .deb package which people can install on Raspberry Pi OS themselves so it's more clear that maintaining the OS is their own responsibility. The downside of that is that the initial installation has more steps.

According to the image list, Ubuntu Core supports the Raspberry Pi (32-bit + 64-bit) and x86 64-bit. According to the docs, Fedora IoT supports the Raspberry Pi (32-bit + 64-bit), generic ARM64 boards, and x86 64-bit. According to the docs, BalenaOS supports the Raspberry Pi, x86 32/64-bit, and a few other 32-bit ARM SBCs like the BeagleBone and ODROID.

Raspberry Pi is the minimum requirement here.

Ubuntu Core 20 is very new and doesn't have any certified hardware yet. Ubuntu Core 18 is certified for use on various Raspberry Pi and Intel NUC boards, but the amd64 image probably supports a lot of other hardware which isn't certified.

Balena OS has a much wider range of certified hardware, and its Yocto-based build system also makes it much easier to port to other hardware. (Canonical charge $30,000 to port Ubuntu Core to new hardware.)

Conclusion

From what I see Ubuntu Core brings the most benefits. It provides the tooling to build custom images in a simple and elegant way. The system keeps itself up-to-date with regular atomic updates. If the hardware limitation is ok I would go with that.

Having also spoken with someone who has used both Ubuntu Core and Balena OS in production, I am also leaning towards Ubuntu Core. I'm concerned that Balena may not be a great fit in practice due to its target market of "fleet owners". Their VPN system overlaps in functionality with our current tunnelling service but may not scale well with openbalena and could be very difficult to untangle from the OS if we didn't want to use it.

The list of certified hardware for Ubuntu Core is very limiting (beyond Raspberry Pi the certified Intel NUC boards are quite old now), but we could possibly live with that. The fact that there's a company you can pay to port to new hardware if necessary is both a positive and a negative, but overall I think I prefer that business model to paying a per-device fee for serving updates to devices!

The actual benefits of BalenaOS and Fedora are that they have a really small footprint and provide an OCI runtime. But that's also the case for other distros like Arch Linux. And if it's the user's task to install the Gateway on the OS anyway, you might as well let the user choose which OS to install the .deb package or run the docker container on.

My hope was that Balena OS could provide something similar to Ubuntu Core but with docker images instead of snaps (which are very centrally controlled by Canonical). The existence of openbalena could provide an escape hatch if we later wanted to run our own infrastructure or if Balena goes bust. But if we'd have to run our own instance of openbalena from the start, and even that might still be a bad fit, then that is less appealing.

Next Steps

In terms of next steps I think we need to:

  1. Get our snap package working, and evaluate how many add-ons break in confinement
  2. Test our docker image on Balena OS and try setting up an openbalena instance to better understand how it works

If we have to prioritise the formats in which we distribute WebThings Gateway my current thinking is:

  1. The docker image is the most universal format
  2. Ubuntu Core currently looks like the most promising base for maintainable OS images with support for Raspberry Pi + other hardware
  3. .deb packages help us reach more hardware and provide a self-installation option on Raspberry Pi OS and more
  4. .rpm packages help us reach more Linux distros

The current Raspberry Pi OS image is where a lot (most?) of our users currently are, but is going to be a real headache to maintain in the long term. We need to think about what to do with that.

madb1lly commented 3 years ago

It might just be me, but unfortunately I've never found Docker very intuitive. If that is the only format in which WebThings is available in future then of course I'll learn about it so I can continue to use WebThings, but it might reduce the number of new people who will use it.

Options 2 and 3 are my preference.

Cheers 🙂

benfrancis commented 3 years ago

@madb1lly wrote:

unfortunately I've never found Docker very intuitive

It's not very familiar to me either.

Options 2 and 3 are my preference.

To be clear, I'm not saying it's either/or, and I'm not saying that we'll have to get rid of any of the current distribution formats. Just that if we can't find people with the expertise to do a good job of maintaining them all then we may have to prioritise and pick a subset, and that's the order of prioritisation I currently have in mind. If someone wants to volunteer to maintain the Raspberry Pi OS image and upgrade it to the next long term support release then it could continue alongside the "production quality" option.

Debian + Docker

Another option I've seen used in a couple of places (Screenly OSE and NextBox) is Debian (or Raspberry Pi OS) running on the Raspberry Pi, acting as a host OS for a docker container. As @tim-hellhake pointed out, hosting docker containers is not unique to Balena OS.
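As a sketch of that Debian + Docker approach, the host could start the container as a systemd service so it survives reboots and crashes (the unit name, paths, and image name here are assumptions, not an existing setup):

```ini
# /etc/systemd/system/webthings-gateway.service (sketch)
[Unit]
Description=WebThings Gateway in Docker
After=docker.service network-online.target
Requires=docker.service

[Service]
# Remove any stale container, then run attached so systemd supervises it
ExecStartPre=-/usr/bin/docker rm -f webthings-gateway
ExecStart=/usr/bin/docker run --name webthings-gateway --network host \
  -v /var/lib/webthings:/home/node/.webthings webthingsio/gateway:latest
ExecStop=/usr/bin/docker stop webthings-gateway
Restart=always

[Install]
WantedBy=multi-user.target
```

The gateway application could then update itself by pulling a newer image tag, while the Debian host stays on the distro's normal update channel.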

kgiori commented 3 years ago

I asked R Nelson, who maintains BeagleBone distros, what he recommends after perusing this thread, and he commented: "i really like the idea of snap package for webthings on our am335x devices. docker is nice if you have cpu/ram and free space. snapd is in pretty good shape on debian. https://tracker.debian.org/pkg/snapd". E.g., in Debian, docker.io is around 150 MB vs snapd's 50 MB installed size. Has anyone run the WebThings Gateway on a BeagleBone to get a sense of the responsiveness of loading web pages (especially the data logging page)? I'm curious how it would perform on a PocketBeagle. (Or if that platform would be more suited to making a web thing rather than being a gateway.)

jadonk commented 3 years ago

@benfrancis wrote:

@jadonk wrote:

My experience is that Canonical wants to charge for everything and is very cagey about what is free and what is paid.

That's not my experience from working with Canonical on an Ubuntu Core based project over the last year. Distributing snaps via the public Snap Store is free, as are images for certified devices like the Raspberry Pi. The business model is that they charge for private snap stores (which we probably wouldn't need) and provide the SMART START consulting service to port the OS to new hardware. Pricing is here.

I think using snaps can be fine and that is reasonably open. What I've seen issues with is distributing Ubuntu Core on a device. If we could have figured out snaps at the time, we might have shipped BeagleBone Blue with that as the default OS (for free). When we revisited this on BeagleBone AI, they told us it was going to cost bookoos of big bucks. No private store. We want everything open.

Anyway, as long as the parts are open and we can implement our own servers, I'm not so worried about using their infrastructure as a test ground (that might help them get customers).

I'll need to look at what they've done around provisioning and OS updates. Balena has done a nice job there.

benfrancis commented 3 years ago

@kgiori wrote:

"snapd is in pretty good shape on debian."

I like the idea of using Debian as a host OS because it's quite close to what we use now, with a much wider range of supported hardware, but it seems to me that using snaps on Debian would miss out on the main benefits of Ubuntu Core (OTA updates) while tying us to Canonical's snap store. That's why I suggested docker on Debian rather than snaps.

@jadonk wrote:

What I've seen issues with is distributing Ubuntu Core on a device. If we could have figured out snaps at the time, we might have shipped BeagleBone Blue with that as the default OS (for free). When we revisited this on BeagleBone AI, they told us it was going to cost bookoos of big bucks.

It's only free if you can use one of their vanilla images for certified hardware, which for core18 pretty much just includes Raspberry Pi and Intel NUC. Porting to BeagleBone would require Canonical's SMART START service starting at $30,000, which I imagine is why it would have been so expensive.

The .deb package or docker image is probably a better way of targeting BeagleBone. The Debian + docker OS image approach would be a good fit there too.

as long as the parts are open and we can implement our own servers, I'm not so worried about using their infrastructure as a test ground

One of the big criticisms of snaps is that you can't run your own snap store, because that part isn't open source. You can pay Canonical for a "branded store" and proxy requests to the snap store from your own servers, but the actual snaps still have to be hosted by Canonical.

I'll need to look at what they've done around provisioning and OS updates. Balena has done a nice job there.

OTA updates is what Ubuntu Core really excels at, I would argue it's better than Balena OS on that front (at least for our use case). I'm still not sure what you mean by provisioning?

madb1lly commented 3 years ago

I'll need to look at what they've done around provisioning and OS updates. Balena has done a nice job there.

OTA updates is what Ubuntu Core really excels at, I would argue it's better than Balena OS on that front (at least for our use case). I'm still not sure what you mean by provisioning?

On smartphones I think this refers to a sort of automatic remote configuration of the device depending on what network it is connected to and which country it is in. E.g. this can be used to automatically select frequency bands and which services are enabled or not, e.g. VoIP. I imagine you came across this with Firefox OS @benfrancis? Maybe for a static device we can use the same term? A quick web search revealed some links by Oracle and others that run server infrastructure.

In the context of WebThings maybe we could think of this as using a common image for all devices and provisioning automatically selects a specific configuration for that device... but that would first require the image to boot, so I'm not sure what provisioning would still remain to be done?

Cheers 🙂

tim-hellhake commented 3 years ago

Another option I've seen used in a couple of places (Screenly OSE and NextBox) is Debian (or Raspberry Pi OS) running on the Raspberry Pi, acting as a host OS for a docker container. As @tim-hellhake pointed out, hosting docker containers is not unique to Balena OS.

I checked some of the minimal distros which only provide a docker daemon but unfortunately, none of them provided automatic updates.

I like the idea of using Debian as a host OS because it's quite close to what we use now, with a much wider range of supported hardware, but it seems to me that using snaps on Debian would miss out on the main benefits of Ubuntu Core (OTA updates) while tying us to Canonical's snap store. That's why I suggested docker on Debian rather than snaps.

That's also my opinion. Docker has the advantage that it runs on a variety of distros.

kgiori commented 3 years ago

I recently met with a gal at Balena, because of my new role on the board of BeagleBoard.org, and learned about their "Deploy With Balena" (DWB) capability that sounds nice -- described as an "easy one-click install". They set up a demo of DWB for the Home Assistant project. If we could get their support to make the WebThings Gateway available as a DWB one-click install, I would definitely be up for giving it a try.

Also note that Balena Cloud is free for up to 10 devices. I don't expect many individual users like me would need to run more than 10 gateways for their home, vacation home, parent's home, ... And although it would cost money for 10+ devices needed for commercial or enterprise deployments, it might be worth the money.

This doesn't mean that DWB needs to be the only supported solution for maintaining a device running the WebThings Gateway, but it seems like an option worth further experimentation.

kgiori commented 3 years ago

Another project that I recently learned about is the LF Edge project EVE-OS. Although the target for EVE is secure enterprise deployments "at the edge", rather than DIY smart home users, it is open source and security and reliability are of utmost importance. It doesn't look easy to me to get started currently (have to build from source), but it might be a fit for a specific piece of hardware (such as RPi 4) and a corresponding pre-built release image that could be flashed to a uSD card, then updated automagically OTA like our current RPi release. The other benefit of EVE is its portability across a wide range of powerful hardware.

benfrancis commented 3 years ago

@kgiori wrote:

I recently met with a gal at Balena, because of my new role on the board of BeagleBoard.org, and learned about their "Deploy With Balena" (DWB) capability that sounds nice -- described as an "easy one-click install".

I think it would be cool to support this installation method if it's not too much work on top of maintaining the docker container, and someone wanted to maintain it. If I've understood correctly it's fairly similar to Ubuntu Appliances which are pre-built Ubuntu Core images dedicated to a particular application, except that there are more steps involved as the user has to sign up for a balenaCloud account first, create a custom image to download and flash, then deploy the application to their Pi after first boot. I think describing it as a one-click-install is an exaggeration.

Also note that Balena Cloud is free for up to 10 devices. I don't expect many individual users like me would need to run more than 10 gateways for their home, vacation home, parent's home, ... And although it would cost money for 10+ devices needed for commercial or enterprise deployments, it might be worth the money.

This assumes that each user signs up to Balena Cloud themselves and acts as their own cloud administrator, responsible for deploying OTA updates to their 1-10 gateways. This is not a straightforward process and is really aimed at developers/professional IoT fleet owners, not home users. Not only is this a huge barrier to entry, but it would be a very different experience to the current one-time-install with automatic updates approach we have in place today.

Although it could be useful for enterprise users who own a fleet of gateways and have the skills to administer them, this is not a great fit for home users. I've discussed this with two people at Balena and someone who has used Balena in production and my conclusion is that if we wanted Balena OS to be a replacement for the current Raspbian image, we'd need to run our own instance of openbalena and customise it to our needs.

An Ubuntu Appliance could be a better drop-in replacement as it just requires the user to flash a pre-built image to their device like today, then they would get automatic updates without needing to sign up to a cloud service or administer anything themselves. When discussing another project with Canonical they seemed quite interested in the idea of a WebThings Gateway appliance. However, as explained above it may require re-architecting our add-ons system to work inside/as snaps. We need to do some testing.

Another project that I recently learned about is the LF Edge project EVE-OS.

I've not heard of EVE and would be interested to hear more, but I think we might be looking for something a bit more mature.

flatsiedatsie commented 3 years ago

I wanted to add my two cents as well.

I work on the Candle smart home project, and have developed a large number of addons for the Gateway, such as Zigbee2MQTT, Network presence detection, Voco, MySensors, Internet radio, Airport, Followers, Highlights, Photo frame, P1-adapter, Seashell, Square theme, Candle manager, Privacy manager, Power settings, Display toggle.. and more to come.

I think having auto-update is an important goal, as it would also aid in turning Candle into a commercially viable product. Our plan is to sell high-quality "kits" to adults which will require minimal assembly. In the package they would find a Raspberry Pi 4, some off-the-shelf USB sticks, a luxurious housing, and an attractive documentation booklet. My target audience is not tech-savvy, but sees their concern about privacy as the main barrier to getting into the IoT world.

I'm not as versed in the devops world, so I also have some questions, and can only really express my hopes, desires and worries without fully understanding how much work each option will cost.

Desires:

Questions:

Worries:

I mainly worry about how containerization might impact addon development.

I've noticed that the addon system is often compared to the addon system in the Firefox browser, as a way of explaining that addons should be kept separate and highly restricted. But for me personally this metaphor/blueprint doesn't hold water. Sure, addons that work like a bridge to some cloud server would not be hindered. But most of my addons focus on the Raspberry Pi itself, and squeezing as many features out of it as possible. I worry containerising addons may go too far in limiting what can be done with the hardware of the Pi itself.

Some examples may be useful. My addons try to avoid making changes to the gateway/Linux, as this could harm the functioning of other addons. Even with that limitation, here are some things that my addons currently do.

For my upcoming Hotspot addon:

For my upcoming Candlecam addon:

The Candlecam addon, and to a lesser degree Voco's satellite ability are designed with running the Gateway on a Pi Zero in mind. Here installing ReSpeaker software is required to make the ReSpeaker pi hat work. This requires the installation of kernel-level drivers.

To summarize: I need sudo to make the Pi a strong value proposition to potential customers.

Next to all this playing around with the Raspberry Pi's abilities, there is a second worry. As my addon collection grows, I increasingly see opportunities for synergy, where addons work together to create meta-features. Here I worry containerization may make cross-addon cooperation overly complex, needing everything to go through APIs.

Thirdly, I worry that containerisation may make it harder for people to get started developing software for the Gateway. I have never worked with containerisation before, and there is a reason for that: it feels daunting and complex! For me it has been fantastic that there are so many tutorials for Raspbian out there. I worry that if the new OS doesn't run some flavour of Debian all these tutorials might not work as well.

Fourthly, I worry that containerization is an expression of a reflex we technologists sometimes have to look for technological answers to questions that are more sociological in nature. Compartmentalisation seems to be a technology-driven answer to the question of "how do we avoid bad apples", by simply making it technologically impossible to do things. However, let's also explore other methods of keeping addons trustworthy, such as creating systems of trust between people? I've spent a lot of time on all these addons, and I would hope that could be translated into some system of trust based on a proven track record.

From that perspective I like how Synology's plugin system works with tiers of trustworthiness. They have a curated set of addons they vouch for. Then there is the option to install community-reviewed addons. Then there is the option to manually add new repository sources and even side-load software. At each level the user is informed and gets to decide what level they are comfortable with. More technical users at the cutting edge could help rate addons so that users with less knowledge can pluck the fruits some time later.

I know one recurring reply to my body of work is to "fork the gateway", but I would like to avoid that as long as possible. I would prefer to "keep the family together", since this is also of value to me and my potential customers.

Those are my two cents. As I said, I don't know enough about the technology to ascertain to what degree my worries are based on misconceptions.

benfrancis commented 3 years ago

@flatsiedatsie Thanks for sharing your thoughts.

* I would like the solution to work at a low monetary cost, ideally being zero. I do not wish to ask my users a recurring fee for using Candle.

Of course in practice none of these options are free (including the current solution) because they require one or more developers to maintain a distribution, which they may or may not be willing to do for free, and in some cases running infrastructure to provide updates as well. But there are various ways of funding that which don't all require charging end users.

* While I would be happy to do some simple modifications to my addons, I would like my existing addons's functionality to not be degraded.

In theory this should be possible, but depending on the solution we choose, add-ons may require significant modifications in order to use controlled system APIs rather than just running scripts as root as they can today.

* I would like addons to work together synergetically in the future. For example, Voco could handle audio streaming, Text-To-Speech or Audio Input for my other addons. Network presence detection could do a network scan for Voco to find satellites.

This should theoretically be possible for any of the solutions since all container systems tend to allow containers to communicate with each other in one way or another. An example might be using WebSockets to communicate, which is what add-ons already use to communicate with the gateway today.
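As a toy illustration of that pattern (add-ons already speak WebSockets to the gateway; the names and message format below are hypothetical, and a plain local socket stands in for the real transport), two add-on processes could exchange JSON messages like this:

```python
import json
import socket
import threading

def voco_service(sock):
    """Pretend 'Voco' add-on: answers text-to-speech requests from peers."""
    request = json.loads(sock.recv(1024).decode())
    reply = {"status": "ok", "spoken": request["text"]}
    sock.sendall(json.dumps(reply).encode())

# socketpair() stands in for a network connection between two containers
a, b = socket.socketpair()
t = threading.Thread(target=voco_service, args=(b,))
t.start()

# Pretend 'Network presence' add-on asks Voco to announce an event
a.sendall(json.dumps({"type": "speak", "text": "front door opened"}).encode())
response = json.loads(a.recv(1024).decode())
t.join()
print(response["spoken"])  # prints: front door opened
```

The point is only that a container boundary changes the transport (a socket instead of shared files or shelling out), not whether cooperation is possible.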

* I would like the entire stack to be open / open source, so that I can tell my customers that they are buying a fully open source product.

All of the options we are considering are open source. Of course, there are different levels of open source. Raspberry Pi OS contains non-free binary drivers for example, and most versions of the Raspberry Pi require a non-free binary blob used by the GPU in order to boot. So "fully open source" may be a stretch.

* I would like the Gateway will still run on a Pi Zero.

As I understand it WebThings Gateway 1.0 doesn't work very well on the Pi Zero today (I don't have one to test myself). It may be tricky to continue supporting, since ARMv6 is no longer an officially supported build target for Node.js. In general my hope is to have a smaller footprint rather than a larger one, in order to support lower end hardware. But I think regarding the Pi Zero specifically, it's supported by balenaOS, but not Ubuntu Core (because Ubuntu no longer supports ARMv6). It is theoretically supported by Debian, but "falls uncomfortably between the processor families that Debian has chosen to target, between armel and armhf" so Raspberry Pi OS works better.

* I would like an OS that at some point makes it relatively easy to create SD cards with the Gateway pre-installed with some addons and minor cosmetic modifications.

This could actually become easier with a different base OS.

Questions:

* Could Raspbian be retrofitted to be more secure? What would happen if `sudo apt-get upgrade` is run once in a while? Wouldn't that keep the system up-to-date to some degree?

apt is not really designed to be run automatically; it often includes interactive prompts for the user to choose options on the command line.

There is a feature of Debian called unattended upgrades which I asked about above, but I'm not sure how well it works on the Raspberry Pi. The big problem is what happens when you get to the end of the support period of a major release (which is the problem we're about to come up against with Raspberry Pi OS). An upgrade from one version of a Debian-based distro to another tends to be a very hands-on process.
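For reference, Debian's unattended-upgrades mechanism is enabled with a small apt configuration fragment (a sketch; exact defaults vary by release):

```
// /etc/apt/apt.conf.d/20auto-upgrades
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
```

Which package origins get upgraded (and whether the machine may reboot itself afterwards) is configured in /etc/apt/apt.conf.d/50unattended-upgrades, but note this only covers updates within a major release, not the release-to-release upgrades discussed above.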

This may also be a problem with Ubuntu Core to a certain extent, since I'm not sure you can automatically upgrade from core18 to core20 for example, but Ubuntu Core provides 5-10 years of support per major version which would be a big improvement.

Worries:

I mainly worry about how containerization might impact addon development.

I worry about that too, I think it's the biggest challenge in changing base OS. Many add-ons currently require root privileges (which isn't really a sustainable feature if we want to claim the gateway is secure) and many only work on Raspberry Pi OS (which is a shame for users already using Docker/apt/rpm packages on other hardware).

I've noticed that the addon system is often compared to the addon system in the Firefox browser, as a way of explaining that addons should be kept separate and highly restricted.

I have drawn a parallel with Firefox add-ons and the Firefox OS app store in the past because that's where most of my experience comes from, but the same principles apply to any add-ons system or app store, which eventually have to put restrictions in place. This is as much a social problem as it is a technical one (see below).

  • Scan and use serial ports (Candle Manager flashes devices via the Arduino Command Line Interface)

  • Read files outside of the addon directory. Some of this requires sudo.

  • Change ALSA audio levels.

  • Turn an attached HDMI display on or off

  • Come with binaries that require sudo to run (Airport for Airplay audio streaming, Arduino CLI, running a built-in Zigbee2MQTT instance, Voco voice detection)

  • Run, kill or reboot processes.

  • Do network scans (Network presence, Voco)

  • Disable NTP updating (power settings)

  • Run a Flask webserver on a random port

  • Delete the internal logs (privacy manager)

  • Modify, delete or create datapoints in the device data logs (privacy manager). And yes, I agree there should be an API for this, where the user has to give permission for addons to access this data.

  • Scan for and pair with bluetooth devices (unofficial addon)

Most of this (but possibly not all) is possible on Ubuntu Core. See the list of interfaces for an idea of what is possible inside snap confinement. The biggest difference is that you would need to use system-provided APIs (e.g. using NetworkManager via dbus instead of manually editing configuration files as root).
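For example, a snap declares the system access it needs as interface plugs in its snapcraft.yaml. This fragment is only a sketch of what a gateway snap might request (the app name and command are hypothetical; the interface names are real):

```yaml
apps:
  webthings-gateway:
    command: bin/gateway
    daemon: simple
    plugs:
      - network          # outbound network access
      - network-bind     # run the gateway's web server
      - network-manager  # talk to NetworkManager over D-Bus
      - serial-port      # e.g. Zigbee/Z-Wave USB dongles
      - audio-playback   # e.g. voice feedback
```

Some interfaces auto-connect on install; more privileged ones (like serial-port) need a store declaration or a manual `snap connect` by the user.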

I'd be interested to know how balenaOS compares, or docker running on a vanilla Linux distribution.

To summarize: I need sudo to make the Pi a strong value proposition to potential customers.

I'm sorry, but this is simply not sustainable for the future if we want to claim that WebThings Gateway is secure. (see my explanation below).

I worry containerization may make cross-addon cooperation overly complex, needing everything to go through APIs.

I'm less worried about this, it should be relatively straightforward to allow add-ons to communicate with each other (although beware you risk creating dependency problems if add-ons start to rely on each other).

Thirdly, I worry that containerisation may make it harder for people to get started developing software for the Gateway.

Regrettably I think this is probably inevitable, yes, at least if my experience with snaps is anything to go by. Writing secure code is harder than writing insecure code.

Fourthly, I worry that containerization is an expression of a reflex we technologist sometimes have to look for technological answers to questions that are more sociological in nature. Compartmentalisation seems to be a technology-driven answer to the question of "how do we avoid bad apples", by simply making it technologically impossible to do things. However, let's also explore other methods of keeping addons trustworthy, such as creating systems of trust between people? I've spent a lot of time on all these addons, and I would hope that could be translated into some system of trust based on a proven track record.

This is the approach we currently use. Our current approach to add-ons security is basically to manually review source code of all add-ons, every time they are updated, to ensure they are not doing anything nefarious or exposing a security hole. Only approved add-ons can be added to the add-ons directory.

The problem is that this process doesn't scale, because once you reach a certain number of add-ons (and we have already exceeded a hundred), the add-ons review team (essentially currently me and Tim) can not possibly manually review every add-on to this degree. This is already the case today, which potentially exposes users to a level of risk which wouldn't be acceptable for a commercial product.

Maybe you trust yourself, but do you trust all other add-on developers? Or do you trust me or Tim to manually review every line of source code of every add-on, in multiple programming languages, to make sure they're not putting your smart home at risk? I know I don't. Restricting add-ons to fixed APIs and/or a security sandbox is not a silver bullet, but it does add a significant extra layer of security and is a best practice in nearly every mature software ecosystem.

From that perspective I like how Synology's plugin system works with tiers of trustworthyness.

I'm not familiar with Synology's system, but I would be open to something like that. It's not dissimilar from the way you can add additional apt repositories in Debian or similar, where you can also manually install .deb packages. I have suggested the idea in the past of allowing users to add additional add-on directories.

The problem we're going to come up against is if we choose a base OS where these kinds of restrictions are enforced at the OS level and can't easily be overridden.

A problem with choosing Ubuntu Core for example is that if every add-on was a snap package, then unless we paid for our own branded app store (which is very expensive), all add-ons would need to pass Canonical's snap store review, which from my experience is a lot stricter than our current add-ons review process. The only other way to do it would be for the gateway and all its add-ons to share a single snap and for the gateway to request all the permissions up front, but that's unlikely to be allowed. FWIW, it is possible for users to side-load un-reviewed snaps on Ubuntu Core.

I know one recurring reply to my body of work is to "fork the gateway", but I would like to avoid that as long as possible. I would prefer to "keep the family together", since this is also of value to me and my potential customers.

Those are my two cents. As I said, I don't know enough about the technology to ascertain to what degree my worries are based on misconceptions.

I think your concerns are reasonable and I share many of them.

Unfortunately I also don't think it's realistic to simultaneously meet the requirements of:

  1. Providing automatic software updates
  2. Providing those updates for free
  3. Making the gateway secure
  4. Giving add-ons root privileges with no restrictions on what they can do

Whatever solution we choose (including the current one) is going to be a tradeoff.

One way to meet conflicting requirements is indeed to maintain separate distributions (i.e. fork the gateway). For example there could be:

  1. A free community version with looser restrictions but with poorer security and manual updates
  2. A paid commercial version with better security and automatic updates but tighter restrictions

Lots of open source projects take this kind of approach, e.g. Fedora/Red Hat, Chromium/Chrome and Screenly OSE/Screenly.

Ideally I'd like to meet everyone's needs with a single distribution, but that may not be possible.

flatsiedatsie commented 3 years ago

It could be the case for a more vanilla Linux distribution like Debian running Docker containers

So theoretically we could have an OS that supports containers, but might also allow things to be installed outside of that?

This is the approach we currently use. Our current approach to add-ons security is basically to manually review source code of all add-ons, every time they are updated

I really wasn't trying to describe the current situation. What I meant with the Synology example is something akin to what you see with Domoticz, where there is a "cutting edge beta" version of the system, and a "settled stable" version of the system. New addons would first become available in the cutting edge OS without any restrictions (no code review). Users of this cutting edge version (developers, enthusiasts, hypefreaks) might sometimes install these addons, knowing the risks. These users can rate the addons. From 0 to 5 stars. And they may leave a comment, and maybe check a box labeled "I actually looked at the code a bit". Over time some addons may be rated highly. Perhaps after at least 10 ratings with an above 4 star score, these addons become visible to the users of the stable version.

Some people might develop multiple addons (like me), and from there gain a positive reputation. They might become known in the community. Someone has their email address and has maybe skyped with them. The point is: at a certain moment it becomes reasonably clear that the developer is not a Russian hacker with malicious intent. At that point new addons might only need five ratings above four stars before they are considered "safe enough".

That's what I mean by harnessing the social aspect of addon development.

Over time this would result in a few tiers of trustworthiness.

Tier 1. Addons being developed and maintained by core community members. These would likely be pretty popular addons that almost everybody uses. E.g. the Zigbee addon. These are akin to Synology's own branded addons that you can download from inside the Synology UI. These could have a little icon similar to confirmed accounts on Twitter. Developers of these addons are probably on a first-name basis with each other.

Tier 2. Addons that have been around a long time, are stable, and have lots of reviews. This might be the Network Presence Detection addon. It's been around a long time, it works, and doesn't change drastically anymore. Users that want to install this addon would get a "There is no warranty on this, but the risk is low" popup.

Tier 3. Addons that are from new developers or that are very fresh. The people running the beta version of the gateway have tried it, and they haven't run into any crashes, and haven't spotted any obfuscated code. Run at your own risk.

Tier 4. Untested apps, version 0.0.1, that have just been made available to the beta community. Run if you're curious and know how to fix your system if something goes wrong. Also on tier 4: users of the stable version that choose to sideload an addon, or install a custom addon source. Crazy, but if you want to do it, you can do it. It's your device after all.

Maybe you trust yourself, but do you trust all other add-on developers?

This is a strawman. Of course I don't trust all developers. But if there was a system in place that took karma into account - as many software ecosystems do - then yes, I would be able to trust a lot of them.

Now am I saying such a system should replace restricting things technologically? Not at all. In fact, I would welcome something that works like the permission system on the iPhone or Android where a user is asked "This addon would like to access your microphone, is that ok?". Or "This addon is requesting super user rights, is that ok?".

But so far I don't see this social part of the equation being discussed. So that's why I wanted to bring it up. We are racing towards containerization for security, without acknowledging that security - trustworthiness - can be reached in multiple ways.

Ironically, the race to restrict things technologically could also lower overall system security.

For example, currently I'm working on an addon that lets the Raspberry Pi generate a wifi hotspot. Users can connect wifi-based IoT devices to that hotspot. The addon then monitors what domains the wifi devices try to connect to, and can block that access on a per-domain basis. All this using an easy to use UI.

[Screenshot: the Hotspot addon's per-domain blocking UI]

That addon needs above average permissions. But the bigger picture is that for the end user it would actually create a new level of security in their smart home. They would be able to learn what their devices are up to, and block unwanted activity. Who knows, perhaps a community of users could use the wisdom of the crowd to crowdsource a communal blocklist that handles all kinds of commercial devices. Perhaps devices could be routed to small internal servers so that they think they are connected to the cloud, but in fact can now be controlled from the LAN instead. But I digress. The point is that from a holistic point of view this addon greatly increases security for the end user, by adding a layer of protection around their wifi devices. By showing them what their IoT devices may be up to, it also educates users to be more security aware. Maybe they'll learn from experience why they shouldn't buy cheap Chinese wifi-enabled lightbulbs.

Currently I'm noticing a lot of pushback to making this addon available. That blows my mind, because to me, if I were choosing between smart home controllers, then a controller that has this feature would have a serious selling point.

I would like an addon system where it becomes easier to create features that no other open source smart home controller has.

If the general consensus is that containerisation is wanted, then that's fine. But I wasn't seeing these things being discussed, so I thought I'd bring it up. Companies like Synology don't use containerisation for security, yet their software runs all over the world, and supports a gigantic list of community software.

Oh, and:

As I understand it WebThings Gateway 1.0 doesn't work very well on the Pi Zero today

It does! It's not a speed demon, and memory is not endless. But it's perfectly adequate, and works great as both a voice control satellite and smart doorbell / security camera. It's actually really cool to have the same UI on multiple devices. It's great for development of features that rely on multiple nodes in a home, such as the voice control satellite. The price and size is also a USP for these use cases. So for me it would actually really matter that the Gateway would still run on the Pi Zero, if at all possible.

// Perhaps the Pi Foundation will give us a new Pi Zero at some point. That would be great.

madb1lly commented 3 years ago

Hi @benfrancis,

I don't fully understand all the possibilities and restrictions of the Snap store nor what you're envisioning doing with Ubuntu Core but I know it gets some FLOSS enthusiasts a little upset sometimes. I listened to this podcast with Alan Pope from the Ubuntu Snaps team which suggested that the vast majority of the Ubuntu Snaps store is open source: https://latenightlinux.com/late-night-linux-extra-episode-14/

Therefore, to avoid that we are dependent on the Snap Store couldn't we point the gateway to our own Snap Store, at least for addons?

Cheers 🙂

benfrancis commented 3 years ago

@flatsiedatsie wrote:

So theoretically we could have an OS that supports containers, but might also allow things to be installed outside of that?

Yes, if we used a vanilla Linux distribution like Debian and the add-ons system used docker containers, then it would be possible for users to install packages using other packaging systems themselves, albeit on the command line.
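As a sketch of that kind of setup (image names illustrative, not an official recipe): a Debian host could run the gateway as one container and delegate application updates to a watcher container such as Watchtower, which polls the registry and restarts containers when a new image appears:

```yaml
# docker-compose.yml on a stock Debian host (hypothetical deployment)
services:
  gateway:
    image: webthingsio/gateway:latest
    restart: unless-stopped
    network_mode: host          # needed for mDNS / device discovery
    volumes:
      - ./gateway-data:/home/node/.webthings
  watchtower:
    image: containrrr/watchtower # pulls new images and restarts containers
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```

This covers application OTA updates but not OS updates, which would still rely on something like unattended-upgrades on the Debian host.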

I really wasn't trying to describe the current situation.

I realise that, I was trying to explain that our current solution is predominantly a social one rather than a technical one, and the problems with scaling that system.

Over time this would result in a few tiers of trustworthiness.

Tier 1. Addons being developed and maintained by core community members. These would likely be pretty popular addons that almost everybody uses. E.g. the Zigbee addon. These are akin to Synology's own branded addons that you can download from inside the Synology UI. These could have a little icon similar to confirmed accounts on Twitter. Developers of these addons are probably on a first-name basis with each other.

Tier 2. Addons that have been around a long time, are stable, and have lots of reviews. This might be the Network Presence Detection addon. It's been around a long time, it works, and doesn't change drastically anymore. Users that want to install this addon would get a "There is no warranty on this, but the risk is low" popup.

Tier 3. Addons that are from new developers or that are very fresh. The people running the beta version of the gateway have tried it, and they haven't run into any crashes, and haven't spotted any obfuscated code. Run at your own risk.

Tier 4. Untested apps, version 0.0.1, that have just been made available to the beta community. Run if you're curious and know how to fix your system if something goes wrong. Also on tier 4: users of the stable version that choose to sideload an addon, or install a custom addon source. Crazy, but if you want to do it, you can do it. It's your device after all.

Bearing in mind that our current add-ons directory is just a JSON file hosted on GitHub which Tim essentially maintains on his own, this sounds very complicated. We would need someone to implement support for multiple directories, different UI for different classes of add-ons and a ratings system, none of which we have today, and policies and people to decide which add-ons belong in which tiers.
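To make that concrete: each entry in that JSON file would need to grow fields along these lines (entirely hypothetical, nothing like this exists today), plus UI, policies and people to act on them:

```json
{
  "id": "network-presence-detection-adapter",
  "tier": 2,
  "ratings": {
    "count": 37,
    "average": 4.4,
    "code-reviewed-count": 5
  },
  "install-warning": "There is no warranty on this, but the risk is low"
}
```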

More importantly, this doesn't fulfill any of the requirements I listed at the top of this issue, which would still need to be dealt with.

I agree that there is both a social and technical component to the way a software ecosystem works, and I think you're right to bring it up, but social solutions require people to implement them, which is something we're lacking right now.

@madb1lly wrote:

I don't fully understand all the possibilities and restrictions of the Snap store nor what you're envisioning doing with Ubuntu Core but I know it gets some FLOSS enthusiasts a little upset sometimes.

I'm afraid packaging systems are one of those things in the open source community which always divide opinion as everyone has their preferred solution. I suggest all we can do is to try to objectively evaluate the possible solutions against a set of concrete requirements.

My observation has been that many of the reservations people have about snap packages are in the context of cross-distribution desktop applications, e.g.

  1. The downside of bundling all your dependencies inside the package is that the package is larger than it would be if you relied on a dependency tree, and desktop snaps can sometimes be slow to start
  2. Snap packaged applications often don't match the look and feel of the distribution they're running in, because they're designed to work across multiple distributions

These problems don't really apply to our use case of a headless IoT gateway which doesn't run inside a desktop environment. For our use case the things that snaps do really well (automatic software updates, containerisation and an easy way to bundle them as a full OS image based on a very lightweight OS) make them very attractive. There is one other big issue though (see below).

I listened to this podcast with Alan Pope from the Ubuntu Snaps team

This is an interesting listen. Popey (who I was sad to learn has recently left Canonical) touches on many of the issues we've been discussing in this thread about the scaling issues of a curated app store and it was notable to me that the Snap Store is actually characterised in that discussion as having fewer barriers (e.g. review steps) to publishing than traditional Linux distributions (e.g. using .deb or .rpm packages). I think that's true as long as you're only using interfaces that support auto-connection and they can therefore be automatically reviewed. My experience of trying to publish a snap which required manual review and publisher vetting was that the process was actually very long [1].

which suggested that the vast majority of the Ubuntu Snap Store is open source: https://latenightlinux.com/late-night-linux-extra-episode-14/ Therefore, to avoid being dependent on the Snap Store, couldn't we point the gateway to our own Snap Store, at least for addons?

Unfortunately not. Probably the biggest criticism of snaps is that nobody except Canonical can host their own snap store. They try to say this isn't the case because:

  1. At a low level creating a snap package is just creating a read-only SquashFS file system with a .yaml file containing metadata and the snap store is just a web server, both of which anyone could implement themselves.
  2. There is a snap "proxy" tool available which you can use to host your own server on your own premises from which snaps are fetched

However, in practice Ubuntu Core can only install snaps from either the public Snap Store, or a paid branded snap store (IIRC prices start at $30,000 and have a recurring fee) hosted by Canonical. You can host a proxy on your own premises in order to intercept requests for snap packages, but the actual packages have to be hosted by Canonical and the snap server software is closed source. This means that you can have snap packages hosted for free and benefit from automatic software updates for 5 years or more, but only if you use the public store and adhere to its terms of service.

Fundamentally, if we want to solve the problems I listed at the top of this post we either need to maintain our own Linux distribution (which is kind of what we've been doing so far in a simplistic way with some known limitations, and I personally don't have the skills or resources to continue doing on my own) or use someone else's (which means conforming to their conventions and terms of service).

We therefore need a solution which meets as many of our (not just my) requirements as possible, but we can practically maintain with the resources we have available.

I therefore suggest that we now move beyond this theoretical discussion (which has been great) to prototyping potential practical solutions. That could include:

  1. Snaps on Ubuntu Core (which is what I plan on starting with, since it seems the closest to meeting all of the requirements I listed at the start, but I'd like to better understand its limitations and implications for add-on developers)
  2. Docker containers on balenaOS (which is attractive because it theoretically doesn't rely on a centralised app store and could use our existing docker image, but depending on how we use it may be harder for end users or require us running our own infrastructure)
  3. Docker containers on Debian (which is attractive because it could use our existing docker image and is close to our current Raspbian base, but may have limitations regarding automatic OS updates)
  4. gzipped tarballs on Raspberry Pi OS (Our current solution, which requires someone with the necessary skills to volunteer to upgrade us to the next major release of Raspberry Pi OS and has known limitations we're eventually going to need solutions to)

If anyone wants to try prototyping any of the approaches above, or an alternative approach, that would be most welcome. We will eventually choose one or more of these solutions to maintain going forward, depending on which have the most support and people to maintain them. This also doesn't rule out maintaining additional package formats like .deb and .rpm if people volunteer to maintain them (or the scripts which generate them).

Footnotes:

  1. In my case I had an application I wanted to package as a snap that acted as both a web server and a web client, which both ran as daemons. Daemons run as root by default, but they didn't want an application running a browser engine as root because of the low level control a browser engine can have over system processes, so I had to find a way to get my web server and web client to run as a non-root user, which turned out to be quite complicated because nobody had done it before. This was an unusual example, but I can envisage gateway add-ons doing other weird and wonderful things that may require manual review.
flatsiedatsie commented 3 years ago

Probably the biggest criticism of snaps is that nobody except Canonical can host their own snap store.

That's what I was trying to refer to when I talked about the open source stack (not the closed source firmware on the Raspberry Pi).

We would need someone to implement support for multiple directories, different UI for different classes of add-ons and a ratings system, none of which we have today, and policies and people to decide which add-ons belong in which tiers.

Of course it would be some work. But would it be as much work as implementing a container-based redesign of the addon system? I have limited knowledge, but it seems like containerisation is a more complex change. And unlike the containerisation, I could even help with this. Addon management could be an addon, like a simple app store. Different commercialisations of the Gateway could each have their own app store in order to manage what is available to users, and leave some addons out. This is something the Candle product would have to do anyway.

The introductory post makes it seem like containerisation is a foregone conclusion to improve security, but I really hope that's still up for a vote. Totally overhauling the addon system is something that looms like a cloud on the horizon, since that process itself would create a period of uncertainty and upheaval for customers. It would make short-to-mid term commercialisation of what exists a bit daunting. Imagine users who paid for a brand new controller, and suddenly addons stop working. This already happened with the switch to 1.0, where I quickly had to update all my addons to make them work with the new generate-from-github system. A shift that I fully supported, since it made the code for all addons open source and inspectable by default.

For a project that's low on manpower, doing such a large technical overhaul seems risky? That's why I wanted to suggest another approach that leaves the system intact, but adds security through crowdsourced trust. Forcing all addons to be open source on GitHub was a step in that direction.

That's why I'd like to also suggest another option to explore (or perhaps as a modification of option 4), which is the option I originally thought this entire discussion would be about:

  1. gzipped tarballs on an OS that updates itself and can still run on the Raspberry Pi. (With an app-store addon that offers insight into the trustworthiness of addons.)

In the end, the practical question that hasn't been answered for me is: what limitations would the containerization pose for my work?

Would it be possible to have the Hotspot addon work and be accepted if we go in the containerization direction?

tim-hellhake commented 3 years ago

I'm not too eager to switch to Ubuntu Core. The automatic updates and the image-building process seem great, but it feels a bit like a jail. In theory everything is open, but in practice you need to use the official Snap Store. I wasn't aware of the review process, but this feels even more locked in.

I would prefer to use a container-based solution, but I do see the need for automatic updates. Maybe we can work things out with BalenaOS.

If anyone wants to try prototyping any of the approaches above, or an alternative approach, that would be most welcome.

I'll take a look at openBalena.

flatsiedatsie commented 3 years ago

I would prefer to use a container-based solution, but I do see the need for automatic updates.

I don't quite understand that sentence.

flatsiedatsie commented 3 years ago

Most of this (but possibly not all) is possible on Ubuntu Core. See the list of interfaces for an idea of what is possible inside snap confinement. The biggest difference is that you would need to use system-provided APIs (e.g. using NetworkManager via dbus instead of manually editing configuration files as root).

I'd be interested to know how balenaOS compares, or docker running on a vanilla Linux distribution.

Agreeing with Tim, it sounds like with snaps each addon would have to be in the Snap Store, making it more difficult to create addons.

To alleviate my worries, I'm looking into understanding Docker in order to figure out what would still be possible, and what wouldn't. So far I've found Docker applications that allow HostAPD to work, so that's really nice to read. So I'm warming up to it :-)

As far as I can tell, some things would still need to be set outside of the container? E.g.

sysctl net.ipv4.ip_forward

Similarly, using the Raspberry Pi camera still seems to require some things outside of the container to be set.

I guess there would be some way to request the host OS to do that? Is that the idea? Are there any thoughts, proposals or examples of what the addon API will eventually look like?

Most of this (but possibly not all) is possible on Ubuntu Core

I'm assuming the same is true for Docker? What functionality specifically would be limited?

I created a Balena OS SD card as an experiment. But I quickly ran into this:

The balena ssh command also requires an SSH key to be added to your balena account: see SSH Access documentation. The balena key* command set can also be used to list and manage SSH keys: see balena help -v.

Which implies I need a Balena account to work with Balena OS? Losing privacy points there... Does this mean that Balena would know about every installed device?

I also noticed there is a package for Debian/Raspbian that keeps it up to date: unattended upgrades.

tim-hellhake commented 3 years ago

I would prefer to use a container-based solution, but I do see the need for automatic updates.

I don't quite understand that sentence.

Sorry, I was a bit low on time. What I meant was: Ubuntu Core solves the problem of automatic updates very well, but I would rather use a container-based solution because it's more open in the sense that it's not dependent on a single corporation or implementation. Unfortunately, it's hard to find something similar which is container-based for ARM devices. So it's a bit of a trade-off: an open container-based solution vs something that has out-of-the-box support for automatic updates.

tim-hellhake commented 3 years ago

Similarly, using the Raspberry Pi camera still seems to require some things outside of the container to be set.

I guess there would be some way to request the host OS to do that? Is that the idea? Are there any thoughts, proposals or examples of what the addon API will eventually look like?

You need to do this when you start the container. Currently, all addons run inside the gateway container, which means you need to give the gateway container all privileges upfront. If we start the addons as separate containers, we could add something to the manifest that says This addon needs access to the camera device. This would also allow us to use minimal rights for most of the addons while giving more privileges to some addons after a thorough review.
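A minimal sketch of how such a manifest-driven permission model could work. Note that the `permissions` field and the device mappings below are hypothetical, not part of today's add-on manifest format:

```python
# Hypothetical sketch: translate per-add-on manifest permissions into
# container runtime flags. The "permissions" field and the mapping below
# are illustrative only; they do not exist in the current manifest format.
DEVICE_MAP = {
    "camera": ["--device", "/dev/video0"],
    "serial": ["--device", "/dev/ttyUSB0"],
}

def container_args(manifest):
    """Build extra `docker run`/`podman run` arguments for an add-on."""
    args = []
    for permission in manifest.get("permissions", []):
        if permission not in DEVICE_MAP:
            raise ValueError(f"unknown permission: {permission}")
        args += DEVICE_MAP[permission]
    return args

# An add-on declaring no permissions gets no extra privileges:
container_args({"name": "virtual-things"})      # → []
# One declaring camera access gets only the camera device:
container_args({"permissions": ["camera"]})     # → ["--device", "/dev/video0"]
```

The benefit of building the flags from the manifest is exactly what Tim describes: most add-ons launch with minimal rights, and only reviewed add-ons get broader access.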

I created a Balena OS SD card as an experiment. But I quickly ran into this:

When do you get this? If you use OpenBalena you need to configure the image using the CLI.

I also noticed there is a package for Debian/Raspbian that keeps it up to date: unattended upgrades.

I'm not sure how reliable this is. And I guess most users don't want to ssh into their gateway to fix a broken system. Additionally, it won't dist-upgrade your system. I'm not sure if this will be different with an upgrade from Ubuntu Core 20 to 22, though. OTA updates usually update your system either completely or not at all. There should be no intermediate state.

createcandle commented 3 years ago

Quick thought: even if addons are separated in the back-end, everything still comes together in the UI. A malicious addon could read the values from the UI's data structures or HTML and send them somewhere. How would containerization stop that?

benfrancis commented 3 years ago

Quick thought: even if addons are separated in the back-end, everything still comes together in the UI. A malicious addon could read the values from the UI's data structures or HTML and send them somewhere. How would containerization stop that?

It can't. Containers help secure the back end code that runs on the gateway, not the front end code which runs on the client. Securing the front end code of extension add-ons requires another approach. The solution that web browsers came up with was a fixed extension API which has its own sandboxing and permissions system.

We currently re-use the manifest format and packaging system from browser add-ons, but we don't yet implement such a permissions system or fixed API, since it felt premature to fix those things until it became clear what kinds of things add-ons would need to do.

tim-hellhake commented 3 years ago

@benfrancis I've done a little research on Fedora IoT. Originally I searched for OTA Update frameworks and read a bit about ostree. Fedora IoT uses rpm-ostree for system upgrades. They have no automatic update mechanism but the rpm-ostree daemon has a DBUS API. That's what the CLI uses. We could use it to trigger atomic updates from the gateway. This way the user could also decide when to update the system and could monitor the progress. What do you think about that?
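For illustration, a rough sketch of what a gateway-triggered update could look like. This shells out to the `rpm-ostree` CLI rather than using the D-Bus API Tim mentions, and the wrapper functions themselves are hypothetical:

```python
# Hypothetical sketch: let the gateway trigger atomic system updates by
# invoking the rpm-ostree CLI (the D-Bus API would allow progress
# monitoring, but the CLI is simpler to illustrate).
import subprocess

def build_update_command(check_only=False):
    # "rpm-ostree upgrade" stages an atomic update into a new deployment;
    # "--check" only queries whether a newer commit is available.
    cmd = ["rpm-ostree", "upgrade"]
    if check_only:
        cmd.append("--check")
    return cmd

def trigger_update(check_only=False):
    # Returns True on success. On a real system, a deployment that fails
    # to boot can be reverted with "rpm-ostree rollback".
    return subprocess.run(build_update_command(check_only)).returncode == 0
```

Because rpm-ostree updates are atomic and keep the previous deployment around, a UI-driven update like this can always fall back to the old root file system if something goes wrong.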

benfrancis commented 3 years ago

@tim-hellhake I have to admit Fedora IoT is the option I know least about. I don't know a lot about the rpm-ostree "hybrid image/package system" or podman vs. docker for containers.

On the surface Fedora IoT seems to tick many of the boxes, but trying to modify a system not designed for unattended upgrades to do automatic updates seems like a risky strategy. Allowing users to manually trigger updates from our UI is great, until something goes wrong! It does seem worth prototyping though, if you'd like to explore that further?

From the little I know about Fedora IoT it does seem like it could be a potential middle ground between Ubuntu Core and balenaOS. Curiously, Fedora CoreOS does support automatic upgrades and uses the same rpm-ostree system, but seems to be aimed at cloud servers rather than edge/IoT devices and doesn't support ARM.

Fedora IoT may have higher system requirements than Ubuntu Core (e.g. recommended 1GB RAM vs. 256MB RAM for core18 and 384MB RAM for core20). I'm also not sure whether Fedora IoT supports Raspberry Pi 4 yet.

flatsiedatsie commented 3 years ago

I'm also not sure whether Fedora IoT supports Raspberry Pi 4 yet.

It does. I haven't tested it myself, but I found a post from someone who did and said it worked flawlessly.

I also tried to look into Fedora IoT. It seems in general 'provisioning', and thus putting the cloud back in control, is inevitable? I wanted to find out if Fedora IoT would offer a way to have the cloud infrastructure open source as well. I couldn't find an answer to my question.

From what I can tell Podman is a drop-in replacement for Docker. It's an open standard. In general I get the idea that the Fedora option is about being as open as possible, which is a big plus.

They actually mention Webthings Gateway as a target for their platform: https://docs.fedoraproject.org/en-US/iot/prd/

tim-hellhake commented 3 years ago

@benfrancis

podman vs. docker for containers

Both runtimes will happily accept OCI images. The big difference is that docker is client/daemon based while Podman uses a fork/exec model. If you give a user access to the Docker daemon, you basically grant them root rights to the system. Podman, on the other hand, can be used without having root rights. If you need a drop-in replacement for docker, you can start Podman in daemon mode.
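Since Podman mirrors the Docker CLI for everyday operations, code that shells out can often just pick whichever runtime is installed. A small illustrative sketch (the helper name is made up):

```python
import shutil

def container_runtime():
    # Prefer rootless Podman; fall back to Docker. Both accept the same
    # verbs (run, pull, ps, ...) for common operations, so callers can
    # treat the returned binary name as interchangeable.
    for name in ("podman", "docker"):
        if shutil.which(name):
            return name
    raise RuntimeError("no OCI container runtime found on PATH")
```

This kind of abstraction is one reason the choice between the two runtimes matters less than the choice of base OS.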

On the surface Fedora IoT seems to tick many of the boxes, but trying to modify a system not designed for unattended upgrades to do automatic updates seems like a risky strategy.

Curiously, Fedora CoreOS does support automatic upgrades and uses the same rpm-ostree system

They decided to outsource automatic upgrades in CoreOS to a daemon called zincati. The daemon controls rpm-ostree to orchestrate automatic upgrades. Unfortunately, it is not available on Fedora IoT.

but trying to modify a system not designed for unattended upgrades to do automatic updates seems like a risky strategy

It's more extending than modifying. The mechanisms for atomic updates are already there, but nobody is triggering them. I'm sure that zincati is pretty well designed, but I wonder what they are doing if the automatic upgrade fails.

tim-hellhake commented 3 years ago

I'm not sure if that is new, or I just missed it, but there is a section about automatic updates in the Fedora IoT wiki.

tim-hellhake commented 3 years ago

It seems in general 'provisioning', and thus putting the cloud back in control, is inevitable? I wanted to find out if Fedora IoT would offer a way to have the cloud infrastructure open source as well.

It seems that there is no cloud-based business model behind FedoraIoT. They just offer you a slim OS with container support and that's it. You are responsible for integrating it into whatever infrastructure you have. In our case, the only infrastructure is the pi itself. Thus my idea to let the gateway be in charge of the update process.

In general I get the idea that the Fedora option is about being as open as possible, which is big plus.

That's also my impression. BalenaOS is focused on their business model, which is all about centralized cloud solutions. OpenBalena is basically a slimmed-down headless single-tenant version of their cloud service. Fedora IoT, on the other hand, looks more like a tool to create your own business model.

They actually mention Webthings Gateway as a target for their platform: https://docs.fedoraproject.org/en-US/iot/prd/

Interesting :grimacing:

kgiori commented 3 years ago

Marc on the Balena team created the below "Deploy with Balena" (DWB) example (link) to try. It worked for me on the first try, which was surprising and exciting. However, I ran into a couple of issues (that might just be from running the gateway in Docker, since that's not the way I normally run it).

Instructions to create the DWB config are here. Result: https://github.com/balena-io-playground/webthing-gateway

Things I noticed (that might just be Docker deployment limitations):

  1. yet to test Zigbee/Z-Wave dongles
  2. nor have I tried discovering/managing any other smart home device
  3. at least one add-on (Seashell) not available to be added

Note that for this test you must create your own (free up to 10 devices) BalenaCloud account. For a non-revenue-generating open source project like the WebThings Gateway for personal use, Balena seem willing to allow an "open fleet" approach (instead of the current requirement of a BalenaCloud account per user approach). With open fleet, there is still new development to be addressed that would enable the community to securely release/manage auto-upgrade availability, but it seems worth looking into further and providing a list of "what would be great to have" to the Balena team.

createcandle commented 3 years ago

In our case, the only infrastructure is the pi itself. Thus my idea to let the gateway be in charge of the update process.

That sounds great from a privacy perspective.

madb1lly commented 3 years ago

Hi all,

I'm sure that at least one of you has looked into it, but is anyone at the Raspberry Pi Foundation/Trading working on adding unattended upgrades to Raspberry Pi OS? I imagine with the new Pi 400 that these are computers which may be left plugged in most of the time, and it would ease school IT admin burden if they could all be updated OTA rather than the IT admin having to reflash SD cards. If we're talking to Balena and considering how to bend Fedora IoT or Ubuntu Core to our needs, then it might be worth asking Raspberry Pi if they're working on something which could give us the easiest solution.

About the Fedora more open sentiment - I also get this impression, but I also get the impression that there is a sort of propaganda war going on between the Ubuntu and Fedora communities, specifically Fedora trying to give the impression they are more open. I'd be wary about assuming that one or the other is more open or that either business model means they are more or less dependable.

Cheers 🙂

tim-hellhake commented 3 years ago

Marc on the Balena team created the below "Deploy with Balena" (DWB) example (link) to try

Nice!

yet to test Zigbee/Z-Wave dongles

The first USB serial device is already mounted into the container so this should work.

nor have I tried discovering/managing any other smart home device

They use port mappings at the moment. So normal unicast traffic shouldn't be a problem but if the device relies on multicast discovery such as mDNS or SSDP this won't work until they switch to host mode.
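As a hedged illustration of the host-mode point, a docker-compose fragment (the service and image names are assumptions based on the existing gateway Docker image):

```
# Multicast discovery (mDNS/SSDP) doesn't traverse port mappings, so the
# gateway container would need the host's network stack directly:
services:
  gateway:
    image: webthingsio/gateway   # assumed image name
    network_mode: host           # instead of a "ports:" mapping
```

The trade-off is that host networking gives the container the host's entire network namespace, losing the isolation that port mappings provide.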

at least one add-on (Seashell) not available to be added

This was probably due to the architecture. According to the addon-list seashell is only for linux-arm.

Balena seem willing to allow an "open fleet" approach (instead of the current requirement of a BalenaCloud account per user approach). With open fleet, there is still new development to be addressed that would enable the community to securely release/manage auto-upgrade availability, but it seems worth looking into further, and providing a list of "what would be great to have" to the Balena team.

I wonder how Deploy with Balena works without a Balena account. They need to somehow provision the OS.

benfrancis commented 3 years ago

@flatsiedatsie wrote:

They actually mention Webthings Gateway as a target for their platform

Huh, well spotted.

It seems in general 'provisioning', and thus putting the cloud back in control, is inevitable?

As I understand it the only one of the options we're exploring that really requires a "provisioning" step is Balena OS. With Ubuntu Core, Debian + Docker, Raspbian + .tar.gz and Fedora IoT my understanding is that we can just distribute an image and have devices pull updates from a server anonymously. I'm not sure how the "open fleet" approach for Balena OS would work.

@tim-hellhake wrote:

I'm not sure if that is new, or I just missed it, but there is a section about automatic updates in the Fedora IoT wiki.

Interesting. I note that the linked blog post from 2018 makes it sound like some of these features were still experimental. It would be good to do some testing to find out how well it works today.

Presumably these automatic updates only apply to the base operating system though, not applications inside containers running on top? So that wouldn't answer the question of how to automatically update (with automatic rollback) containers. With Ubuntu Core both the core OS and applications are snaps, so they use the same system.

@kgiori wrote:

Marc on the Balena team created the below "Deploy with Balena" (DWB) example (link) to try.

Cool.

For a non-revenue-generating open source project like the WebThings Gateway for personal use, Balena seem willing to allow an "open fleet" approach

That's not really a sustainable constraint for us to operate under.

@madb1lly wrote:

I'm sure that at least one of you has looked into it, but is anyone at the Raspberry Pi Foundation/Trading working on adding unattended upgrades to Raspberry Pi OS?

As I understand it Debian's unattended upgrades should theoretically work on Raspberry Pi OS too, but they are broken out of the box in Buster. It's possible we could fix that in our own image. Unattended upgrades using apt won't have all the benefits of some of the more advanced systems we're exploring, which offer atomic upgrades, automatic rollback etc., and I wouldn't be surprised if they occasionally require manual intervention.

All of this requires prototyping.
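For reference, enabling Debian's unattended upgrades usually comes down to a two-line apt configuration. This is a sketch; the exact upgrade behaviour is governed by the `unattended-upgrades` package's own configuration file:

```
# /etc/apt/apt.conf.d/20auto-upgrades
APT::Periodic::Update-Package-Lists "1";   # refresh package lists daily
APT::Periodic::Unattended-Upgrade "1";     # install pending upgrades daily
```

This only covers security/package updates within a release; it does not perform the dist-upgrade or atomic-rollback behaviour discussed above.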

createcandle commented 3 years ago

Probably not news to you, and I'm not sure I'm reading this right, but it seems GitHub can generate customised Raspberry Pi images? A video showing it: https://www.youtube.com/watch?v=Lc6wvHgMYH4

If this is true, would it technically be possible to have the entire process of generating a Raspberry Pi image automated on GitHub?

createcandle commented 3 years ago

Found the link of someone who tested Fedora on a Pi 4: https://discussion.fedoraproject.org/t/fedora-iot-edition-on-raspberry-pi-4/13680/4

kgiori commented 3 years ago

For a non-revenue-generating open source project like the WebThings Gateway for personal use, Balena seem willing to allow an "open fleet" approach

That's not really a sustainable constraint for us to operate under.

The Balena "open fleet" product definition is still a work in progress, so there are no rigid constraints to worry about yet. I'd argue that a personal WebThings Gateway deployment model via DWB is a good potential use case for Balena to consider when designing the future of open fleet. High-level goals off the top of my head:

What else?

createcandle commented 3 years ago

I've installed unattended upgrades on a 1.0.0 image, and will keep it running for the foreseeable future. If there are specific or better tests I can run to test it, let me know.

tim-hellhake commented 3 years ago

What I found out about Fedora IoT so far: it's pretty much a standard Fedora distro that uses rpm-ostree to manage the root file system. rpm-ostree allows you to do atomic updates and manages multiple root file systems to provide a rollback mechanism if an update fails. The root fs is read-only while user data (/var and /etc) is persistent across updates and rollbacks. Additionally, Podman is installed as a container runtime. But that seems to be about it. Launching and managing containers is a manual task and is not part of the update process. If we really want to supply OTA updates of the system and the gateway we may need to ignore the container runtime and provide ostree commits instead. I tried to find out if that's feasible, but the information on how to build commits for rpm-ostree is scattered around various places. Fedora IoT itself is built with coreos-assembler using the fedora-coreos-config.

tim-hellhake commented 3 years ago

About OpenBalena: I stopped pursuing this because it looks like a dead end for our use case. If we hosted an OpenBalena instance, we would manage all gateways under a single tenant and would have ssh access to all gateways. Additionally, we could push arbitrary containers to all gateways. I hope Balena's open fleet model will allow us to push updates without having actual access to the instances.

flatsiedatsie commented 3 years ago

Why is F34 the Most Popular Fedora Linux in Years?

As an experiment, I installed unattended upgrades on a 0.8 image of the gateway, and left it overnight. The next day it was up to date, so that seems to work fine. It hadn't updated to the next major version though.

Unattended upgrades has a feature where it updates with small increments, so that you can interrupt the process for a reboot at any time, and it will just continue when the device is rebooted.

// update: I followed this guide to attempt a major version upgrade. It worked fine, with no errors in the internal logs. The process was even interrupted twice accidentally, but the gateway continued to work.

There were two moments where user input was required (screenshot: mayor_upgrade_input_required).

Perhaps an install script could simply echo N twice? That has worked with similar situations for me in the past.

Of course all this is just anecdotal evidence. But maybe interesting nonetheless.

// Did notice this:

pi@upgrades:~ $ sudo apt-get upgrade
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Calculating upgrade... Done
The following packages have been kept back:
  libboost-python-dev libboost-thread-dev
0 upgraded, 0 newly installed, 0 to remove and 2 not upgraded.
Utopiah commented 2 years ago

IMHO it depends on the audience:

For newcomers, images made for the RPi are enough. Newcomers will not care much about how it is actually done (Docker on Raspbian/Debian? As long as the interface works...) and probably don't even know alternative hardware exists. That implies though that the images are kept up to date, or that add-ons do not break with out-of-date images. Right now, the Docker image and the release are nearly 1 year old, while the official build process is probably broken https://github.com/WebThingsIO/gateway/issues/2873 .

For anybody else I believe having a common point for a reproducible environment is the most important. The lower the level and the larger the support, the better, because it means better integration with existing tools. For example, I imagine nvm to be easier if direct hardware access is needed; otherwise Docker/Podman containers, now that hardware has become so small, powerful and affordable that containerization is, arguably, not a significant performance limitation. Anyway, for this type of user I'd argue that it is mostly documentation on how to get running that is more important, as long as the build process does not involve an obscure tool with a license that is morally reprehensible.