AsteroidOS / asteroid

Build script for AsteroidOS, an open-source operating system for smartwatches
http://asteroidos.org
GNU General Public License v2.0

OPKG not reliable with updates #224

Open dodoradio opened 1 year ago

dodoradio commented 1 year ago

On 2022/09/03, a number of users attempted to update to packages released in the latest nightlies. This led to a number of issues, eventually resulting in bootlooping devices. This is not the first time this has happened, with at least one previous instance recorded in #217, and it seems to be caused by OPKG and the IPK package format. OpenWRT recommends against using OPKG for core system component updates, as it is known to cause system instability: https://openwrt.org/meta/infobox/upgrade_packages_warning

None of the above issues were encountered when an update was installed by reflashing the entire OS; however, this wipes all of the user's data as well.

@FlorentRevest has said on Matrix that it is worth looking critically at how the operating system should handle updates in the future, taking the embedded nature of the platform into account. Yocto supports both RPM and DEB as alternative packaging formats.

FlorentRevest commented 1 year ago

Switching to RPMs or DEBs is pretty "cheap" (in terms of time) because it does not diverge very much from our current model and only needs a couple of lines of changes to meta-asteroid (and documentation updates). We can probably pull this off in an afternoon of work, and if we know this has immediate benefits then let's just do it.
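
For illustration, a minimal sketch of what such a change might look like, assuming the package backend is selected through the standard Yocto `PACKAGE_CLASSES` variable. Where exactly this is set in meta-asteroid (distro configuration or `local.conf`) is an assumption here and is not confirmed in this thread:

```bitbake
# Hypothetical distro/local configuration fragment (file location is an assumption).
# Current behaviour would correspond to building IPK packages for opkg:
# PACKAGE_CLASSES = "package_ipk"

# Switch the packaging backend to RPM (use "package_deb" for DEB instead).
PACKAGE_CLASSES = "package_rpm"

# Keep a package manager in the image so on-device updates remain possible.
EXTRA_IMAGE_FEATURES += "package-management"
```

The first class listed in `PACKAGE_CLASSES` is the one the build system uses when generating the image, so listing a single backend keeps the image and the package feed consistent.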

I suggested on chat that this would be a good time to think of alternative solutions like https://ostreedev.github.io/ostree/ for example (also used by QtOTA). It's a somewhat different model from the traditional desktop package manager but has super cool features like safe rollbacks (let's say an update breaks the machine, it can atomically revert back to the last state that was known to work). Mender is also a name that I saw many times in the embedded world and is worth considering: https://mender.io

At some point I wrote notes on embedded Linux update mechanisms here: https://pelux.io/pelux-architecture/master/chapters/configurations/software-update-management.html In particular, it has a list of solutions which was pretty exhaustive at the time.

Just to keep the message clear, I'm not affirming we have to use one of these. All I'm saying is we should give some thought to the different options, analyze the trade-offs, see what fits Asteroid best and take a long-term decision based on a rationale.

Opkg has never been a long term decision based on careful thinking so it's not too surprising that we realize retrospectively that it doesn't fit our project. :)

dodoradio commented 1 year ago

As far as I understand, system stability is a very high priority for embedded systems in the commercial world, mainly because these systems do not come with the option of being fixed by the client if the software fails (both to restrict the client's use of the device, and because the device may lack the interface to do so). AsteroidOS is unlike this: the user has access to a shell and can fix the system if an update breaks it, it is not used for mission-critical applications, and it seems to orient itself towards hackers and tinkerers (e.g. in the slogan 'hack your wrist'), who would generally prefer hackability over guaranteed reliability. As a result, I don't think that it should be treated like most other embedded devices. Hackability is an important pillar of software freedom because it allows users to build their own custom setup on top of an existing distribution.

A good example of an 'embedded' distribution is OpenWRT. This is a router distribution with very little user interaction, where stability and uptime are important. It sidesteps the update problem by requiring full system upgrades in the form of reflashing. Once reflashed, the user can mess with system components in whatever way they want, until the next update.

Reading a bit about Pelux, I come to the conclusion that it also prioritises stability, but does not have hackability as a requirement.

Fedora OSTree variants (and derivatives such as LiriOS silverblue) all prioritise stability above everything else, and every application is assumed to run in a container. This is very much unlike AsteroidOS. One of these is even an IoT platform, which AsteroidOS definitely isn't. They also sacrifice some hackability, but to a much lesser degree.

Finally, such update mechanisms add a level of complication in deciding what counts as a 'core component'. Do we include the launcher and all the base apps? (But what if the user wants to uninstall a base app because they do not find it useful?) Do we include all the necessary Qt frameworks and libasteroidapp? (Then the user will still need to use a conventional package manager to update half of the system, and suddenly Asteroid has two package managers.) Do we include everything up to libhybris? (Maybe, but how often do we push Android base updates? The same issue of two update systems is also present.) There's a trade-off: if we want to make sure that the system always boots to the launcher and always provides a UI, then a lot of liberty and hackability may be lost. If we prioritise anything lower, then the user will still generally need shell experience in order to fix any broken updates, so how much reliability is actually gained?

I'd support switching to a different conventional package manager. This makes AsteroidOS familiar to users of other Linux distributions, in that it mostly behaves like a desktop system. It also means that users can individually roll back (or update) programs, allowing more customisation and making it easier for experienced users to fix issues. I have no opinion on rpm vs. deb.

eLtMosen commented 1 year ago

Being a mere user in that aspect, all I could safely contribute is that indeed:

FlorentRevest commented 1 year ago

Alright fair enough, thanks for the analysis and arguments! :)

Now it would be interesting to have some elements to decide between deb and rpm which are the two options we have easily available from OE

MagneFire commented 1 year ago

We currently suffer from at least two different update issues:

Changing our package format might solve the latter issue, but not the former as the package isn't updated in the first place. Whether that solves the latter issue is yet to be seen.

It's also worth pointing out that a broken update might also result in a system that doesn't boot at all, meaning that you won't get access to any shell since that might be broken as well. This would force you to do a full reflash.

As pointed out by @PureTryOut over on Matrix, the quality of the packaging can have a huge impact on the reliability of the update system (don't know the exact phrasing, correct me if I'm wrong :wink: ). And I think that this might be the reason why our update system is not close to the reliability of the desktop distros.

As for a solution, I'm not sure which direction to take here at the moment. As mentioned before, changing the package format is a relatively straightforward thing to do.

beroset commented 1 year ago

The set of associated tools for both developers and for users is much more important than the package format (.rpm vs .deb). The package format can be automatically converted (e.g. with tools like Alien). Further, even our existing .ipk files can represent dependency information - are we using that as effectively as we could?
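
As a rough sketch of the dependency point above: in Yocto, runtime dependencies declared in a recipe are written into the generated package metadata whichever backend is used (for IPK, the `Depends:` field of the control file). The recipe name and version bound below are hypothetical and only show the syntax; older Yocto releases spell the variable `RDEPENDS_${PN}`:

```bitbake
# Hypothetical recipe fragment, e.g. example-watch-app_1.0.bb (illustrative name).
# Declares a versioned runtime dependency so the package manager can refuse
# or order an update when the required library version is not available.
RDEPENDS:${PN} += "libasteroidapp (>= 1.0)"
```

Whether the existing AsteroidOS recipes already carry version bounds like this is exactly the open question raised here.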

I think the bigger issue here is what is generally called release management; the set of policies and procedures that are used to assure that packages are complete and coherent. For instance, are we going to try to implement some kind of Quality Assurance (QA) testing and evaluation? With a relatively small team of volunteer maintainers, it seems to me that the only sustainable and scalable approach would be automation for most QA functions, perhaps as outlined in PR #203 but perhaps with an alternative set of tools.

Since our full rebuilds and the resulting image file seem not to break most of the time (although there are exceptions), a useful alternative/backstop might be as @eLtMosen suggests, which is to create a backup/restore mechanism for settings.

eLtMosen commented 1 year ago

Just to add to this issue and bump it, I recently became aware that OpenWRT also has problems with opkg and strongly advises against using it to update single packages. Their workaround is a custom system called ASU: https://github.com/openwrt/asu ASU is based on an API to request custom firmware images with any selection of packages pre-installed. This avoids the need to set up a build environment, and makes it possible to create a custom firmware image even using a mobile device. Some details on the opkg update warning are here: https://openwrt.org/meta/infobox/upgrade_packages_warning

dodoradio commented 1 year ago

Even if it is difficult to verify whether the issues are caused by OPKG itself or by our own packaging practices, OPKG seems to be mostly unmaintained. There seems to be an active push to deprecate some convoluted bits of Sailfish code, such as timed or lipstick. These, however, are actively maintained by us, Jolla and nemomobile-ux. With the same attitude, we should be pushing to deprecate OPKG as well, since it is not actively used and maintained by any other project. As the maintenance burden increased, OPKG also dropped features, for example the packagekit support. There was talk about us picking up the maintenance of the packagekit support, but that is another unnecessary maintenance burden and should probably be taken as another reason to drop OPKG.

beroset commented 2 months ago

Looking at this today, if we were to change from IPK files to either RPM or DEB, it seems to me that we should choose RPM. One reason is that DEB doesn't fully support one of the Yocto variables we use. Specifically, we have this line:

meta-asteroid/recipes-core/image/initramfs-android-image.bbappend:BAD_RECOMMENDATIONS += "busybox-syslog"

The BAD_RECOMMENDATIONS variable specifies any "recommended only" packages that we do NOT want to install. However, the Yocto documentation says:

[This variable] only works with IPK and RPM package types, not for Debian packages.

See https://docs.yoctoproject.org/dev/dev-manual/packages.html for details.