rauc / rauc

Safe and secure software updates for embedded Linux
https://rauc.io
GNU Lesser General Public License v2.1

Updating appfs does not change the boot order #360

Open stobajas opened 5 years ago

stobajas commented 5 years ago

Hi! I'm using RAUC v0.1.1 as provided by meta-rauc for the Yocto Pyro release.

I have a similar configuration to this one:

[system]
compatible=FooCorp Super BarBazzer
bootloader=uboot

[slot.rootfs.0]
device=/dev/sda0
type=ext4
bootname=system0

[slot.rootfs.1]
device=/dev/sda1
type=ext4
bootname=system1

[slot.appfs.0]
device=/dev/sda2
type=ext4
parent=rootfs.0

[slot.appfs.1]
device=/dev/sda3
type=ext4
parent=rootfs.1

I have multiple slots, and my use case is to be able to ship bundles that update only some of them (e.g. only the appfs).

For that, before each install, I'm mirroring the filesystems of all the slots and then overwriting them with the bundle contents.

The problem I'm facing is that when I install a bundle containing only the appfs slot, RAUC does not modify the BOOT_ORDER U-Boot variable. I thought the "parent" parameter in the appfs slot, which links it to the rootfs's bootname, would trigger that.

Have you already faced this issue, and is it patched in the newer versions?

Thanks for your comments.

stobajas commented 5 years ago

After a little investigation: in v0.1.1, everything happens in the function launch_and_wait_network_handler in src/install.c:

    /* Mark all parent destination slots bootable */
    g_message("Marking slots as bootable...");
    g_hash_table_iter_init(&iter, target_group);
    while (g_hash_table_iter_next(&iter, NULL, (gpointer *)&slot)) {
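        /* Skip child slots (those with a parent) and slots without a bootname */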
        if (slot->parent || !slot->bootname)
            continue;

        res = r_boot_set_primary(slot);

        if (!res) {
            g_warning("Failed marking slot %s bootable", slot->name);
            goto out;
        }
    }

Should I change it with this patch?

        g_message("Marking slots as bootable...");
        g_hash_table_iter_init(&iter, target_group);
        while (g_hash_table_iter_next(&iter, NULL, (gpointer *)&dest_slot)) {
-               if (dest_slot->parent || !dest_slot->bootname)
+               if (!dest_slot->parent || !dest_slot->bootname)
                        continue;

                res = r_boot_set_primary(dest_slot);

Or maybe it would be better to set dest_slot->bootname from the parent slot at initialization?

Thanks for your comments.

ejoerns commented 5 years ago

Hey, some things about this:

1.) We strongly recommend not starting development with software that is obsolete! This is true for both RAUC and especially for the Yocto Pyro release: https://wiki.yoctoproject.org/wiki/Releases

Updating always has something to do with keeping your software up to date, fixing bugs, CVEs, etc. You won't have any chance to face these challenges when being obsolete before you even start ;) Apart from the fact that everything is easier and fancier in newer Yocto / RAUC releases!

2.) About what you wrote concerning your findings: do you really use the RAUC 'network' mode, or did you just identify this as the probably-relevant code line? Note that you do not use the 'network' mode when using bundles! I am asking because we just started officially deprecating it...

3.) The target group selection and handling changed quite a lot between 0.1.1 and the current 0.4 release. I haven't tested it yet, but from the basic algorithm you should not run into the same issue.

4.) As you might have read in the docs, we strongly recommend letting a bundle always describe the entire system, i.e. including both rootfs and appfs, to ensure a consistent and well-tested state of all software components. Note that RAUC provides mechanisms for skipping slots whose content is equal to what it aims to install. Or consider casync support, which is already in 0.4 and allows you to automatically feed chunks from the source slot that equal those that need to be installed to the target slot (see https://rauc.readthedocs.io/en/latest/advanced.html#rauc-casync-support). It is explicitly designed for use cases where you want to minimize the amount of data to download, and it will probably result in an even lower amount of data than your approach allows.
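For casync, a conventional bundle is converted with the rauc convert subcommand; a hypothetical invocation (file names are placeholders, and the exact options may differ between releases, so check rauc --help for yours):

    # convert a conventional bundle into a casync index bundle plus chunk store
    rauc convert --cert=cert.pem --key=key.pem update.raucb update-casync.raucb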

stobajas commented 5 years ago

Hi, 1) No need to convince me that it is really important to keep things up to date, but our current context is tied to those system versions, and right now I unfortunately can't change much about that ;).

2) About my findings: I did some grepping and found that code. I'm currently testing it. I just wanted to have your advice; you're the most qualified to know your own code :).

3/4) Later on, I plan to bump the RAUC version to be able to use casync support, but as I said, the product needs to ship soon and this is not the priority. So I'm looking for a solution / ugly patch for v0.1.1.

Thanks for your response.

ejoerns commented 5 years ago

You didn't answer my question about the bundle vs network mode.

In case you use the normal bundle mode, you have to look at a different code line, but it has pretty much the same content:

https://github.com/rauc/rauc/blob/v0.1.1/src/install.c#L883

If I remember correctly, the issue is that your rootfs slot is not in the target group and thus is not covered by the iteration.

My recommendation for a quick fix would be to just bump the RAUC version in a bbappend file; it should only be 2-3 lines.
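A hypothetical sketch of such a bbappend (assuming the meta-rauc recipe uses the git fetcher; the path and the SRCREV placeholder are not taken from meta-rauc):

    # recipes-core/rauc/rauc_%.bbappend -- hypothetical version bump
    PV = "0.4"
    SRCREV = "<commit hash of the v0.4 tag>"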

stobajas commented 5 years ago

Ok, my bad, you are totally right about the code line. The function that is actually called is launch_and_wait_default_handler, not the network handler. Currently:

I will try one last patch and then I will bump RAUC ;)

stobajas commented 5 years ago

Hi again, I have bumped to RAUC 0.4, and the behaviour is still the same. Note that I have a bundle with just the appfs and not the entire target, and I'm not using casync support.

stobajas commented 5 years ago

Hi, I have written this patch and it works. Let me know if you find it useful, or if the modifications break other things, and I will open a pull request. I am currently based on the v0.4 tag.

From 139acc9404c1edcd1fc062a0997c2b60d15012fa Mon Sep 17 00:00:00 2001
From: Sandra Tobajas <sandra.tobajas@savoirfairelinux.com>
Date: Fri, 9 Nov 2018 11:38:50 -0500
Subject: [PATCH] install: add parent's bootname to its child slot

If a slot has no bootname but its parent does, then add the bootname in
the slot in order to change the boot order after a bundle installation.
---
 src/install.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/install.c b/src/install.c
index ee2220e..cfdf7cf 100644
--- a/src/install.c
+++ b/src/install.c
@@ -985,9 +985,13 @@ image_out:
        for (GList *l = install_images; l != NULL; l = l->next) {
            RaucSlot *dest_slot = g_hash_table_lookup(target_group, ((RaucImage*)l->data)->slotclass);

-           if (dest_slot->parent || !dest_slot->bootname)
+           if (!dest_slot->bootname && !(dest_slot->parent && dest_slot->parent->bootname))
                continue;

+           if (dest_slot->parent && dest_slot->parent->bootname) {
+               dest_slot->bootname = g_strdup(dest_slot->parent->bootname);
+           }
+
            mark_active(dest_slot, &ierror);
            if (g_error_matches(ierror, R_INSTALL_ERROR, R_INSTALL_ERROR_MARK_BOOTABLE)) {
                g_set_error(error, R_INSTALL_ERROR, R_INSTALL_ERROR_MARK_BOOTABLE,
-- 

Regards and rauc & roll :) !

ejoerns commented 5 years ago

I can confirm that this behavior is the same with v1.0-rc1 release.

@stobajas Your solution would work around this, but it breaks other concepts of RAUC, as it assigns a 'bootname' to a non-bootable slot. This might lead to other issues sooner or later.

I am not yet fully aware of what a fix that does not break other scenarios could look like...

ejoerns commented 5 years ago

@stobajas Ok, my first approach, which I tried in #362, revealed another drawback, because there may be multiple bootname slots in the target group at the moment.

Basically, the idea behind your fix is the best I have for now, though I would like to add some further fixes concerning the looping we currently have.

It might not make sense to call mark_active() more than once!

Thus I would propose you rework your patch to do something like

if (dest_slot->parent)
    dest_slot = dest_slot->parent;

because then you do not change the bootname of your slot and can simply proceed as before. I have a similar line in my patches (see https://github.com/rauc/rauc/blob/28ba2d67f2e40d7d2996d2cf8f960e8839b20ee4/src/install.c#L359) and I could then base my patch on yours.
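For reference, a sketch of how the install loop from the v0.4 hunk above could look with that change (untested; note that a complete fix should probably also avoid calling mark_active() more than once when several images share the same parent):

    for (GList *l = install_images; l != NULL; l = l->next) {
        RaucSlot *dest_slot = g_hash_table_lookup(target_group, ((RaucImage*)l->data)->slotclass);

        /* Act on the bootable parent instead of assigning a bootname to the child */
        if (dest_slot->parent)
            dest_slot = dest_slot->parent;

        /* Skip slots that are not bootable at all */
        if (!dest_slot->bootname)
            continue;

        mark_active(dest_slot, &ierror);
        /* ... error handling as in the original hunk ... */
    }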

stobajas commented 5 years ago

Ok, I will do that. I'll let you know when the PR is ready.

stobajas commented 5 years ago

Hi @ejoerns, can you take a look at https://github.com/rauc/rauc/pull/363? Maybe a unit test can be added to check an appfs installation. Thanks

ejoerns commented 5 years ago

@stobajas Thanks for the PR!

I had a short discussion with @jluebbe yesterday, and we both agree that we do not want to encourage people to use bundles that do not describe the entire system, i.e. as in your case, where there are a rootfs and an appfs slot but only an appfs image is provided.

The crucial question is now whether we a) want to enforce requiring a rootfs slot, b) add support for appfs-only but add a more explicit warning not to do this, or c) keep it as it is and document it.

Thus the question to you would be whether you have any argument for supporting this feature upstream, or whether it would be an option to keep this as an 'ugly patch' in your BSP and go for a real solution afterwards?

As you already mentioned, casync would be an option for you, but maybe there are also other ways to go?

As a quick guess, I could imagine supporting rootfs dummy images that are listed in the manifest with the required sha but are not backed by a real image, so that you do not 'waste' download volume...

stobajas commented 5 years ago

Hi, I can explain our RAUC use case; maybe it gives you some background on our reasons:

a) The rootfs and kernel are provided by Yocto, and the appfs contains all the applicative Docker containers, already built in. The two bundles are not generated by the same team. The use case is that we want to keep the option of providing multiple bundle types (rootfs/kernel + appfs, only appfs, or only rootfs/kernel).

b) As casync is currently experimental, we have not tested it yet (and have not considered it in our update model). So for the use case where we only want to update the appfs, I developed a pre-install script that simply does an rsync between all filesystems before calling the RAUC update, to make sure the entire system is up to date (again, to reduce the amount of downloaded data); see the sketch after this list. I'm well aware that this script only works because we have double-banked partitions for all our slots.

c) For now, as far as I've tested, the "parent" option does not match its definition in your documentation, since in v0.4 I can update the rootfs and appfs separately. So it could be misleading for others too:

If multiple slots belong together in a way that they always have to be updated together with the respective other slots, you can ensure this by grouping slots.
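
To illustrate b), a minimal sketch of such a pre-install mirroring script (device paths follow the system.conf at the top of this issue; the mount points and the assumption that the system0 group is currently active are hypothetical, and a real hook would query rauc status first):

    #!/bin/sh
    # Hypothetical pre-install hook: mirror the active slots into the
    # inactive ones so that a partial bundle (e.g. appfs-only) still
    # leaves the whole inactive slot group up to date.
    set -e

    mkdir -p /mnt/rootfs.1 /mnt/appfs.1
    mount /dev/sda1 /mnt/rootfs.1   # inactive rootfs (system1 group)
    mount /dev/sda3 /mnt/appfs.1    # inactive appfs

    # Active rootfs is assumed to be mounted at /, active appfs at /appfs
    rsync -aHAX --delete --exclude=/mnt --exclude=/proc --exclude=/sys \
        --exclude=/dev --exclude=/run / /mnt/rootfs.1/
    rsync -aHAX --delete /appfs/ /mnt/appfs.1/

    umount /mnt/rootfs.1 /mnt/appfs.1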

I understand that you don't want to encourage people to do ugly things, but in the industry that is unfortunately often the case :D, due to lack of time and resources.

Let me know about the PR. In our case, the patch is already included in our meta-rauc and is working well. Regards, Sandra

jluebbe commented 5 years ago

I'd like to stick with the behavior that only updates to slots with a bootname trigger the switch. My main reason is that I want to keep the conceptual model that the whole group of slots should be updated.

The way I intend it to work is by implementing image-based delta updates: a delta bundle can omit the image file for any slot but still provide a hash. During installation, RAUC can then find slot contents that should be reused (the rootfs in your case) or refuse the update. Conceptually, the rootfs is then part of the update and RAUC can ensure consistency.

These delta bundles could be generated automatically by passing the old and new bundles to a (to be implemented) rauc command.

When that is implemented, we could also refuse updates of slots connected via parent links where only some of them are specified in the update (with a compatibility switch for "legacy" setups).

ejoerns commented 5 years ago

I've created #376 for keeping track of the image-based deltas.

And I've updated #362 to not handle parent slots anymore but still detect updates containing none or multiple bootname slots.

ejoerns commented 5 years ago

As things turn out to be more complicated, and as not fixing this issue yet would not make anything worse than it is, I would like to remove the 1.0 flag from this. Any concerns? Any ideas how to clarify this in the documentation for now?

stobajas commented 5 years ago

No problem for the flag. As the whole group of slots should be updated together, I would suggest specifying this in the documentation and also adding a check at install time that verifies all the slots are present (otherwise raise an error).

For the check, I can do a PR, because it is important to me to contribute to your project. What do you think about it? You are doing a great job. Sandra

ufechner7 commented 4 years ago

Any update on this? Would casync solve this problem? We update the app partition probably 8 times as often as the rootfs, so we need an efficient way to do that.

ejoerns commented 4 years ago

@ufechner7 One solution would be to use the delta-update capabilities of casync. Another approach is #376, which could fit your case but has some limitations. Currently, it is not possible (and badly handled) to simply have a bundle that contains only an appfs.

The question here is what you intend to optimize. When bundle size is not an issue but update time / writing effort is, you could enable the automated slot skipping and place both the rootfs and appfs in the bundle. If the rootfs hash already matches the expected one, the installation of that slot will be skipped. This is mainly applicable for a read-only rootfs where you do not expect the content to ever change after it has been written.
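As a sketch, this skipping is configured per slot in system.conf; in current RAUC releases the relevant option is install-same (verify the exact name for your release in the docs):

    [slot.rootfs.0]
    device=/dev/sda0
    type=ext4
    bootname=system0
    # install-same=false lets RAUC skip writing a slot whose current
    # content hash already matches the image to be installed
    install-same=false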

The image-based delta bundles noted above are an optimization for bundle size of the same process. The idea is that you only note the expected hash of the rootfs that must be present for applying the appfs update, and do not place any rootfs image in the bundle at all.

The casync approach would also work for having the application inside the rootfs, as it is a simple self-synchronizing chunk-based binary delta. But it requires operations to happen over the network and a server that provides access to the chunk store.

Another option would be to finally implement delta bundles in RAUC itself, e.g. file-based ones. The drawback of delta bundles compared to casync is of course that you are tied to a specific install state of the target that allows the delta to match.

Would this answer your questions?