OSInside / kiwi

KIWI - Appliance Builder Next Generation
https://osinside.github.io/kiwi
GNU General Public License v3.0
282 stars 142 forks source link

kiwi/bootloader: restore backward compatibility for grub2 with bls #2542

Closed Conan-Kudo closed 2 months ago

Conan-Kudo commented 2 months ago

The change to introduce the bls parameter broke backward compatibility with all existing kiwi descriptions for distributions that default to BLS.

This fixes that by allowing the unset state to be equivalent to enabling it.

Fixes: 8a8190098cb30358459ba10a4db1ba6446eee8c2

schaefi commented 2 months ago

ok I'm late here in this game but finally found the time to look at the work done in #2535 and the pull request here which tries to fix a regression.

First of all thanks for the effort done by @aplanas and the conversation around the topic. I'm afraid but I think PR #2535 should not have been merged so early. Let me explain my thoughts:

  1. The situation around GRUB_ENABLE_BLSCFG before the changes was simple: "set GRUB_ENABLE_BLSCFG if implemented in mkconfig"

  2. After the merge of #2535 we introduced a regression because now it does "set GRUB_ENABLE_BLSCFG if implemented in mkconfig and explicitly activated in the kiwi config, by default not activated"

So why are we doing this ? This is an implicit behavior change which we tried to avoid in kiwi from the beginning on because we know it will piss off people and they are right. There is actually not much of an excuse here and to make the impact more clear, this change impacts all other distributions in a negative way except for suse. I love suse but really this way of adding changes is one reason why there is also some distrust and negative words on our products, and I say "our" because I'm part of the suse family. So we can do better.

Here is a suggestion how to move forward:

@Conan-Kudo can you drive this change or do you want me to do it ?

Thanks

aplanas commented 2 months ago
* And last but not least, image builds that wants to disable grub BLS needs to specify that explicitly by setting `<bootloader ... bls="false" .../>` that really should not be that hard and also makes it pretty clear.

That was explicitly proposed as Fix#2 here: https://github.com/OSInside/kiwi/pull/2535#issuecomment-2079474621

This change needs to be done in all openSUSE / SUSE images.

schaefi commented 2 months ago

This change needs to be done in all openSUSE / SUSE images.

yep and if we can help the affected people/projects with this change it would be good, because I think they will probably not be informed. So I guess it affects all those distro targets which received the updated suse grub package ? Is there a way to tell/reach out to those ? It also means kiwi as a package needs to be released at best in collaboration with the release plans of grub

schaefi commented 2 months ago

@aplanas current maintenance of kiwi covers three code branches:

For which of the above targets did you plan the grub change ?

aplanas commented 2 months ago

So I guess it affects all those distro targets which received the updated suse grub package ?

Correct. The order should be:

  1. Release a new KIWI version in Factory with the change
  2. Update all the KIWI profiles in OBS. I know there are some templates in OBS when generating images, so I guess that those changes should be directed to certain github repos, but the important changes are for the images and appliances that are already in Factory (Tumbleweed itself, MicroOS, MicroOS appliances, etc).
  3. Once this is done, we can push the grub2-bls patch into factory

Is there a way to tell/reach out to those ?

In a general sense: no. The images that are in OBS can be searched (I guess). The ones outside OBS are not reachable.

This change will impose a regression on those images, so I propose to evaluate the idea of having a patch in the openSUSE package that keep bls="false" in KIWI for some years. We should make BLS as an opt-in, not as an opt-out as GRUB2 is not BLS.

It also means kiwi as a package needs to be released at best in collaboration with the release plans of grub

KIWI should be released first and all the OBS images updated second. Then a GRUB2 with the BLS patch can be done in Factory

aplanas commented 2 months ago

For which of the above targets did you plan the grub change ?

If all goes OK the order would be:

  1. Factory
  2. ALP
  3. SLE

But I cannot foresee this far : (

schaefi commented 2 months ago

This change will impose a regression on those images, so I propose to evaluate the idea of having a patch in the openSUSE package that keep bls="false" in KIWI for some years. We should make BLS as an opt-in, not as an opt-out as GRUB2 is not BLS.

Well, correct me when wrong but the regression is created when the grub package gets updated. kiwi as the dumb image builder will behave in the same way as before, meaning it places GRUB_ENABLE_BLSCFG into the config file as it did since this setting exists in the mkconfig tool provided by the distribution. It's been a long time ago but I believe I remember that there was an agreement to set this option when it exists in the tooling. I found it weird but that was the argument that was brought up. If the distro does not support BLS the mkconfig tool will have no code to handle GRUB_ENABLE_BLSCFG and so we added that weird search in the mkconfig code.

So am I correct when I say that this never was accurate for the suse distro ? Meaning our mkconfig has a handling for GRUB_ENABLE_BLSCFG but the bootloader is not really running as BLS enabled loader ?

Now you are changing that in grub such that it actually will do BLS in the intended way, but why should we now switch it off ? I understand it in a way that we finally move to BLS with grub like all others did and setting bls="false" is an exception for certain image types which wants this for whatever reason.

I'm confused

aplanas commented 2 months ago

This change will impose a regression on those images, so I propose to evaluate the idea of having a patch in the openSUSE package that keep bls="false" in KIWI for some years. We should make BLS as an opt-in, not as an opt-out as GRUB2 is not BLS.

Well, correct me when wrong but the regression is created when the grub package gets updated. kiwi as the dumb image builder will behave in the same way as before, meaning it places GRUB_ENABLE_BLSCFG into the config file as it did since this setting exists in the mkconfig tool provided by the distribution.

Right! But that is the point, isn't?

GRUB2 is not BLS, the patch is an opt-in decision, and using blscfg is totally and opt-in action. But KIWI enables it unconditionally instead of making it an opt-in feature.

KIWI is not a 'dump image builder' in this case: is making a decision.

I really understand why we do not want to tag this as a regression: kiwi did not change, ergo we did not regress any behavior! But for the user PoV it is: just because GRUB2 gained an opt-in feature that I am not using, now my image is broken.

Bare with me a bit and let try to characterize this:

I do not think that the issue is any of those three actors. I argue that the regression is in KIWI, as it broke the image that was not using BLS before forcing you to use blscfg if this feature is available in GRUB2.

What is more, before #2535 the user did not have any option to escape BLS, and if we make BLS an opt-out feature (Fix#2) we are introducing the regression: the user needs to update the kiwi file in order to continue on building the same image with the same features. That is why I propose to keep a patch in the KIWI package.

It's been a long time ago but I believe I remember that there was an agreement to set this option when it exists in the tooling. I found it weird but that was the argument that was brought up. If the distro does not support BLS the mkconfig tool will have no code to handle GRUB_ENABLE_BLSCFG and so we added that weird search in the mkconfig code.

IMHO the original patch introduced a more basic bug: there is no opt-in / opt-out of using BLS if GRUB2 has the BLS patch. That should be the red flag.

But in honesty I am sure that I would make the same decision and merge the feature.

So am I correct when I say that this never was accurate for the suse distro ? Meaning our mkconfig has a handling for GRUB_ENABLE_BLSCFG but the bootloader is not really running as BLS enabled loader ?

This is partially correct. Now GRUB_ENABLE_BLSCFG has a meaning in grub2-mkconfig but we do not want to use BLS by default.

Now you are changing that in grub such that it actually will do BLS in the intended way, but why should we now switch it off ? I understand it in a way that we finally move to BLS with grub like all others did and setting bls="false" is an exception for certain image types which wants this for whatever reason.

I am not sure where it is the confusion, but let me try to summarize some points:

I hope that in the long term we will use BLS in the distribution, but this is not my decision, as there are may other issues to fix in GRUB2 wrt BLS first: theming, missing BLS features, YaST installer, clarification with upstream, better tooling, etc. I will work on some of those issues.

So the status quo did not change: the current openSUSE images will not use BLS, but new selected images will use BLS. They also will include more tools that will generate those missing boot entries, and will take care of synchronizing those entries with the different snapshots.

For the short and medium term bls="false" will continue to be the default (the most common option as is today), but some images (very few at first) will be start using blscfg.

aplanas commented 2 months ago

Uhmm I am still thinking about your confusion.

Can be that grep -q GRUB_ENABLE_BLSCFG /etc/default/grub can imply that is activated? I mean, it is not as can be part of a comment as it was in the openSUSE case. Can a better detector fix the issue?

Has Fedora GRUB_ENABLE_BLSCFG="true" in the /etc/default/grub from RPM? Maybe we can search for the regex ^GRUB_ENABLE_BLSCFG="true", or remove the grep and use some python code?

Conan-Kudo commented 2 months ago

The grub2 package in Fedora does not provide a default /etc/default/grub.

Conan-Kudo commented 2 months ago

@schaefi:

@Conan-Kudo can you change the PR here and simply turn the return value for get_build_type_bootloader_bls() to True in case there is no bls attribute set. That resolves the regression

This is done.

We did not add a single line of documentation for the new bls attribute. There should be some docs on the attribute and its default value in the context of a user documentation not only the inline schema docs. I suggest to add that here: https://osinside.github.io/kiwi/image_description/elements.html#preferences-type-bootloader

This is done.

schaefi commented 2 months ago

This is partially correct. Now GRUB_ENABLE_BLSCFG has a meaning in grub2-mkconfig but we do not want to use BLS by default.

Thanks @aplanas that explains my confusion. I was always under the impression that we intend to make BLS the default and not only provide the functionality in grub. That's also why we are having the trouble. You are right kiwi makes a decision to enable BLS if the mkconfig tool has a code path for it. This decision has been done a long time ago and I consider it a mistake that we at that time did not simply create an attribute and let the user decide to use it or not.

We are now facing the situation that there is a distro providing grub2 with BLS support, but unwanted at the moment. We didn't had that before.

Carrying the patch in the spec of kiwi is probably the best solution for the suse package.