fedora-silverblue / issue-tracker

Fedora Silverblue issue tracker
https://fedoraproject.org/atomic-desktops/silverblue/

Since e2fsprogs-1.47.0-1.fc39 landed in Rawhide, cannot rebase from a fresh Rawhide Silverblue install to any earlier release #470

Closed AdamWill closed 7 months ago

AdamWill commented 1 year ago

This issue tracker is intended only for Silverblue-specific issues. Please try to reproduce the issue on a relevant Fedora Workstation release. If you can reproduce it there, please report it in Red Hat Bugzilla (see How to file a bug) or upstream (preferred for GNOME projects), not in this issue tracker.

Describe the bug: e2fsprogs-1.47.0-1.fc39 seems to create ext4 filesystems with a new feature enabled, which means earlier versions of e2fsprogs can't fsck them. Because an fsck of the /boot partition is a required step during system startup, rebasing from a Rawhide Silverblue install to any earlier release makes boot fail: the older fsck cannot check /boot.

To Reproduce (steps needed to reproduce the bug):

  1. Install a recent Fedora Rawhide Silverblue. This is actually a bit tricky, as the official compose installer image build is currently failing; I'm seeing this in openQA, which builds its own ostrees and installer images. You need to use an installer with e2fsprogs-1.47.0-1.fc39 to see the issue.
  2. Rebase to an earlier release.
  3. Try to boot.

Expected behavior: The system should boot successfully.

Screenshots: [image: 38fail]

@sandeen is there anything we can do here? can we somehow have e2fsprogs not create non-backward-compatible filesystems, or are we stuck with it?

AdamWill commented 1 year ago

Per https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1031622 , this is due to the 'orphan_file' feature that is enabled by default in 1.47.0. It seems any earlier version of e2fsprogs cannot fsck a filesystem which was created with the orphan_file feature turned on.

Could we perhaps disable it by default for a release or two, @sandeen , to avoid problems like this? Maybe only make it default once F39 is the oldest supported release?
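To check whether a given filesystem carries the feature, one option is to read the "Filesystem features:" line printed by `dumpe2fs -h`. The following is a minimal sketch, not part of the original report: the helper names and the sample header text are illustrative assumptions, and `has_orphan_file` needs root and a real device to run.

```python
import subprocess


def parse_features(dumpe2fs_header: str) -> set[str]:
    """Return the feature names listed on the 'Filesystem features:' line."""
    for line in dumpe2fs_header.splitlines():
        if line.startswith("Filesystem features:"):
            return set(line.split(":", 1)[1].split())
    return set()


def has_orphan_file(device: str) -> bool:
    """Run `dumpe2fs -h` on a device (hypothetical path) and look for orphan_file."""
    out = subprocess.run(
        ["dumpe2fs", "-h", device],
        check=True, capture_output=True, text=True,
    ).stdout
    return "orphan_file" in parse_features(out)


# Illustrative header text, trimmed; real output contains many more lines.
SAMPLE = (
    "Filesystem volume name:   <none>\n"
    "Filesystem features:      has_journal ext_attr dir_index orphan_file extent 64bit\n"
)

if __name__ == "__main__":
    print("orphan_file" in parse_features(SAMPLE))
```

If the feature is listed for /boot, an e2fsprogs older than 1.47.0 would be expected to refuse to fsck it, which matches the boot failure described above.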

travier commented 1 year ago

This is a "pure" downgrade scenario as this will not impact users that upgrade from an older release to rawhide and then go back to a stable release.

I don't think we should care about that. We should probably change the test to do something else.

AdamWill commented 1 year ago

well, I could see someone installing Rawhide to try it out, encountering a bug, and wanting to just rebase down to 38 to avoid the problem, but OK. There is nothing else the test can do, that I can think of. The requirement is "check that rebasing from Rawhide works". I'm not aware of anything else the test can rebase to that would work.

travier commented 1 year ago

We should support the reverse: installing F38, updating to Rawhide, and then rolling back to F38.

I don't think we can ever support installing from a newer Fedora release and then downgrading to an older release.

AdamWill commented 1 year ago

The point isn't to test any specific rebase operation, but to test that the rebase mechanism itself works. If we don't test a rebase from Rawhide to something, the rebase code in Rawhide could be broken for months and we would never know until it branches.

travier commented 1 year ago

Yes, for that test to mean something, we need to build a new image or a layered one and rebase to it.

AdamWill commented 1 year ago

For update tests at least it might be possible to test rebasing to the 'stock' Rawhide (not the custom ostree we build for testing the update). That wouldn't work for compose tests, though, and I don't really want to have to stuff another image creation step in there (it takes a long time).

Anyhow, I've filed an upstream issue to see if the sudden cut-off here was intended - https://github.com/tytso/e2fsprogs/issues/147

jlebon commented 1 year ago

For testing the rebase mechanism, we sometimes create a branch locally and rebase to it. E.g. https://github.com/coreos/rpm-ostree/blob/5cc3c84b97199abbd72268dd4a4f963a6215cc0d/tests/vmcheck/test-layering-unified.sh#L75-L91

It's synthetic and not super realistic (e.g. it doesn't test remote bits), but it's still better than nothing.

AdamWill commented 1 year ago

well, while we're talking workarounds, is it possible to overlay after rebasing? or if I overlay, then rebase, what happens to the overlay?

I'm thinking, of course, about overlaying a newer e2fsprogs build onto the rebased system. For now I can work around this by having the 'create the installer' step of the test use a scratch build of e2fsprogs 1.47.0 with the feature disabled, but that's somewhat fragile (any time e2fsprogs is updated, I have to redo the scratch build).

AdamWill commented 1 year ago

FWIW, I figured out a better generic workaround for this for the update tests at least. The update tests build their own ostree (with the update packages included), so I've made that test give it a unique ref, not just name it the same as the official ref it's based off. So now we can verify that we installed our custom ref, and test rebasing to the official ref (we don't need to rebase to a different version).

This doesn't help the compose tests, which by design are testing the official images that deploy the official refs. But at least it solves the problem for update tests...
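The "verify that we installed our custom ref" step could be sketched as below. This is only an illustration, not the actual openQA test code: it assumes the parsed output of `rpm-ostree status --json` contains a `deployments` list whose entries have `booted` and `origin` keys, and the `silverblue-test` ref name in the sample is hypothetical.

```python
import json


def booted_origin(status: dict):
    """Return the origin refspec of the booted deployment, or None if absent."""
    for deployment in status.get("deployments", []):
        if deployment.get("booted"):
            return deployment.get("origin")
    return None


# Hypothetical status data: the custom ref name is made up for illustration.
SAMPLE_STATUS = json.loads("""
{
  "deployments": [
    {"booted": true,  "origin": "fedora:fedora/rawhide/x86_64/silverblue-test"},
    {"booted": false, "origin": "fedora:fedora/rawhide/x86_64/silverblue"}
  ]
}
""")

if __name__ == "__main__":
    print(booted_origin(SAMPLE_STATUS))
```

A test could compare this value against the custom ref before the rebase and against the official ref after rebooting into the rebased deployment.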

ziswiler commented 7 months ago

I am a long-time Silverblue user and find this bug very very inconvenient. I just got myself a new notebook. Installed 39 on it. Unfortunately, there are some other issues with 39 so I thought, well, it's Silverblue, I just re-base to whatever and be done with it. Nops, re-basing to 38 doesn't even boot! WTF! BTW: Re-basing to Rawhide works but that has the same other issue so won't help me any...

sandeen commented 7 months ago

Uff, I had missed these notifications. Too many bug/issue trackers.

Sure, you can turn on or off any on-disk features you want, if you want to override mkfs defaults, something anaconda normally doesn't want to do. But as others have said, rebasing down from rawhide doesn't seem like a normal (or even supported?) workflow. I don't know if mkfs parameters can be passed via kickstart; if so, that might be one option.

It was a bit antisocial for upstream to introduce the orphan_file feature and make it default all in the same release. That said, rawhide is supposed to be the place for the latest & greatest package versions and features. It's not a given that those packages will always be downgradeable to prior versions ...

In XFS land, for example, we'd add new features as available but non-default for several releases before making them default, to avoid this sort of problem and at least allow backwards compatibility across a few older versions.
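As a sketch of overriding the mkfs defaults mentioned above: `mke2fs` accepts `-O` with a `^` prefix to turn a feature off at creation time. The helper below only builds the argument list. The function name and the device path are hypothetical, and whether anaconda or kickstart can pass such options through is left open, as in the comment.

```python
def mkfs_ext4_argv(device, disabled_features=("orphan_file",)):
    """Build an mkfs.ext4 command line that disables the given features."""
    argv = ["mkfs.ext4"]
    if disabled_features:
        # mke2fs syntax: -O ^feature[,^feature...] clears features from the defaults.
        argv += ["-O", ",".join("^" + name for name in disabled_features)]
    argv.append(device)
    return argv


if __name__ == "__main__":
    # /dev/vdX1 is a placeholder device name, not a real path.
    print(" ".join(mkfs_ext4_argv("/dev/vdX1")))
```

The list is returned rather than executed, so it can be inspected or handed to `subprocess.run` by a caller that has deliberately chosen the target device.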

ziswiler commented 7 months ago

Note that this has nothing to do with Rawhide. This is a serious production issue as 39 can NOT be downgraded to 38.

travier commented 7 months ago

Rebasing to an older version is not supported: https://github.com/fedora-silverblue/issue-tracker/issues/470#issuecomment-1564647180

travier commented 7 months ago

Going to close this one as this is not something we can support. The e2fsprogs issue is tracked in https://github.com/tytso/e2fsprogs/issues/147.