Fedora Silverblue issue tracker
https://fedoraproject.org/atomic-desktops/silverblue/

Unable to rebase to Fedora 40 due to SELinux definition errors. #554

Closed: jbirch closed this issue 2 months ago

jbirch commented 2 months ago

Describe the bug: On my current machine, I am unable to rebase from fedora/39/x86_64/silverblue to fedora/40/x86_64/silverblue, with or without layered packages. The rebase command completes successfully, but the finalise process fails due to an SELinux type re-declaration error:

$ journalctl -b -1 -u ostree-finalize-staged.service
Apr 23 20:28:49 nangou systemd[1]: Finished ostree-finalize-staged.service - OSTree Finalize Staged Deployment.
Apr 23 20:29:01 nangou systemd[1]: Stopping ostree-finalize-staged.service - OSTree Finalize Staged Deployment...
Apr 23 20:29:02 nangou ostree[4370]: Finalizing staged deployment
Apr 23 20:29:03 nangou ostree[4370]: Copying /etc changes: 56 modified, 0 removed, 87 added
Apr 23 20:29:03 nangou ostree[4370]: Copying /etc changes: 56 modified, 0 removed, 87 added
Apr 23 20:29:03 nangou ostree[4370]: Refreshing SELinux policy
Apr 23 20:29:05 nangou ostree[4379]: Re-declaration of type virt_bridgehelper_t
Apr 23 20:29:05 nangou ostree[4379]: Previous declaration of type at /etc/selinux/targeted/tmp/modules/100/virt_supplementary/cil:5
Apr 23 20:29:05 nangou ostree[4379]: Bad type declaration at /etc/selinux/targeted/tmp/modules/100/virt_supplementary/cil:5
Apr 23 20:29:05 nangou ostree[4379]: Failed to build AST
Apr 23 20:29:05 nangou ostree[4379]: semodule:  Failed!
Apr 23 20:29:05 nangou ostree[4370]: Refreshed SELinux policy in 2593 ms
Apr 23 20:29:05 nangou ostree[4370]: error: Finalizing deployment: Finalizing SELinux policy: Child process exited with code 1
Apr 23 20:29:05 nangou systemd[1]: ostree-finalize-staged.service: Control process exited, code=exited, status=1/FAILURE
Apr 23 20:29:05 nangou systemd[1]: ostree-finalize-staged.service: Failed with result 'exit-code'.
Apr 23 20:29:05 nangou systemd[1]: Stopped ostree-finalize-staged.service - OSTree Finalize Staged Deployment.
Apr 23 20:29:05 nangou systemd[1]: ostree-finalize-staged.service: Consumed 3.214s CPU time.
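
When the journal is long, the offending type names can be pulled out mechanically. The following is only a sketch: `redeclared_types` is a hypothetical helper name, and it assumes the message wording matches the "Re-declaration of type ..." lines shown above.

```shell
# Hypothetical helper: read journal text on stdin and print each SELinux type
# that was reported as re-declared, one per line, without duplicates.
# Assumes the "Re-declaration of type <name>" wording seen in the log above.
redeclared_types() {
  sed -n 's/.*Re-declaration of type \([A-Za-z0-9_]*\).*/\1/p' | sort -u
}

# Usage on a real system (reads the journal of the previous boot):
#   journalctl -b -1 -u ostree-finalize-staged.service | redeclared_types
```

For the log above, this would report the single type virt_bridgehelper_t.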

This happens from both deployments listed here; that is, it happens even after an rpm-ostree reset (the ostree-finalize-staged output above is from the reset deployment):

$ rpm-ostree status
State: idle
Deployments:
● fedora:fedora/39/x86_64/silverblue
                  Version: 39.20240423.0 (2024-04-23T00:42:57Z)
               BaseCommit: 3ec7ccfe318969ff8ae4d49253dd560b84ca5ad81554e69ecc6fd2dd2b664e3f
             GPGSignature: Valid signature by E8F23996F23218640CB44CBE75CF5AC418B8E74C
          LayeredPackages: emacs-nox gnome-tweaks gparted htop irssi langpacks-en levien-inconsolata-fonts libusb1-devel lm_sensors lshw powertop pv strace wavemon zsh

  fedora:fedora/39/x86_64/silverblue
                  Version: 39.20240423.0 (2024-04-23T00:42:57Z)
                   Commit: 3ec7ccfe318969ff8ae4d49253dd560b84ca5ad81554e69ecc6fd2dd2b664e3f
             GPGSignature: Valid signature by E8F23996F23218640CB44CBE75CF5AC418B8E74C

To Reproduce: On my local machine, after ensuring a reboot:

  1. sudo rpm-ostree rebase -b fedora/40/x86_64/silverblue — apparently successful
  2. systemctl reboot — apparently successful
  3. Observe the system is still in Fedora 39, as evidenced by /etc/*release* and rpm-ostree status
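
Because steps 1 and 2 both appear to succeed, the failure is only visible in the previous boot's journal. As a rough sketch, the check can be scripted; `finalize_failed` is a hypothetical helper name, and it assumes systemd's "Failed with result" wording as it appears in the journal output in this report.

```shell
# Hypothetical helper: exit 0 when journal text on stdin shows that
# ostree-finalize-staged.service ended in failure, non-zero otherwise.
# -F matches the string literally; -q suppresses output.
finalize_failed() {
  grep -qF "ostree-finalize-staged.service: Failed with result"
}

# Usage on a real system, after the reboot in step 2:
#   journalctl -b -1 -u ostree-finalize-staged.service | finalize_failed \
#     && echo "staged deployment was not finalized"
```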

However, I was unable to reproduce this on two fresh virtual machines, one with no layered packages and one with my chosen layered packages. The above steps successfully upgraded both virtual machines to Fedora 40.

Expected behavior: After a successful rebase command and successful reboot, I expect to be in Fedora 40.

OS version:

$ rpm-ostree status -b
State: idle
BootedDeployment:
● fedora:fedora/39/x86_64/silverblue
                  Version: 39.20240423.0 (2024-04-23T00:42:57Z)
               BaseCommit: 3ec7ccfe318969ff8ae4d49253dd560b84ca5ad81554e69ecc6fd2dd2b664e3f
             GPGSignature: Valid signature by E8F23996F23218640CB44CBE75CF5AC418B8E74C
          LayeredPackages: emacs-nox gnome-tweaks gparted htop irssi langpacks-en levien-inconsolata-fonts libusb1-devel lm_sensors lshw powertop pv strace wavemon zsh

Additional context: It's not immediately clear to me where the duplicated definitions are coming from, as the /etc/selinux/targeted/tmp/ directory doesn't persist long enough for me to see where the conflicting declarations originate. This system started as a Silverblue 38 machine and was rebased to 39 without incident.

travier commented 2 months ago

Can you try resetting your SELinux policy as well? https://docs.fedoraproject.org/en-US/fedora-silverblue/troubleshooting/#_selinux_problems

jbirch commented 2 months ago

Oh, fantastic lead. I indeed have some modifications!

$ sudo ostree admin config-diff | grep policy
[sudo] password for jbirch: 
M    selinux/targeted/.policy.sha512
M    selinux/targeted/active/policy.linked
M    selinux/targeted/active/policy.kern
M    selinux/targeted/policy/policy.33

Following along with the instructions results in:

$ sudo ostree admin config-diff | grep policy
A    selinux.bak/targeted/.policy.sha512
A    selinux.bak/targeted/policy
A    selinux.bak/targeted/policy/policy.33
A    selinux.bak/targeted/active/policy.linked
A    selinux.bak/targeted/active/policy.kern
A    selinux.bak/targeted/active/modules/100/policykit
A    selinux.bak/targeted/active/modules/100/policykit/cil
A    selinux.bak/targeted/active/modules/100/policykit/hll
A    selinux.bak/targeted/active/modules/100/policykit/lang_ext

(i.e., a bunch of things were backed up, but the actual SELinux policy now aligns with what's in the tree)
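
To make that before/after comparison less error-prone, the config-diff output can be filtered for locally modified files under selinux/ only. This is a minimal sketch: `selinux_drift` is a hypothetical helper name, and it assumes the "M"/"A" status column followed by a path, as in the two listings above.

```shell
# Hypothetical helper: read `ostree admin config-diff` output on stdin and
# print the paths under selinux/ that are marked as modified (M).
# Added (A) entries, such as the selinux.bak/ backup copy, are ignored.
selinux_drift() {
  awk '$1 == "M" && $2 ~ /^selinux\// { print $2 }'
}

# Usage on a real system:
#   sudo ostree admin config-diff | selinux_drift
```

Empty output means no modified policy files remain under /etc/selinux, which matches the second listing above.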

I'll move these off-tree and report back about the status of an upgrade.

jbirch commented 2 months ago

This was enough to upgrade to Fedora 40. I'll keep the config-diff in my back pocket for the future; it was definitely the missing piece I needed to identify the issue.

I don't recall what I could have possibly done to modify SELinux definitions, but given that I've not been able to find anyone else on the internet yet who has run into the same thing, I'm willing to admit it was probably my own fault. I appreciate your help getting this resolved; there's probably nothing further to do here. Thanks again!

travier commented 2 months ago

There was a bug at some point that made the policy "local" for some systems, so you might have hit it.

Glad it got resolved.