okd-project / okd

The self-managing, auto-upgrading, Kubernetes distribution for everyone
https://okd.io
Apache License 2.0

[4.12 upgrade] Node upgrade fails because of SELinux policies preventing `nm-dispatcher` from working #1475

Closed · rassie closed this 3 months ago

rassie commented 1 year ago

Describe the bug

Upgrading OKD from 4.11 to 4.12, I'm blocked by kubelets not starting on both master and worker nodes. The problem is the same on both: the file /run/resolv-prepender-kni-conf-done does not get created, so kubelet's start pre-condition is not met. Logs are full of SELinux denials preventing nm-dispatcher from reading NetworkManager's configuration:

Jan 22 21:50:02 okd-xwwxf-master-2 audit[1087]: AVC avc:  denied  { read } for  pid=1087 comm="nm-dispatcher" name="dispatcher.d" dev="sda4" ino=90264444 scontext=system_u:system_r:NetworkManager_dispatcher_t:s0 tcontext=system_u:object_r:NetworkManager_initrc_exec_t:s0 tclass=dir permissive=0
Jan 22 21:50:02 okd-xwwxf-master-2 audit[1087]: SYSCALL arch=c000003e syscall=257 success=no exit=-13 a0=ffffff9c a1=561faac18790 a2=90800 a3=0 items=0 ppid=1 pid=1087 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="nm-dispatcher" exe="/usr/libexec/nm-dispatcher" subj=system_u:system_r:NetworkManager_dispatcher_t:s0 key=(null)
Jan 22 21:50:02 okd-xwwxf-master-2 audit: PROCTITLE proctitle="/usr/libexec/nm-dispatcher"
Jan 22 21:50:02 okd-xwwxf-master-2 nm-dispatcher[1087]: req:53 'connectivity-change': find-scripts: Failed to open dispatcher directory '/etc/NetworkManager/dispatcher.d': Error opening directory "/etc/NetworkManager/dispatcher.d": Permission denied

Version

IPI with vSphere, 4.11.0-0.okd-2023-01-14-152430 updating to 4.12.0-0.okd-2023-01-21-055900.

How reproducible

100% so far. Adding a node works, but only because new nodes boot an earlier version of Fedora CoreOS, which will presumably get updated in time and fail too.

Log bundle

https://drive.google.com/file/d/16oVumQ6SAHoiP2FlvItbAsIY87CvcW64/view?usp=sharing

vrutkovs commented 1 year ago

MCO operator says

pool is degraded because nodes fail with "1 nodes are reporting degraded status on sync": "Node okd-xwwxf-worker-p8pjt is reporting: \"failed to drain node: okd-xwwxf-worker-p8pjt after 1 hour. Please see machine-config-controller logs for more information\""

MCO controller says:

2023-01-23T09:31:21.543073234Z E0123 09:31:21.543021       1 drain_controller.go:110] error when evicting pods/"rook-ceph-osd-1-c7b8c8b49-8xm58" -n "rook-ceph" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
2023-01-23T09:31:21.543325781Z E0123 09:31:21.543290       1 drain_controller.go:110] error when evicting pods/"rook-ceph-osd-2-56bd8bd885-zbd6k" -n "rook-ceph" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
2023-01-23T09:31:26.544001981Z I0123 09:31:26.543939       1 drain_controller.go:110] evicting pod rook-ceph/rook-ceph-osd-1-c7b8c8b49-8xm58
2023-01-23T09:31:26.544041726Z I0123 09:31:26.543996       1 drain_controller.go:110] evicting pod rook-ceph/rook-ceph-osd-2-56bd8bd885-zbd6k
2023-01-23T09:31:26.544041726Z I0123 09:31:26.544023       1 drain_controller.go:139] node okd-xwwxf-worker-p8pjt: Drain failed. Drain has been failing for more than 10 minutes. Waiting 5 minutes then retrying. Error message from drain: [error when evicting pods/"rook-ceph-osd-1-c7b8c8b49-8xm58" -n "rook-ceph": global timeout reached: 1m30s, error when evicting pods/"rook-ceph-osd-2-56bd8bd885-zbd6k" -n "rook-ceph": global timeout reached: 1m30s]
vrutkovs commented 1 year ago

Checking why okd-xwwxf-master-2 is not coming back from the reboot

vrutkovs commented 1 year ago

in 4.11 -> 4.12 we upgrade from F36 to F37. NM dispatcher on F37 expects scripts to be labelled with system_u:object_r:NetworkManager_dispatcher_script_t:s0, but on F36 files are labelled with system_u:object_r:NetworkManager_initrc_exec_t:s0.
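The mismatch is visible directly in the AVC denial quoted in the report. A minimal sketch (assuming the audit record format shown above; the `avc` variable here is just a copy of that log line) that pulls out the two SELinux types:

```shell
# Sketch: extract the source (process) and target (file) SELinux types
# from an AVC denial line to confirm the label mismatch.
avc='avc: denied { read } for pid=1087 comm="nm-dispatcher" scontext=system_u:system_r:NetworkManager_dispatcher_t:s0 tcontext=system_u:object_r:NetworkManager_initrc_exec_t:s0 tclass=dir'

# Contexts have the form user:role:type:level; the type is field 3.
stype=$(echo "$avc" | grep -o 'scontext=[^ ]*' | cut -d: -f3)
ttype=$(echo "$avc" | grep -o 'tcontext=[^ ]*' | cut -d: -f3)

echo "domain=$stype target=$ttype"
# domain=NetworkManager_dispatcher_t target=NetworkManager_initrc_exec_t
```

The F37 policy only allows the `NetworkManager_dispatcher_t` domain to read files labelled `NetworkManager_dispatcher_script_t`, so the stale `NetworkManager_initrc_exec_t` label on the target is denied.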

Workaround: boot with selinux=0.

Not sure why the MCD/rpm-ostree rebase didn't update the labels. Possibly an rpm-ostree/MCO regression? cc @cgwalters

cgwalters commented 1 year ago

> boot with selinux=0

No, that's effectively a one-way transition. We want enforcing=0 here.

As far as the incorrect label... hmm, that definitely needs some debugging. Does `ostree admin config-diff | grep selinux` show that you have a modified policy?

Does `restorecon -R -v /etc/NetworkManager/dispatcher.d/` show anything?

rassie commented 1 year ago

@cgwalters

On an upgrading node after OS update restart:

[root@okd-xwwxf-master-1 ~]# rpm-ostree status
State: idle
Deployments:
ā— ostree-unverified-registry:quay.io/openshift/okd-content@sha256:01d90a996a2e78a0486616a0adba0733db428f5a1074976054cedff62f17b2ac
                   Digest: sha256:01d90a996a2e78a0486616a0adba0733db428f5a1074976054cedff62f17b2ac
                  Version: 37.20221225.3.0 (2023-01-23T14:28:02Z)

  pivot://quay.io/openshift/okd-content@sha256:bc4fe370cd76415d045b6cc2cf08e5f696ece912661cfe4370910020be9fe0b6
             CustomOrigin: Managed by machine-config-operator
                  Version: 411.36.202301141513-0 (2023-01-14T15:17:08Z)
[root@okd-xwwxf-master-1 ~]# ostree admin config-diff |grep selinux
M    selinux/targeted/active/commit_num
A    selinux/targeted/semanage.read.LOCK
A    selinux/targeted/semanage.trans.LOCK
[root@okd-xwwxf-master-1 ~]# restorecon -R -v /etc/NetworkManager/dispatcher.d/
Relabeled /etc/NetworkManager/dispatcher.d from system_u:object_r:NetworkManager_initrc_exec_t:s0 to system_u:object_r:NetworkManager_dispatcher_script_t:s0
Relabeled /etc/NetworkManager/dispatcher.d/pre-up.d from system_u:object_r:NetworkManager_initrc_exec_t:s0 to system_u:object_r:NetworkManager_dispatcher_script_t:s0
Relabeled /etc/NetworkManager/dispatcher.d/pre-up.d/10-ofport-request.sh from system_u:object_r:NetworkManager_initrc_exec_t:s0 to system_u:object_r:NetworkManager_dispatcher_script_t:s0
Relabeled /etc/NetworkManager/dispatcher.d/30-resolv-prepender from system_u:object_r:NetworkManager_initrc_exec_t:s0 to system_u:object_r:NetworkManager_dispatcher_script_t:s0
Relabeled /etc/NetworkManager/dispatcher.d/99-vsphere-disable-tx-udp-tnl from system_u:object_r:NetworkManager_initrc_exec_t:s0 to system_u:object_r:NetworkManager_dispatcher_script_t:s0
cgwalters commented 1 year ago

> but on F36 files are labelled with system_u:object_r:NetworkManager_initrc_exec_t:s0.

I booted 36.20221030.3.0 and that doesn't seem to be true, I see

[root@cosa-devsh ~]# ls -alZ /etc/NetworkManager/dispatcher.d/
total 4
drwxr-xr-x. 5 root root system_u:object_r:NetworkManager_dispatcher_script_t:s0         111 Nov 11 15:55 .
drwxr-xr-x. 7 root root system_u:object_r:NetworkManager_etc_t:s0                       134 Nov 11 15:55 ..
-rwxr--r--. 1 root root system_u:object_r:NetworkManager_dispatcher_console_script_t:s0 506 Nov 11 15:55 90-console-login-helper-messages-gensnippet_if
drwxr-xr-x. 2 root root system_u:object_r:NetworkManager_dispatcher_script_t:s0           6 Nov 11 15:55 no-wait.d
drwxr-xr-x. 2 root root system_u:object_r:NetworkManager_dispatcher_script_t:s0           6 Nov 11 15:55 pre-down.d
drwxr-xr-x. 2 root root system_u:object_r:NetworkManager_dispatcher_script_t:s0           6 Nov 11 15:55 pre-up.d

This is on a stock node.

dustymabe commented 1 year ago

Some dispatcher related issues were documented in https://github.com/coreos/fedora-coreos-tracker/issues/1218. Not sure if that's part of the problem here or not.

tthrone-atomic commented 1 year ago

The following worked for my 4.11 to 4.12 upgrade (vSphere IPI); I did not need to set enforcing=0 on boot: `restorecon -R -v /etc/NetworkManager/dispatcher.d/`

kalik1 commented 1 year ago

Hi, I had the same issue. For those stuck in the middle of a blocked update, the workaround at the following URL worked in my case: https://github.com/okd-project/okd/issues/1317#issuecomment-1257004454

nate-duke commented 1 year ago

We've recently hit this while updating one of our clusters and are a bit concerned with the impact this has on MachineSet scaling or other "new provisioning" scenarios in existing clusters. Are there any potential workarounds aside from the MachineConfig workaround in https://github.com/okd-project/okd/issues/1317#issuecomment-1257004454?

We've done some limited testing of that workaround and it doesn't appear to work for new systems. What we've seen is that systems get provisioned but never make it to a running Node. We're going to do some more testing to see what additional issues are being encountered, but given that we're in uncharted territory I'm reluctant to post an issue on an environment that's had a workaround applied to it.

nate-duke commented 1 year ago

Is there anything else we can do to determine the cause of this? It seems to still be impacting new Machine builds in 4.12.0-0.okd-2023-03-05-022504.

There's an FCOS issue mentioned upthread, and then there are #1438 and #1450, where SELinux seems to be at play as in this issue, but there's no clear identification (to me at least) of where the root of the issue is and thus where we can focus for a fix.

Happy to help test in any way that we can.

Bengrunt commented 1 year ago

Hello, I hit the same issue when upgrading a cluster from 4.11 to 4.12 and tried to apply the workaround mentioned here:

> Hi, I had same issue. For those who should be in the situation of a blocked update, the workaround at the following url worked in my case: #1317 (comment)

However, I now hit another issue, related to OVN, with no idea whatsoever how to debug it :/

ovsdb-server[1248]: ovs|00002|stream_ssl|ERR|SSL_use_certificate_file: error:80000002:system library::No such file or directory
ovsdb-server[1248]: ovs|00003|stream_ssl|ERR|SSL_use_PrivateKey_file: error:10080002:BIO routines::system lib
ovs-ctl[1202]: Starting ovsdb-server.
[...]
ovs-vswitchd[1328]: ovs|00007|stream_ssl|ERR|SSL_use_certificate_file: error:80000002:system library::No such file or directory
ovs-vswitchd[1328]: ovs|00008|stream_ssl|ERR|SSL_use_PrivateKey_file: error:10080002:BIO routines::system lib
ovs-vswitchd[1328]: ovs|00009|stream_ssl|ERR|failed to load client certificates from /ovn-ca/ca-bundle.crt: error:0A080002:SSL routines::system lib

Then I get no network connectivity on the node. Or maybe OVN fails to come up because NetworkManager didn't boot properly, but that does not appear in the logs?

Thanks a lot in advance for any help regarding this.

nate-duke commented 1 year ago

@Bengrunt To be clear, you shelled into the broken node(s) and executed the restorecon spell and then rebooted and were met with the above error in ovs?

(if so you'll probably want to open a new issue and attach a must-gather to get some visibility. also be sure to mention that it's currently an ovn issue!)

Bengrunt commented 1 year ago

@nate-duke Not exactly, what I did was:

  1. Pause MCPs (master and worker) since I had two nodes stuck in the middle of the upgrade.
  2. Create the two machine configs mentioned in this other issue and referred to as a possible workaround above
  3. Reboot the failing node on the previous rpm-ostree release (eg. FCOS 36/OKD 4.11)
  4. Wait for new MCs to be rendered and include the fix
  5. Unpause the MCPs
  6. Hope for the best šŸ¤ž
  7. End up with this new error šŸ˜¢

But maybe you're right I should rather open a new bug, sorry about that.

Bengrunt commented 1 year ago

Hello, just to let other users who hit this issue know: I eventually managed to make the above-mentioned workaround work, by running it manually in single mode on the nodes and overriding the MCD's validation process.

Thus, I managed to carry out the cluster upgrade process and then run two successive cluster upgrades without any issue.

So I imagine that others with clusters deployed back in 4.6 or 4.7 could work around this issue using the same technique.

Feels like I learned a lot about FCOS and rpm-ostree and the MCO/MCD in the process šŸ˜†

MattPOlson commented 1 year ago

I posted this in the bug referenced above, but this service, rhcos-selinux-policy-upgrade.service, is supposed to be rebuilding the SELinux policy; it's not running because it's trying to use a variable that doesn't exist in FCOS.

RHEL_VERSION=$(. /usr/lib/os-release && echo ${RHEL_VERSION:-})
echo -n "RHEL_VERSION=${RHEL_VERSION:-}"

It probably needs to be updated to just use VERSION:

NAME="Fedora Linux"
VERSION="37.20230303.3.0 (CoreOS)" 
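The failure mode can be reproduced without a node. A sketch, under the assumption that FCOS's /usr/lib/os-release looks like the fragment above (here written to a stand-in path, /tmp/os-release, for illustration):

```shell
# Sketch: FCOS's os-release defines NAME and VERSION but no RHEL_VERSION,
# so the expansion the service relies on comes back empty.
unset RHEL_VERSION
cat > /tmp/os-release <<'EOF'
NAME="Fedora Linux"
VERSION="37.20230303.3.0 (CoreOS)"
EOF

# What the service computes today: empty on FCOS.
RHEL_VERSION=$(. /tmp/os-release && echo ${RHEL_VERSION:-})
echo "RHEL_VERSION='${RHEL_VERSION}'"   # RHEL_VERSION=''

# What it would get by reading VERSION instead.
OS_VERSION=$(. /tmp/os-release && echo "${VERSION:-}")
echo "OS_VERSION='${OS_VERSION}'"       # OS_VERSION='37.20230303.3.0 (CoreOS)'
```

With RHEL_VERSION empty, whatever condition the service gates on never matches, so the policy rebuild is skipped on FCOS.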
alexzose commented 1 year ago

Hello, we experience the same issue with an IPI installation on OpenStack.

The initial cluster version was 4.8, and we have been updating since then.

After the update to 4.12, kubelet fails to start because NetworkManager scripts have incorrect SELinux labels, and the file /run/resolv-prepender-kni-conf-done is never created.

Running `restorecon -vR /etc/NetworkManager/dispatcher.d/` seems to fix the issue for the kubelet, which then starts normally, but afterward afterburn-hostname.service fails on boot. A manual restart of afterburn-hostname.service runs OK though.

danielchristianschroeter commented 1 year ago

I ran into the same error situation when upgrading from 4.11.0-0.okd-2023-01-14-152430 to 4.12.0-0.okd-2023-04-16-041331. The incorrect labels result in:

May 24 06:45:02 okd-01-zvldl-worker-fxl6k systemd[1]: kubelet.service: Failed with result 'exit-code'.
May 24 06:45:02 okd-01-zvldl-worker-fxl6k systemd[1]: Failed to start kubelet.service - Kubernetes Kubelet.

After executing `restorecon -vR /etc/NetworkManager/dispatcher.d/; semodule -B; systemctl restart NetworkManager; systemctl restart kubelet` it works temporarily.

During the upgrade process, once a node switches to the "Not Ready" state, `restorecon -vR /etc/NetworkManager/dispatcher.d/; semodule -B` was enough to continue the upgrade. The labels get reset at the beginning of the upgrade, so running the command up front on every node does not work.

nate-duke commented 1 year ago

So, we're still dealing with this on every new node provision (and nearly if not every update?). Is there a recommended place we can file an issue to get this fixed in FCOS as mentioned in https://github.com/okd-project/okd/issues/1475#issuecomment-1513272933?

LorbusChris commented 1 year ago

ah yes. Please file an issue on https://github.com/openshift/os/

Something like: "rhcos-selinux-policy-upgrade.service broken on OKD"

Please also include a link to this issue here.

JaimeMagiera commented 3 months ago

Hi,

We are not working on FCOS builds of OKD anymore. Please see these documents:

https://okd.io/blog/2024/06/01/okd-future-statement
https://okd.io/blog/2024/07/30/okd-pre-release-testing

Please test with the OKD SCOS nightlies and file a new issue as needed.

Many thanks,

Jaime