Closed davidcassany closed 1 year ago
systemd-udevd[601]: sda5: Failed to add device '/dev/sda5' to watch: Operation not permitted
What's the root cause of this failure ?
I have no clue... 🤷🏽♂️ It happened in only one node out of 50, so I believe it has to be a race condition somewhere. We can attempt to be more strict on mounting all the special paths at boot and wait for the devices if they are not there yet.
This is also strange
Extending last partition up to 0 MiB
Yes, the message here is misleading. There is a the expansion of the COS_PERSISTENT partition enabled by default (which I doubt is actually valuable for our current workflow). If size is set to 0
it expands to all available size, we did not properly cover this case in logging. Also, it did not expand as the partition already makes use of all available space.
I have been wondering about if the attempt of expanding the partition could cause some sort of collision with udev at boot. So wondering if at the time parted
accessed the partition (even without applying any change) that could cause some glitch with udev daemon.
I had the issue in a test: rancher-system-agent
service failed because binary is not found in /opt
as this filesystem is in read-only state.
I have this in /etc/fstab
:
/dev/loop0 / auto ro 0 0
tmpfs /run/overlay tmpfs defaults,size=25% 0 0
overlay /etc overlay defaults,lowerdir=/etc,upperdir=/run/overlay/etc.overlay/upper,workdir=/run/overlay/etc.overlay/work,x-systemd.requires-mounts-for=/run/overlay
/dev/disk/by-label/COS_OEM /oem auto defaults 0 0
overlay /srv overlay defaults,lowerdir=/srv,upperdir=/run/overlay/srv.overlay/upper,workdir=/run/overlay/srv.overlay/work,x-systemd.requires-mounts-for=/run/overlay
/dev/disk/by-label/COS_PERSISTENT /usr/local auto defaults 0 0
overlay /var overlay defaults,lowerdir=/var,upperdir=/run/overlay/var.overlay/upper,workdir=/run/overlay/var.overlay/work,x-systemd.requires-mounts-for=/run/overlay
Elemental packages:
# rpm -qa | grep elemental
elemental-dracut-config-0.10.3+git20230315.63e8c41-150400.94.1.noarch
elemental-grub-config-0.10.3+git20230315.63e8c41-150400.94.1.noarch
elemental-immutable-rootfs-0.10.3+git20230315.63e8c41-150400.94.1.noarch
elemental-register-1.1.4+git20230315.63c99d7-150400.115.1.x86_64
elemental-support-1.1.4+git20230315.63c99d7-150400.115.1.x86_64
elemental-system-agent-0.3.1-150400.2.2.x86_64
elemental-updater-1.1.2+git20230315.f7048f1-150400.212.1.noarch
elemental-cli-0.2.2+git20230310.c214054-150400.72.2.x86_64
elemental-init-setup-0.10.3+git20230315.63e8c41-150400.94.1.noarch
elemental-init-services-0.10.3+git20230315.63e8c41-150400.94.1.noarch
elemental-init-recovery-0.10.3+git20230315.63e8c41-150400.94.1.noarch
elemental-init-network-0.10.3+git20230315.63e8c41-150400.94.1.noarch
elemental-init-live-0.10.3+git20230315.63e8c41-150400.94.1.noarch
elemental-init-boot-assessment-0.10.3+git20230315.63e8c41-150400.94.1.noarch
elemental-init-config-0.10.3+git20230315.63e8c41-150400.94.1.noarch
elemental-toolkit-0.10.3+git20230315.63e8c41-150400.94.1.noarch
elemental-1.1.2+git20230315.f7048f1-150400.212.1.noarch
Here the full journalctl log: boot.log.gz
A reboot of the node fixed the issue, so maybe an issue when the /etc/fstab
file is generating during the boot?
This looks like a missing udevadm settle
🤔
Mar 15 21:15:51 localhost elemental[784]: INFO[2023-03-15T21:15:51Z] Extending last partition up to 0 MiB
Mar 15 21:15:51 localhost systemd-udevd[617]: sda5: Failed to add device '/dev/sda5' to watch: Operation not permitted
And here's the first error about a read-only filesystem
Mar 15 21:15:52 localhost elemental[1039]: INFO[2023-03-15T21:15:52Z] Executing /system/oem/01_elemental-rootfs.yaml
Mar 15 21:15:52 localhost elemental[1039]: INFO[2023-03-15T21:15:52Z] Applying 'Elemental Rootfs Layout Settings' for stage 'initramfs'. Total stages: 3
Mar 15 21:15:52 localhost elemental[1039]: INFO[2023-03-15T21:15:52Z] Processing stage step ''. ( commands: 1, files: 0, ... )
Mar 15 21:15:52 localhost elemental[1039]: INFO[2023-03-15T21:15:52Z] Command output: mkdir: cannot create directory ‘/usr/local/etc’: Read-only file system
Mar 15 21:15:52 localhost elemental[1039]: sh: line 2: /usr/local/etc/hostname: No such file or directory
@ldevulder is that Dev, Staging or Stable ?
Mar 15 21:15:51 localhost systemd-fsck[978]: Failed to stat /dev/disk/by-label/COS_PERSISTENT: No such file or directory
😮
I'd say we have few of things to consider here, all actionable:
@ldevulder is that Dev, Staging or Stable ?
It's on Dev, Staging should be the same as it seems to have happened before the last sync between Dev and Staging. Not tested with Stable.
In rancher/elemental-toolkit#1743 instead of explicitly erroring out in case there is some error to prepare or mount a mountpoint and just changed to logic to not ignore it an still include the mountpoint in fstab. The idea is that we keep creating the fstab as requested but do not fail if we can't mount everything, systemd will later on fail if it can't manage to mount all fstab entries.
Moved to needs review as verifying the fix requires some tests running big clusters (the issue turned to be more or less consistent with clusters with more the 100 nodes).
To fully verify the fix we should see in some node boot logs that there was the following udev error message:
systemd-udevd[601]: sda5: Failed to add device '/dev/sda5' to watch: Operation not permitted
and despite this, the node successfully booted and joined the cluster. I guess that means parsing the boot logs of every single node.
Apparently the fix was not enough... 😞 looks like there might be some sort of issue with udev. With the logs of a failed node including the fix we noticed a slightly different behavior. The Operation not permitted
on device /dev/sda5
was not there, however cos-mount-layout.sh still failed properly prepare the layout.
Mar 20 10:23:05 localhost systemd[1]: Starting cOS system immutable rootfs mounts...
Mar 20 10:23:05 localhost systemd[1]: Unmounting /oem...
Mar 20 10:23:05 localhost systemd[1]: initrd-parse-etc.service: Deactivated successfully.
Mar 20 10:23:05 localhost systemd[1]: Finished Reload Configuration from the Real Root.
Mar 20 10:23:05 localhost systemd[1]: Starting dracut mount hook...
Mar 20 10:23:05 localhost systemd[1]: oem.mount: Deactivated successfully.
Mar 20 10:23:05 localhost systemd[1]: Unmounted /oem.
Mar 20 10:23:05 localhost systemd[1]: Finished dracut mount hook.
Mar 20 10:23:05 localhost kernel: EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Mar 20 10:24:06 localhost systemd-fsck[1096]: Failed to stat /dev/disk/by-label/COS_PERSISTENT: No such file or directory
Mar 20 10:24:06 localhost cos-mount-layout[1097]: Warning: /dev/disk/by-label/COS_PERSISTENT already mounted or device not found
Mar 20 10:24:06 localhost cos-mount-layout[1107]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1105]: failed creating '/sysroot/etc/systemd' or '/sysroot/usr/local/.state/etc-systemd.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1110]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1108]: failed creating '/sysroot/etc/rancher' or '/sysroot/usr/local/.state/etc-rancher.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1113]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1111]: failed creating '/sysroot/etc/ssh' or '/sysroot/usr/local/.state/etc-ssh.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1116]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1114]: failed creating '/sysroot/etc/iscsi' or '/sysroot/usr/local/.state/etc-iscsi.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1119]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1117]: failed creating '/sysroot/etc/cni' or '/sysroot/usr/local/.state/etc-cni.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1122]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1120]: failed creating '/sysroot/home' or '/sysroot/usr/local/.state/home.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1125]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1123]: failed creating '/sysroot/opt' or '/sysroot/usr/local/.state/opt.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1128]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1126]: failed creating '/sysroot/root' or '/sysroot/usr/local/.state/root.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1131]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1129]: failed creating '/sysroot/usr/libexec' or '/sysroot/usr/local/.state/usr-libexec.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1134]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1132]: failed creating '/sysroot/var/log' or '/sysroot/usr/local/.state/var-log.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1137]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1135]: failed creating '/sysroot/var/lib/elemental' or '/sysroot/usr/local/.state/var-lib-elemental.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1140]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1138]: failed creating '/sysroot/var/lib/rancher' or '/sysroot/usr/local/.state/var-lib-rancher.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1143]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1141]: failed creating '/sysroot/var/lib/kubelet' or '/sysroot/usr/local/.state/var-lib-kubelet.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1146]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1144]: failed creating '/sysroot/var/lib/NetworkManager' or '/sysroot/usr/local/.state/var-lib-NetworkManager.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1149]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1147]: failed creating '/sysroot/var/lib/longhorn' or '/sysroot/usr/local/.state/var-lib-longhorn.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1152]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1150]: failed creating '/sysroot/var/lib/cni' or '/sysroot/usr/local/.state/var-lib-cni.bind'
Mar 20 10:24:06 localhost cos-mount-layout[1155]: mkdir: cannot create directory '/sysroot/usr/local/.state': Read-only file system
Mar 20 10:24:06 localhost cos-mount-layout[1153]: failed creating '/sysroot/var/lib/calico' or '/sysroot/usr/local/.state/var-lib-calico.bind'
Mar 20 10:24:06 localhost systemd[1]: Finished cOS system immutable rootfs mounts.
See the logs cos-mount-layout.sh does not succeed to prepare the mountpoints due to the missing device /dev/disk/by-label/COS_PERSISTENT
, it even waits for it during 1min having a udevadm settle
call on every second. The interesting fact is that later on when switching root the device is found properly and all mountpoints are created according to the generated /etc/fstab
. Rebooting fixes the issue as before.
Time to raise a kernel/systemd bug via Bugzilla ?! 🤔
We already had a couple of successful big deployments with 150 nodes and more. Closing as fixed, both successful runs were including https://github.com/rancher/elemental-toolkit/pull/1744
What steps did you take and what happened:
The issue raised on a 50 nodes deployment on the CI. There was one node that did not join the cluster. This happened because the immutable-rootfs layout failed to properly mount the persistent partition and all of the state RW paths depending on it. Once booted the
bootstrap.sh
script failed to install RKE2 as /opt was not writable.What did you expect to happen: The node boostraps RKE2 and joins the cluster. Probably we should just fail at boot time, immutable-rootfs simply omitted to create the persistent mountpoints and continued to boot normally, probably better to include all them in /etc/fstab and let systemd fail on later stages if the underlaying devices and mountpoints do not appear.
Worth to mention that on boot
/usr/local
got mounted based on/etc/fstab
contents after switching root when systemd took control of the new root/
.Anything else you would like to add:
Journalctl logs of the initrd boot stage contain most of the relevant info:
Environment:
cat /etc/os-release
): 1.1.1+git20230311.9525ad7kubectl version
):