LINBIT / drbd-utils

DRBD userspace utilities (for 9.x, 8.4, 8.3)
GNU General Public License v2.0

udev links break with newer systemd versions #44

Open atkinsam opened 6 months ago

atkinsam commented 6 months ago

DRBD's udev rules don't work with newer systemd versions when the machine's hostname changes during boot, as it does in many VMs and cloud hosts. Starting with systemd version 242, the new ProtectHostname sandbox setting is enabled by default in the systemd-udevd service, and it applies to any program run from udev rules (see systemd-udevd unit file and changelog). This breaks the rule that creates DRBD's udev symlinks: https://github.com/LINBIT/drbd-utils/blob/1caa04161811e085c7a1c4847c07eb663cf85e3f/scripts/drbd.rules.in#L6

When the udev rule executes, drbdadm sees the host's original (invalid) hostname, compares it to the hostname in the resource config, and fails:

(udev-worker)[93198]: drbd0: Starting '/usr/local/sbin/drbdadm sh-udev minor-0'
(udev-worker)[93198]: drbd0: '/usr/local/sbin/drbdadm sh-udev minor-0'(out) ''minor-0' not defined in your config (for this host).'
(udev-worker)[93198]: drbd0: Process '/usr/local/sbin/drbdadm sh-udev minor-0' failed with exit code 1.

As a result, the /dev/drbd/by-disk and /dev/drbd/by-res symlinks never get created.
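To illustrate why a stale hostname produces the "not defined in your config (for this host)" error, here is a minimal sketch of a per-host lookup. It is not drbdadm's actual parser or data model: the `CONFIG` mapping, the node names, and both function names are hypothetical, and it only assumes that the lookup is keyed on the hostname the process sees.

```python
import os

# Hypothetical, simplified stand-in for a resource config:
# resource name -> {hostname: minor number}. Not drbdadm's real format.
CONFIG = {
    "r0": {"node-a": 0, "node-b": 0},
}


def find_resource_for_minor(config, minor, hostname):
    """Return the resource that defines `minor` for `hostname`, or None."""
    for resource, hosts in config.items():
        if hosts.get(hostname) == minor:
            return resource
    return None


def visible_hostname():
    # uname() reports the hostname of the caller's own UTS namespace,
    # which may differ from the hostname seen elsewhere on the system.
    return os.uname().nodename


if __name__ == "__main__":
    host = visible_hostname()
    resource = find_resource_for_minor(CONFIG, 0, host)
    if resource is None:
        print(f"'minor-0' not defined in this sketch's config for host {host!r}")
    else:
        print(f"minor-0 belongs to resource {resource}")
```

With the hostname from the config (`"node-a"`) the lookup returns `"r0"`; with any other hostname, such as the one the machine had before it was renamed during boot, it returns `None`, which corresponds to the failure in the log above.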

I'm not sure what the right solution to this is. As far as I can tell, there is no way to scope these sandbox settings to individual udev rules or scripts. Disabling ProtectHostname on systemd-udevd via a systemd override solves this problem. Restarting the systemd-udevd service also solves it, as the hostname in the UTS namespace is refreshed. Neither of those seems like a tenable solution to me.
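For anyone trying to confirm that they are hitting this, one rough diagnostic is to check whether systemd-udevd runs in a different UTS namespace than PID 1. The sketch below compares the `/proc/<pid>/ns/uts` links; the function names and the `proc_root` parameter are my own, and reading another process's namespace link usually requires root.

```python
import os


def uts_namespace(pid, proc_root="/proc"):
    """Return the UTS namespace identifier of `pid`, e.g. 'uts:[4026531838]'."""
    return os.readlink(os.path.join(proc_root, str(pid), "ns", "uts"))


def shares_uts_namespace(pid_a, pid_b, proc_root="/proc"):
    """True if both processes live in the same UTS namespace."""
    return uts_namespace(pid_a, proc_root) == uts_namespace(pid_b, proc_root)
```

Calling `shares_uts_namespace(1, udevd_pid)` as root, where `udevd_pid` is the PID of the running systemd-udevd, should return `False` if the service has been given its own UTS namespace. That only shows the namespaces differ; comparing the hostname seen inside each is still a separate step.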

raltnoeder commented 6 months ago

Or rather, newer systemd versions don't work with DRBD's udev rules, because they invented new dirty magic tricks to avoid the fallout from slightly older dirty magic tricks invented by the VM and cloud crowd. I am getting kind of used to things constantly breaking as soon as virtualization, cloud, systemd or udev are involved, but still, fixing it will probably take a while, because we might have to set up an environment where we can recreate the breakage to check whether any workarounds that we can come up with will work.

WanzenBug commented 5 months ago

We are working on resolving the issue by changing the udev rules to use a new drbdsetup command. This is actually a much cleaner solution in any case, as it reads the actual configuration from the kernel instead of relying on what is in the config files.

There is less chance of breakage caused by invalid config files, mismatched configuration or, as with this issue, mismatched hostnames.