opensvc / multipath-tools

systemd-udev-settle.service is deprecated. Please fix multipathd.service not to pull it in. #3

Closed llebout closed 2 years ago

llebout commented 3 years ago
$ systemctl status systemd-udev-settle.service
● systemd-udev-settle.service - Wait for udev To Complete Device Initialization
     Loaded: loaded (/usr/lib/systemd/system/systemd-udev-settle.service; static)
     Active: failed (Result: exit-code) since Sat 2021-02-06 14:22:23 CET; 7min ago
       Docs: man:systemd-udev-settle.service(8)
    Process: 2615 ExecStart=/usr/sbin/udevadm settle (code=exited, status=1/FAILURE)
   Main PID: 2615 (code=exited, status=1/FAILURE)
        CPU: 1.229s

Feb 06 14:20:23 talos systemd[1]: Starting Wait for udev To Complete Device Initialization...
Feb 06 14:20:24 talos udevadm[2615]: systemd-udev-settle.service is deprecated. Please fix multipathd.service not to pull it in.
Feb 06 14:22:23 talos systemd[1]: systemd-udev-settle.service: Main process exited, code=exited, status=1/FAILURE
Feb 06 14:22:23 talos systemd[1]: systemd-udev-settle.service: Failed with result 'exit-code'.
Feb 06 14:22:23 talos systemd[1]: Failed to start Wait for udev To Complete Device Initialization.
Feb 06 14:22:23 talos systemd[1]: systemd-udev-settle.service: Consumed 1.229s CPU time.

Should you follow systemd's advice here?
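
For context, one unit pulls in another through dependency directives such as `Wants=` or `Requires=` in its `[Unit]` section. The Python sketch below lists the directives in a unit file that name `systemd-udev-settle.service`. The sample unit text is illustrative only and is not a verbatim copy of the shipped multipathd.service; the helper name `find_references` is made up for this example.

```python
def find_references(unit_text, target="systemd-udev-settle.service"):
    """Return (directive, section) pairs whose value lists `target`.

    Simplified parser: it ignores line continuations and drop-in files.
    """
    hits = []
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        # Track the current [Section] header.
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        # Dependency directives take a space-separated list of unit names.
        if sep and target in value.split():
            hits.append((key.strip(), section))
    return hits


# Illustrative unit text (an assumption, not the real multipathd.service).
SAMPLE_UNIT = """\
[Unit]
Description=Example multipath daemon unit (illustrative only)
Wants=systemd-udev-trigger.service systemd-udev-settle.service
After=systemd-udev-trigger.service systemd-udev-settle.service

[Service]
ExecStart=/bin/true
"""

if __name__ == "__main__":
    for directive, section in find_references(SAMPLE_UNIT):
        print(f"[{section}] {directive}= references systemd-udev-settle.service")
```

On a running system, `systemctl list-dependencies --reverse systemd-udev-settle.service` should show the same relationship without parsing anything by hand.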

mwilck commented 3 years ago

> Should you follow systemd's advice here?

At the moment, we can't. Waiting for udev settle was introduced for strong reasons. Changing that is on my todo list, but it's a major effort, because we need to change the way multipathd discovers and tracks devices in fundamental ways.

The problem is that when we leave the initrd and enter the root FS, device-mapper devices persist but low-level devices such as SCSI disks do not. They first have to be re-discovered by coldplug (`systemd-udev-trigger.service`). When multipathd starts before "udev settle" has finished, it encounters multipath maps referencing devices that apparently don't exist. multipathd will assume that these maps are invalid and try to tear them down, with possibly fatal consequences for the system. It's possible to change the behavior of multipathd by using sysfs-based device detection, but experience shows that any device-detection-related change tends to have partly unforeseen side effects. This needs careful engineering and even more careful testing, and will take time.
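
To make the failure mode concrete, here is a minimal Python sketch. It is not multipathd's actual logic. It assumes that "apparently don't exist" means "not yet processed by udev", and it models udev's view as a plain set passed in by the caller. The kernel does expose every block device under `/sys/dev/block/<major>:<minor>`; the function names and the `sysfs_root` parameter are hypothetical and exist only for this illustration.

```python
import os


def path_state(devnum, udev_known, sysfs_root="/sys"):
    """Classify one path device given as a 'major:minor' string.

    udev_known is a set of 'major:minor' strings that udev has already
    processed (in this sketch, simply supplied by the caller).
    """
    if devnum in udev_known:
        return "ready"
    # sysfs lists the device as soon as the kernel knows it, independent
    # of whether udev coldplug has reached it yet.
    if os.path.exists(os.path.join(sysfs_root, "dev", "block", devnum)):
        return "pending"  # present in the kernel, udev has not caught up
    return "missing"      # not known to the kernel either


def all_paths_missing(path_devnums, udev_known, sysfs_root="/sys"):
    """True only if no path of a map is known to udev or to the kernel."""
    return all(
        path_state(d, udev_known, sysfs_root) == "missing"
        for d in path_devnums
    )


if __name__ == "__main__":
    import tempfile

    # Build a fake sysfs tree so the example runs without real hardware.
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "dev", "block", "8:16"))
    for dev in ("8:0", "8:16", "8:32"):
        print(dev, path_state(dev, {"8:0"}, root))
```

With a check like this, a daemon started before coldplug finishes could tell "pending" paths apart from "missing" ones instead of treating the whole map as invalid. Whether that is safe in every corner case is exactly the kind of question that needs the careful testing mentioned above.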

I'm not sure what this means in your log:

Feb 06 14:22:23 talos systemd[1]: systemd-udev-settle.service: Main process exited, code=exited, status=1/FAILURE

At first glance I thought udevadm settle wouldn't work at all any more. But AFAICS that's not (yet) the case, and this failure is unrelated to the deprecation warning. Or am I overlooking something? What systemd version are you using?

Anyway, I suppose the systemd people are serious about this, and we need to tackle the basic issue sooner rather than later.

llebout commented 3 years ago

@mwilck I think the error is caused by a hardware/firmware bug on my machine and is unrelated. I run a Talos II machine, and right now some PCI-e device is locked in an unusual way: lspci hangs and the Linux kernel does not see that device at all. The problem goes away when I give the machine a full reboot.

mwilck commented 3 years ago

Thanks for clarifying that.

benmarz commented 2 years ago

A patchset for removing this dependency has been posted here: https://listman.redhat.com/archives/dm-devel/2021-October/msg00321.html