opensvc / multipath-tools

multipathd started leads to initrd blocked. #65

Closed lixiaokeng closed 1 year ago

lixiaokeng commented 1 year ago

With commit 11f0440, we hit a problem.

When there is a multipath device in the initrd, multipathd gets activated because dracut/modules.d/90multipath/multipathd-needshutdown.sh runs. As a result, initrd-cleanup.service can't run and switch root fails. The initrd is blocked.

https://github.com/dracutdevs/dracut/issues/2289

mwilck commented 1 year ago

I think the entire multipathd-needshutdown.sh logic is obsoleted by our queue_without_daemon feature. multipathd itself will take care that queuing is disabled when it quits.

@bmarzins, please double-check, I tend to overlook things lately ...

mwilck commented 1 year ago

If we still need multipathd-needshutdown.sh and multipathd-shutdown.sh (as noted above, I think we don't), we should call multipath with the -D option in these files.

mwilck commented 1 year ago

@lixiaokeng, please check whether removing the line

Also=multipathd.socket

in multipathd.service fixes the issue.
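A minimal sketch of the suggested test, assuming the line sits on its own in the unit file. The helper name `strip_socket_also` and the sample unit contents are hypothetical and only illustrate the edit; they are not the real multipathd.service or the change made in the dracut PR.

```shell
#!/bin/sh
# Hypothetical helper: filter a unit file on stdin and drop the
# "Also=multipathd.socket" line, leaving everything else untouched.
strip_socket_also() {
    sed '/^Also=multipathd\.socket[[:space:]]*$/d'
}

# Made-up sample unit for illustration only (NOT the real multipathd.service).
sample_unit='[Unit]
Description=Sample multipath daemon unit

[Install]
WantedBy=sysinit.target
Also=multipathd.socket'

# Print the filtered unit; the Also= line should be gone.
printf '%s\n' "$sample_unit" | strip_socket_also
```

To try this on an actual initrd, the same filter would be applied to the copy of multipathd.service that ends up inside the initramfs image, not to the unit installed on the root filesystem.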

lixiaokeng commented 1 year ago

We tested this (https://github.com/dracutdevs/dracut/pull/2290/files), and it fixes the issue.

bmarzins commented 1 year ago

I'm on the fence about the best way to solve this:

In the end, I think I vote for doing the last two. Disabling the socket should be enough for the initramfs, but there's no point in multipathd trying to delegate work if we don't want it delegated. As for the first option, I may be letting my systemd shutdown ignorance show, but couldn't this also interfere with actual system shutdown? According to the dracut-shutdown service man page: shutdown will try to umount every /oldroot mount and calls the various shutdown hooks from the dracut modules. If we are currently making certain that we disable queuing when we shut the system down, even if multipathd isn't currently running, then I feel we should probably continue to do so, just out of caution. You can override queue_without_daemon, and it's perfectly reasonable to do so in some cases, like if you are trying to update the multipath-tools binaries. If multipathd failed to restart after this, queuing would be left enabled.
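To make the "queuing would be left enabled" concern concrete, here is a rough sketch of how a shutdown hook could tell whether a multipath map is still queuing. It assumes the device-mapper table line of a queuing map contains the `queue_if_no_path` feature string. The function name `table_queues` and both sample table lines are made up for illustration, and this is not a copy of the dracut scripts.

```shell
#!/bin/sh
# Hypothetical helper: succeed (return 0) if a dm-multipath table line
# still has queuing enabled, i.e. contains the queue_if_no_path feature.
table_queues() {
    case "$1" in
        *queue_if_no_path*) return 0 ;;
        *) return 1 ;;
    esac
}

# Made-up sample table lines; on a real system they would come from
# something like: dmsetup table "$map"
queuing='0 2097152 multipath 1 queue_if_no_path 0 1 1 service-time 0 1 2 8:16 1 1'
failing='0 2097152 multipath 0 0 1 1 service-time 0 1 2 8:16 1 1'

for t in "$queuing" "$failing"; do
    if table_queues "$t"; then
        # A real hook could disable queuing here, for example with:
        #   dmsetup message "$map" 0 fail_if_no_path
        echo "queuing enabled"
    else
        echo "queuing disabled"
    fi
done
```

A map that reports queuing with no paths left and no running multipathd is the case that can hang shutdown, which is why keeping such a check around is the cautious choice.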

mwilck commented 1 year ago

My previous comment about -D was premature. I double-checked, and the two dracut scripts in question just run multipath -l, which currently doesn't attempt to delegate the command or access the socket. As noted in https://github.com/dracutdevs/dracut/pull/2290#issuecomment-1483939090, I don't understand how socket activation occurred in the error case. It could have been multipath -u run from udev rules[^1], maybe, but does that run during shutdown?

> You can override queue_without_daemon, and it's perfectly reasonable to do so in some cases, like if you are trying to update the multipath-tools binaries. If multipathd failed to restart after this, queuing would be left enabled.

OK. According to @lixiaokeng's comment above, disabling the socket in the initrd is sufficient to fix the issue. So we can leave the shutdown scripts untouched [^2].

[^1]: which doesn't care about -D

[^2]: Although one might argue that turning off queue_without_daemon and then shutting down with no paths left is a corner case that qualifies as "shooting oneself in the foot", and deserves having to hit the big red button.

lixiaokeng commented 1 year ago

close