quattor / ncm-lib-blockdevices

Node Configuration Manager Library for Block Devices and File Systems
www.quattor.org

Pausing udev makes `/dev/disk/by-*` unusable for installations #105

Open jrha opened 9 months ago

jrha commented 9 months ago

The change introduced in 55401db796a6ef5b207b39c837384834c963805c pauses udev until the partition metadata has been wiped, but crucially the wipe does not happen until the partition device exists (i.e. test -e PARTITION passes).

On hosts with complex storage topologies, the udev-managed symlinks under /dev/disk/by-path are commonly used to identify the installation device.

For example, installing to the disk /dev/disk/by-path/pci-0000:c3:00.0-ata-1 hangs forever waiting for the first partition, /dev/disk/by-path/pci-0000:c3:00.0-ata-1-part1, to exist: udev creates that symlink, but its event queue is stopped at that point. The generated code in the kickstart file does the following:

udevadm control --stop-exec-queue
parted /dev/disk/by-path/pci-0000:c3:00.0-ata-1 -s -- u s mkpart primary $begin $end
while true; do
    sleep 1
    udevadm settle --timeout=5
    test -e /dev/disk/by-path/pci-0000:c3:00.0-ata-1-part1 && break
done
wipe_metadata /dev/disk/by-path/pci-0000:c3:00.0-ata-1-part1
udevadm control --start-exec-queue
udevadm settle

Side note: I'm not sure if calling udevadm settle while the queue is stopped is even meaningful.
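
The unbounded `while true` loop above is what turns the missing symlink into a permanent hang. As a minimal sketch only, and not a fix for the root cause, the wait could be bounded so that the installation fails visibly instead. The helper name `wait_for_path` and the 30 second default are assumptions for illustration; neither exists in ncm-lib-blockdevices.

```shell
#!/bin/sh
# Hypothetical helper (not part of ncm-lib-blockdevices):
# wait up to $2 seconds (default 30) for the path $1 to exist.
# Returns 0 as soon as the path exists, 1 once the deadline has passed.
wait_for_path() {
    path=$1
    remaining=${2:-30}
    while [ ! -e "$path" ]; do
        if [ "$remaining" -le 0 ]; then
            echo "timed out waiting for $path" >&2
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

In the generated kickstart code this might be called as `wait_for_path /dev/disk/by-path/pci-0000:c3:00.0-ata-1-part1 60 || exit 1`. With the udev queue stopped the by-path symlink would still never appear, so this only makes the failure visible.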

It's not clear to me how to handle this without breaking the original use case, as starting the queue earlier will cause the same LVM-related issues as originally described.

jrha commented 9 months ago

Could we trigger just /usr/lib/udev/rules.d/60-persistent-storage.rules?

jrha commented 9 months ago

Alternatively, instead of pausing udev, could we mask the device-mapper-related rules?

touch /etc/udev/rules.d/{10-dm.rules,11-dm-parts.rules,13-dm-disk.rules}
…
rm -f /etc/udev/rules.d/{10-dm.rules,11-dm-parts.rules,13-dm-disk.rules}
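
The `touch` / `rm -f` idea above could be wrapped in a pair of functions so that the unmask step never deletes a rules file that already existed. This is only a sketch under assumptions: the function names `mask_dm_rules` and `unmask_dm_rules` and the optional directory argument are hypothetical, added so it can be tried outside `/etc/udev/rules.d`.

```shell
#!/bin/sh
# Rule file names are taken from the comment above.
DM_RULES="10-dm.rules 11-dm-parts.rules 13-dm-disk.rules"

# Hypothetical helper: create empty override files for the device-mapper rules.
mask_dm_rules() {
    dir=${1:-/etc/udev/rules.d}
    for rule in $DM_RULES; do
        # Do not clobber a pre-existing local override.
        [ -e "$dir/$rule" ] || : > "$dir/$rule"
    done
}

# Hypothetical helper: remove the overrides again.
unmask_dm_rules() {
    dir=${1:-/etc/udev/rules.d}
    for rule in $DM_RULES; do
        # Remove only empty regular files, i.e. masks of the kind created above.
        if [ -f "$dir/$rule" ] && [ ! -s "$dir/$rule" ]; then
            rm -f "$dir/$rule"
        fi
    done
}
```

The partitioning steps would sit between `mask_dm_rules` and `unmask_dm_rules`. Whether udev needs an explicit `udevadm control --reload` to pick up the change in the installer environment is untested here.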

jrha commented 9 months ago

Tested potential solution in #106.