OSInside / kiwi

KIWI - Appliance Builder Next Generation
https://osinside.github.io/kiwi
GNU General Public License v3.0

Wait for udev event queue to become empty #2516

Closed schaefi closed 3 months ago

schaefi commented 3 months ago

Make sure to wait for the udev event queue to become empty prior to accessing the device nodes.

schaefi commented 3 months ago

There seems to be a race condition in the live ISO code, and this change could be a fix. However, that's not clear yet; it needs proper testing on a setup where the race actually occurs.

Vogtinator commented 3 months ago

This should be a no-op, as the script is only called from the initqueue --settled hook.

schaefi commented 3 months ago

> This should be a no-op, as the script is only called from the initqueue --settled hook.

Yes, I agree it doesn't make sense... in theory everything should already have settled.

schaefi commented 3 months ago

I changed the code to wait for the event queue after a possible partition creation. I think this is what really causes the race condition: when we add the write partition and trigger a re-read of the partition table, this queues new events for udev. Waiting for them to settle is, in my opinion, important. I'll check the builds with this patch.
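For readers following along, the ordering described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names `run_cmd` and `reread_and_settle`, the `DRY_RUN` switch, the default 30-second timeout, and the use of `partprobe` to re-read the table are all assumptions made for the example. Only `udevadm settle --timeout=...` is the standard way to wait for the udev event queue to drain.

```shell
#!/bin/bash
# Hypothetical sketch; names and defaults are illustrative, not kiwi's code.

# Run a command, or only print it when DRY_RUN=1 (useful for inspection).
run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Re-read the partition table of a disk, then wait until udev has
# processed the events this queues, before any device node is accessed.
reread_and_settle() {
    local disk="$1"
    local timeout="${2:-30}"   # assumed default, in seconds

    # Ask the kernel to re-read the partition table (queues udev events).
    run_cmd partprobe "${disk}" || return 1

    # Block until the udev event queue is empty or the timeout expires.
    run_cmd udevadm settle --timeout="${timeout}" || return 1
}
```

With something like this in place, a caller would run `reread_and_settle /dev/vda` right after creating the write partition and only then touch `/dev/vda3`. Setting `DRY_RUN=1` prints the two commands instead of executing them.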

schaefi commented 3 months ago

I chose the virtio interface for testing. It's fast and triggered the race condition more often. So far this change has worked:

localhost:~ # lsblk 
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
fd0      2:0    1     4K  0 disk 
loop0    7:0    0 257.7M  1 loop /run/overlay/squashfs_container
loop1    7:1    0   1.4G  1 loop /run/overlay/rootfsbase
sr0     11:0    1  1024M  0 rom  
vda    253:0    0    20G  0 disk 
├─vda1 253:1    0 335.7M  0 part /run/overlay/live
├─vda2 253:2    0    20M  0 part 
└─vda3 253:3    0  19.7G  0 part /run/overlay/overlayfs
localhost:~ # df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        4.0M     0  4.0M   0% /dev
tmpfs           2.0G     0  2.0G   0% /dev/shm
tmpfs           786M   17M  769M   3% /run
tmpfs           4.0M     0  4.0M   0% /sys/fs/cgroup
tmpfs           2.0G     0  2.0G   0% /run/overlay
/dev/vda3        20G   30M   19G   1% /run/overlay/overlayfs
/dev/vda1       336M  336M     0 100% /run/overlay/live
/dev/loop0      258M  258M     0 100% /run/overlay/squashfs_container
/dev/loop1      1.4G  979M  274M  79% /run/overlay/rootfsbase
LiveOS_rootfs    20G   30M   19G   1% /
tmpfs           393M     0  393M   0% /run/user/0

schaefi commented 3 months ago

Can someone else double-check it? Thanks much.

Vogtinator commented 3 months ago

I'm trying to build 15.6 test ISOs at the moment.

aafeijoo-suse commented 3 months ago

> I changed the code to wait for the event queue after a possible partition creation.

If my feedback can be of some use, the first approach of this PR didn't work, but this last change worked for me (I tried it twice).

Vogtinator commented 3 months ago

The bug https://bugzilla.opensuse.org/show_bug.cgi?id=1219074 just arrived in 15.x, breaking live CDs even more :-/

Vogtinator commented 3 months ago

I backported 69cc1c35d77ca56b930f1a0d79f8c19b9d43e754 as well as this PR to 15.6 and the resulting rescue-cd image booted successfully 3/3 times. So it's most likely fixed!

schaefi commented 3 months ago

Yay, thanks much for testing :+1: Also all my tests worked.

schaefi commented 3 months ago

@Vogtinator I rebased and provided more details in the commit message. I'd like to add the bugzilla reference connected with my fix but can't remember which bug it was. Can you give me a hint? Thanks

Vogtinator commented 3 months ago

> @Vogtinator I rebased and provided more details in the commit message. I'd like to add the bugzilla reference connected with my fix but can't remember which bug it was. Can you give me a hint? Thanks

https://bugzilla.opensuse.org/show_bug.cgi?id=1213595