Closed splate07 closed 1 year ago
Is this image burned to the usb drive? I'm trying to make sense of the report.
No, it is not. This ISO file is simply placed in the root directory of the 5th partition of my hard drive.
I have this entry in my grub.cfg file
menuentry "void amd64 2022 xfce" {
    set isofile="/void-live-x86_64-20221001-xfce.iso"
    loopback loop (hd1,msdos5)$isofile
    linux (loop)/boot/vmlinuz iso-scan/filename=$isofile root=live:CDLABEL=VOID_LIVE ro init=/sbin/init rd.luks=0 rd.md=0 rd.dm=0 rd.live.overlay.overlayfs=1
    initrd (loop)/boot/initrd
}
That's it. It works for the 2021 image but doesn't work for the 2022 image.
I boot the isos fine via grub (with https://github.com/classabbyamp/glim). does the latest iso boot fine directly? this may be an issue with your computer
> I boot the isos fine via grub (with https://github.com/classabbyamp/glim). does the latest iso boot fine directly? this may be an issue with your computer
Good for you. Now try to boot void-live-x86_64-20221001-xfce.iso using https://github.com/classabbyamp/glim and see for yourself. By the way, my menuentry is based on https://github.com/classabbyamp/glim/blob/master/grub2/inc-void.cfg
ok I am able to reproduce now, but it's very odd:
I'm also having trouble with this. Booting the latest ISO through a grub2 menu entry doesn't work on my machine. I also tried thias' GLIM with no success, both on bare metal and with qemu. I even tried qemu with an EFI file, no dice. dd'ing the image to the same USB drive boots fine on bare metal, qemu, and qemu with an EFI file (it goes to grub instead of syslinux). My uneducated guess is that it could be this dracut commit: https://github.com/void-linux/void-packages/commit/eef5529636d2672b514cba53e604fb6f5db9f99e https://github.com/dracutdevs/dracut/commit/87c4c17850e8bb982f6c07a6d3f58124bb2875de There is also a relevant issue in void-packages: https://github.com/void-linux/void-packages/issues/38367
20210930 iso (boots using a grub2 menu entry): dracut 53_2 kmod 27_3
20221001 iso (doesn't boot using a grub2 menu entry): dracut 53_4 kmod 30_1
It's not dracut.
With rd.debug in the kernel command line, I took the rdsosreport.txt boot log generated by dracut for both the last working image and the first broken image (also adding rd.break to get a shell for the former).
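For reference, a minimal sketch of what the linux line of the debug boot entry might look like, assuming the same GRUB entry shown earlier in this thread with only the two dracut debug options appended. Everything else on the line is copied unchanged from that entry:

```
# Same loopback entry as before; only rd.debug and rd.break are added.
linux (loop)/boot/vmlinuz iso-scan/filename=$isofile root=live:CDLABEL=VOID_LIVE ro init=/sbin/init rd.luks=0 rd.md=0 rd.dm=0 rd.live.overlay.overlayfs=1 rd.debug rd.break
```

rd.break should only be needed on the image that boots normally, since the failing image drops to the emergency shell by itself.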
In a diff between the two logs, 9 lines stand out in informational output before dracut starts searching for/mounting the root:
(in cat /proc/self/mountinfo output)
-29 26 7:0 / /run/initramfs/live ro,relatime - iso9660 /dev/loop0 ro,nojoliet,check=s,map=n,blocksize=2048,iocharset=utf8
-31 1 254:0 / /sysroot rw,relatime - ext3 /dev/mapper/live-rw rw
(in cat /proc/mounts output)
-/dev/loop0 /run/initramfs/live iso9660 ro,relatime,nojoliet,check=s,map=n,blocksize=2048,iocharset=utf8 0 0
-/dev/mapper/live-rw /sysroot ext3 rw,relatime 0 0
(in blkid output)
-/dev/loop0: BLOCK_SIZE="2048" UUID="2021-10-07-00-22-44-00" LABEL="VOID_LIVE" TYPE="iso9660" PTUUID="4e4d61a4" PTTYPE="dos"
-/dev/loop1: TYPE="squashfs"
-/dev/mapper/live-base: UUID="65732de4-1bfe-479b-8269-be87b1fb8c8e" SEC_TYPE="ext2" BLOCK_SIZE="4096" TYPE="ext3"
-/dev/loop2: UUID="65732de4-1bfe-479b-8269-be87b1fb8c8e" SEC_TYPE="ext2" BLOCK_SIZE="4096" TYPE="ext3"
-/dev/mapper/live-rw: UUID="65732de4-1bfe-479b-8269-be87b1fb8c8e" BLOCK_SIZE="4096" TYPE="ext3"
The loop devices are not even present when booting a newer image, which points to the kernel, which is also the only relevant package that had updates between the last working and first broken images (5.13.19_1 and 5.19.10_1 respectively).
This is confirmed when booting a freshly built image made with mklive's -v linux5.13 argument.
Now that I know it's the kernel, I'll try to narrow down what version between 5.13 and 5.19 broke this.
It's linux 5.19. The last version that boots properly from loopback is 5.18.
There don't seem to be any relevant changes in the dotconfigs of those two versions, nor in the patches.
Some wild guesses:
Could it be a missing kernel module for the storage - e.g. mmc - https://www.reddit.com/r/voidlinux/comments/y03b8b/baytrail_stopped_booting_after_updating_to_519/
Is the loop module loaded/available ? Perhaps a missing "modprobe loop" somewhere ?
> Could it be a missing kernel module for the storage - e.g. mmc - https://www.reddit.com/r/voidlinux/comments/y03b8b/baytrail_stopped_booting_after_updating_to_519/
In my case at least, tests were done on a standard desktop computer, and the storage holding both the GLIM setup and the ISO image is a plain FAT32 partition (part type ID 0c, fs created with mkfs.vfat) on a normal USB flash drive (/dev/sdX) with an MBR partition table.
It seems to me like all modules possibly involved in that are already loaded, and furthermore, at the time dracut drops me to a shell, the partition on the USB drive is indeed already mounted at /run/initramfs/isoscan and its contents are present in that directory as expected.
I'll also be setting up a testbench in QEMU to do further tests.
> Is the loop module loaded/available ? Perhaps a missing "modprobe loop" somewhere ?
At the dracut debug shell, loop is indeed present in /proc/modules. However, none of cdrom, isofs, and squashfs are present in that list at that point.
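A check like the one described above can be scripted. The following is a minimal sketch in plain POSIX shell; the function name missing_modules is made up for this example, and it only inspects a /proc/modules-style listing, so drivers built into the kernel would be reported as missing even though they are available:

```shell
# missing_modules FILE NAME...: print every NAME that is not listed in FILE,
# where FILE has the /proc/modules layout (module name in the first column).
# Hypothetical helper for illustration; uses only shell builtins.
missing_modules() {
    modules_file=$1
    shift
    for wanted in "$@"; do
        found=no
        while read -r name _rest; do
            if [ "$name" = "$wanted" ]; then
                found=yes
            fi
        done < "$modules_file"
        if [ "$found" = no ]; then
            echo "$wanted"
        fi
    done
}

# Check the modules discussed in this thread against the running kernel.
if [ -r /proc/modules ]; then
    missing_modules /proc/modules loop cdrom isofs squashfs
fi
```

In the situation described above, this would be expected to print cdrom, isofs, and squashfs, but not loop.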
These lines are present in dmesg logs of both working (5.18 and before) and failing (5.19+) images:
loop: module loaded
dracut: root was live:CDLABEL=VOID_LIVE, is now live:/dev/disk/by-label/VOID_LIVE
Manually mounting the ISO image correctly loads both cdrom and isofs, and the contents of the image are present at the mountpoint as expected. Further mounting the squashfs image also properly loads squashfs (again, contents present as expected).
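The exact commands used for the manual mounts are not given, so the following is only a sketch of how they might look. The function name, the mount points, and the LiveOS/squashfs.img path inside the ISO are assumptions, as is the availability of a mount implementation that supports -o loop in the dracut shell:

```shell
# mount_live_iso ISO MNT: loop-mount the ISO, then the squashfs image inside it.
# MNT/iso and MNT/squash are hypothetical mount points chosen for this sketch.
mount_live_iso() {
    iso=$1
    mnt=$2
    mkdir -p "$mnt/iso" "$mnt/squash" || return 1
    # Mounting the ISO is the step that should pull in cdrom and isofs.
    mount -t iso9660 -o loop,ro "$iso" "$mnt/iso" || return 1
    # Assumption: the squashfs image lives at LiveOS/squashfs.img in the ISO.
    mount -t squashfs -o loop,ro "$mnt/iso/LiveOS/squashfs.img" "$mnt/squash"
}
```

From the dracut shell this could be called as mount_live_iso /run/initramfs/isoscan/void-live-x86_64-20221001-xfce.iso /tmp/live, where /tmp/live is again just an example location.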
However, after both mount operations, there isn't any info on the mounted filesystems in lsblk -f (fstype, fsver, Label!, UUID) ...until udevadm trigger is manually run. This kernel commit seems potentially relevant: https://github.com/torvalds/linux/commit/498ef5c777d9c89693b70cc453b40c392120ea1b It would introduce delays before the label becomes visible after mounting the ISO. I will test whether adding a delay after the mount in dracut fixes the issue.
If you have further insights, I'll test those too
Note: if using a Fedora image for testing, the kernel and initrd location in the ISO seem to have changed since the last time dracut.cmdline was modified. They are now present in /images/pxeboot/.
@LaszloGombos
One thing I forgot to mention in the previous message, is that the ISO image does (sometimes?*) stay mounted when dropping in the shell, but in the mounted-with-no-label state.
Since last message, I've also found a kinda-fix:
From a boot attempt where the mount was already present while in the shell, simply running udevadm trigger and leaving the shell led to dracut successfully booting into the Void live image.
*kinda confused as to how it shows as mounted in some attempts while I recall other attempts not even having the loop module loaded in the shell (the issue is either a separate one in dracut/the images, or in my recollection of all of these attempts at booting)
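The workaround described in this comment can be sketched as a small helper for the dracut emergency shell. The function name refresh_udev is made up for this example, and the udevadm settle call is an addition that is not part of the report; it is included on the assumption that waiting for the event queue avoids leaving the shell too early:

```shell
# refresh_udev: replay device events so udev re-probes block devices
# (filesystem type, label, UUID), then wait for the event queue to drain.
refresh_udev() {
    udevadm trigger || return 1
    # Assumption: "udevadm settle" is not mentioned in the report above; it is
    # added here so the shell is not left before the events are processed.
    udevadm settle
}
```

After running refresh_udev, lsblk -f should show the VOID_LIVE label if the workaround applies, and typing exit lets dracut continue the boot.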
@0x5c
Please try to autoload the modules that this use case needs from the bootloader command line arguments, e.g. "rd.driver.pre=loop,cdrom,isofs,squashfs"
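As a rough illustration, this suggestion could be applied to the menu entry from the start of this thread by appending the option to the linux line. This is a sketch only: the menu title is made up, and whether preloading these modules is enough to make iso-scan work on 5.19 is not confirmed anywhere in this thread.

```
menuentry "void amd64 2022 xfce (preload modules)" {
    set isofile="/void-live-x86_64-20221001-xfce.iso"
    loopback loop (hd1,msdos5)$isofile
    # rd.driver.pre takes a comma-separated list with no spaces
    linux (loop)/boot/vmlinuz iso-scan/filename=$isofile root=live:CDLABEL=VOID_LIVE ro init=/sbin/init rd.luks=0 rd.md=0 rd.dm=0 rd.live.overlay.overlayfs=1 rd.driver.pre=loop,cdrom,isofs,squashfs
    initrd (loop)/boot/initrd
}
```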
You could also try this patch that debian carries: https://salsa.debian.org/debian/dracut/-/blob/master/debian/patches/udevsettle
Fedora bug report - https://bugzilla.redhat.com/show_bug.cgi?id=2131852
A fix for this has been merged upstream (https://github.com/dracutdevs/dracut/pull/2196), and there's a backport of it to the package (https://github.com/void-linux/void-packages/pull/42265). Once that's merged, new images shouldn't have problems with iso-scan anymore.
I tried running the latest x86_64 live image (void-live-x86_64-20221001-xfce.iso) on real hardware via grub2 boot manager and it is broken. The root device cannot be found for some reason, and so the user is dropped into a debug shell. You can't reproduce the same issue with the previous version of the image (void-live-x86_64-20210930-xfce.iso).