fosslinux / live-bootstrap

Use of a Linux initramfs to fully automate the bootstrapping process

Bootstrap takes an extremely long time when running under QEMU. #430

Closed ajherchenroder closed 4 months ago

ajherchenroder commented 5 months ago

Because of work commitments, I haven't been able to touch this project since the new year. When I tried to run the bootstrap with the latest version, it has been running for 5 days and still hasn't completed. At first I thought it had frozen, but that doesn't appear to be the case: if I come back every few hours, I do see slow progress. I tested my personal fork, which was pulled on 28 December 2023, on the same machine and it completed in several hours. After doing some testing, the problem appears to be Fiwix. As soon as the kexec-fiwix happens, the bootstrap slows to a crawl. It behaves like it's waiting for some I/O or memory operation. The bootstrap works normally using bwrap and chroot. My command line is:

./rootfs.py -q --update-checksums --cores=4 -qr=4096 -i -s=4096

I have my files predownloaded using download-distfiles.sh. Can someone please check whether they can reproduce this issue, or correct me if I'm doing something wrong?

Googulator commented 5 months ago

Predownloading doesn't really matter if you don't also pass --external-sources to rootfs.py.

That said, 5 days in Fiwix is certainly abnormal; also, --external-sources doesn't really do anything until after the Linux kernel is built and started.

To debug this better, it would be important to know:

(Storage performance should have no effect in the Fiwix phase, as in Fiwix, everything happens in RAM, with no access to drives.)

Another thing you may try is making a bare metal image (using -b instead of -q), and then booting it in a normal qemu VM (without the -nographic option, so you have a real graphical console) on the same hardware. When combined with the -i option (which you are already using), the latest code lets you access secondary virtual consoles as soon as the bootstrap transitions from kaem to bash as its shell. To access the secondary console, use the key combination Ctrl+Shift+F2. In qemu, you might need to use the "sendkey" command from the monitor console to do this, in case your host OS traps that combination.
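A minimal sketch of what such a qemu invocation might look like. It only prints the command line rather than running it. The image path (`disk.img`), the `format=raw` setting, and the RAM and core counts are assumptions for illustration, not values confirmed in this thread; substitute whatever file `rootfs.py -b` actually produced.

```shell
#!/bin/sh
# Print (not run) a QEMU command line for booting a bare metal image
# with a graphical console.
# usage: qemu_boot_cmd <image> [ram_mb] [cores]
# The image path and format=raw are assumptions; adjust to the real image.
qemu_boot_cmd() {
    image="$1"
    ram_mb="${2:-4096}"
    cores="${3:-4}"
    # No -nographic here, so QEMU opens its normal graphical window.
    # -monitor stdio puts the QEMU monitor on the launching terminal.
    echo "qemu-system-x86_64 -enable-kvm -m $ram_mb -smp $cores" \
         "-drive file=$image,format=raw -monitor stdio"
}

# IMAGE is a hypothetical environment variable used only for this sketch.
qemu_boot_cmd "${IMAGE:-disk.img}"
```

With the monitor on stdio, typing `sendkey ctrl-shift-f2` at the `(qemu)` prompt would deliver the key combination mentioned above to the guest even if the host desktop intercepts it.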

ajherchenroder commented 5 months ago

I'm running the following:

Arch Linux

I run libvirt via virt-manager

KVM is enabled.

It's an Intel NUC 11 i5 with an i5-1135G7 processor

32 GB of DDR4-3200 in 2 channels
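For anyone wanting to double-check the "KVM is enabled" item on their own host, a rough sketch follows. How the reporter verified it is not stated in the thread; this simply looks at the standard Linux locations (`/proc/cpuinfo` and `/dev/kvm`). The function name and the status strings are made up for this example.

```shell
#!/bin/sh
# Report whether hardware virtualization appears usable on this Linux host.
# usage: kvm_status [cpuinfo-file] [kvm-device]
# Both arguments default to the standard Linux paths; they are parameters
# only so the function can be exercised against test files.
kvm_status() {
    cpuinfo="${1:-/proc/cpuinfo}"
    kvm_dev="${2:-/dev/kvm}"
    # vmx = Intel VT-x, svm = AMD-V
    if ! grep -qwE 'vmx|svm' "$cpuinfo" 2>/dev/null; then
        echo "no-cpu-support"
    elif [ ! -e "$kvm_dev" ]; then
        echo "kvm-module-missing"
    elif [ ! -r "$kvm_dev" ] || [ ! -w "$kvm_dev" ]; then
        echo "kvm-no-permission"
    else
        echo "kvm-ok"
    fi
}

kvm_status
```

A result other than `kvm-ok` would suggest QEMU is falling back to software emulation, which is much slower, though it would not by itself explain a slowdown tied to one specific commit.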

ajherchenroder commented 5 months ago

After building commit by commit, I have identified the origin of my issue. It is from January 7th, specifically "Upgrade Fiwix to 1.5.0-lb1, pulling from upstream Mikaku repo. (https://github.com/fosslinux/live-bootstrap/pull/397)", which is commit 1bffe44. Anything built before that commit builds at full speed under QEMU. I'm guessing that something is misconfigured or there is a bug in that version, but I don't understand it well enough to tell what's wrong. Can someone please see if they can replicate my findings?
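The commit-by-commit search described above was apparently done by hand. The same kind of search can be automated with `git bisect`, sketched below under assumptions: the helper name `first_bad_commit` is invented here, and the check command is something the user must supply (for example, a hypothetical script that runs the bootstrap under a time limit and exits non-zero when it is too slow).

```shell
#!/bin/sh
# Find the first commit at which a check starts failing, using git bisect.
# usage: first_bad_commit <good-rev> <bad-rev> <check-command...>
# Must be run from inside the repository's working tree.
first_bad_commit() {
    good="$1"; bad="$2"; shift 2
    # git bisect start takes the bad revision first, then the good one.
    git bisect start "$bad" "$good" >/dev/null || return 1
    # git bisect run treats exit status 0 as good and 1-127 (except 125,
    # which means "skip") as bad.
    git bisect run "$@" >/dev/null
    # Once bisection finishes, refs/bisect/bad names the first bad commit.
    result=$(git rev-parse --short refs/bisect/bad)
    # Return the working tree to where it was before bisecting.
    git bisect reset >/dev/null 2>&1
    echo "$result"
}

# Example (hypothetical check script, not part of live-bootstrap):
#   first_bad_commit <known-good-rev> HEAD ./check-bootstrap-speed.sh
```

Because each check here would mean a multi-hour bootstrap run, bisecting pays off mainly by cutting the number of builds from one per commit to roughly the base-2 logarithm of the commit count.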

fosslinux commented 5 months ago

How much RAM do you have assigned to the VM? Could you send through the entire libvirt domain XML?

ajherchenroder commented 5 months ago

Just to be clear, I am having issues with the basic -q built-in direct QEMU emulation. I have tried the bare metal config in virt-manager, but I have console issues running it that way. In both cases I am using 4 GB of RAM. A friend of mine suggested that it might be a problem with the configuration of the Arch kernel. I am going to set up a Debian VM with QEMU and try the bootstrap. I will follow up with how that goes.

Libvirt XML is attached.

live-bootstrap.xml.gz

ajherchenroder commented 5 months ago

I swapped out the SSD and loaded Debian. Then I installed QEMU and made a run. The run completed in a few hours, like it used to. I think whatever is going on is an Arch Linux issue and not a bootstrap problem.

fosslinux commented 4 months ago

Hm, that sounds very plausible. I'll have a look if I have a chance, but I'll close this for now. If it is an us-problem, then feel free to reopen.