netbootxyz / netboot.xyz

Your favorite operating systems in one place. A network-based bootable operating system installer based on iPXE.
https://netboot.xyz
Apache License 2.0

Since 2.0.70, not able to boot via QEMU / libvirt #1273

Closed: tobiashochguertel closed this issue 1 year ago

tobiashochguertel commented 1 year ago

Describe the bug: After selecting an OS to boot, it gets loaded, but at the point where it should boot, the virtual machine resets without an error message. I have attached a video of the boot process and hope that helps.

I tested master, development, and v. 2.0.68, v. 2.0.69, and v. 2.0.70. Only the latest version of netboot.xyz is affected by this behavior (master, development, v. 2.0.70). The previous versions, v. 2.0.68 and v. 2.0.69, are not affected and work great with QEMU/libvirt.

To Reproduce: Steps to reproduce the behavior:

  1. In Red Hat Cockpit or Proxmox VE, create a new virtual machine and boot it via PXE (netboot.xyz v. 2.0.70)
  2. Deactivate Signature Check (set it to false)
  3. Select "Linux Network Installations (64bit)"
  4. Select "Fedora Core OS"
  5. Select "Stable"

https://github.com/netbootxyz/netboot.xyz/assets/3332669/73018db8-1cef-41e7-8dff-1d3c88932110

I see this issue mostly with QEMU/libvirt virtualization. A thin client from Fujitsu doesn't have the issue.

With the previous version, v. 2.0.69, it works with both QEMU/libvirt and the Fujitsu thin client.

Expected behavior: The OS will boot.

Additional context: Discord thread, where I have written down some details

rufo commented 1 year ago

For whatever it's worth, I seem to have the same thing. I thought it might be limited to just Proxmox, but trying to netboot a Beelink EQ12 had identical behavior 🤔 Rolling back to 2.0.69 works better in Proxmox for me; I haven't tried it on the EQ12 directly, though I'm likely to sometime over this weekend, and maybe noodle around a bit to see if I can identify any differences between .69 and .70.

I should note also that I'm using the LinuxServer.io docker image to host a local version of the netboot.xyz, though the only modifications I've made are to windows.ipxe and the necessary set of files to boot a Windows installation environment. I don't think that would make any difference to the boot process, since it's definitely downloading the files from the appropriate sources, but figured it was worth noting in the interest of clarity.

EDIT: Went back up to .70 and I can't seem to replicate this on the EQ12 again, even though I spent enough time on it that I had to load a whole bunch of images on to the PiKVM the other day 🤔 I'll test more with Proxmox tomorrow.

rufo commented 1 year ago

A little more information here: it seems that the Proxmox problem is specifically using the VirtIO network interface with 2.0.70. If I change the VM's network card to be an Intel E1000, it gets past kernel/initrd downloading on 2.0.70, and if I downgrade to 2.0.69, it will do so with a VirtIO NIC.

ZiXia1 commented 1 year ago

My VPS shows the same behavior as yours: after selecting a system, it goes back into netboot.xyz again.

antonym commented 1 year ago

The iPXE SHA for 2.0.70 was https://github.com/ipxe/ipxe/commit/6f57d919357a43507935a5ea78a66702ac0f3d54 and the one for 2.0.69 was https://github.com/ipxe/ipxe/commit/03eea19c19b52851002654c2818b765d4aa42894

The diff is: https://github.com/ipxe/ipxe/compare/03eea19c19b52851002654c2818b765d4aa42894...6f57d919357a43507935a5ea78a66702ac0f3d54

So it's possible something in upstream iPXE may have changed. I don't have the cycles to dig into it right now, but if you notice something in your testing, please open an issue with iPXE.

ZiXia1 commented 1 year ago

> The iPXE SHA for 2.0.70 was ipxe/ipxe@6f57d91 and the one for 2.0.69 was ipxe/ipxe@03eea19
>
> The diff is: ipxe/ipxe@03eea19...6f57d91
>
> So it's possible something in upstream iPXE may have changed. I don't have the cycles to dig into it right now, but if you notice something in your testing, please open an issue with iPXE.

Hi, I found that the problem seems to come from the commits made on May 31, 2023: https://github.com/netbootxyz/netboot.xyz/commits/development?after=42f21802c80524a63b3a2f1234a4ca8fbcc411fa+209&branch=development&qualified_name=refs%2Fheads%2Fdevelopment

I first tested https://github.com/netbootxyz/netboot.xyz/commit/6c23ec74577a97a36a349b2390084377979ff02c and it cannot boot properly. When I tested https://github.com/netbootxyz/netboot.xyz/commit/08e265a6d464267f828bfdbe0b7a7fd403e43ef6 it booted normally.

So I think the problem was introduced by the changes made on May 31, 2023.
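Narrowing a regression between a known-good commit (08e265a here) and a known-bad one (6c23ec7 here) is essentially a bisection. The sketch below shows that idea under assumptions: the commit list and the `boots` check are hypothetical placeholders, and a real check would mean building each commit and trying to boot it in a VM.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], boots: Callable[[str], bool]) -> str:
    """Return the first commit that fails to boot.

    Assumes commits are ordered oldest to newest, commits[0] is known to
    boot, and commits[-1] is known to fail.
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots(commits[mid]):
            lo = mid  # regression was introduced after mid
        else:
            hi = mid  # regression is at mid or earlier
    return commits[hi]


if __name__ == "__main__":
    # Hypothetical history; "c3" stands in for the commit that broke booting.
    history = ["c0", "c1", "c2", "c3", "c4", "c5"]
    print(first_bad(history, lambda c: history.index(c) < 3))
```

With n commits between the two endpoints, this needs roughly log2(n) boot tests instead of testing every commit in turn.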

louishot commented 1 year ago

I have the same problem. In Proxmox it is always stuck in a restart loop; in Vultr it freezes after init, like this:

http://mirror.cogentco.com/debian/dists/bullseye/main/installer-amd64/current/images/netboot/debian-installer/amd64/linux ok
http://mirror.cogentco.com/debian/dists/bullseye/main/installer-amd64/current/images/netboot/debian-installer/amd64/initrd.gz ok

bradreaves commented 1 year ago

> A little more information here: it seems that the Proxmox problem is specifically using the VirtIO network interface with 2.0.70. If I change the VM's network card to be an Intel E1000, it gets past kernel/initrd downloading on 2.0.70, and if I downgrade to 2.0.69, it will do so with a VirtIO NIC.

FWIW, I found this thread because I'm getting the same thing, also running QEMU emulator version 7.2.0 (pve-qemu-kvm_7.2.0-8) on Proxmox.

antonym commented 1 year ago

I was finally able to reproduce this in Proxmox. I'm doing some testing to revert some code from around that date to see if it alleviates the issue. I noticed it only seemed to happen in Legacy mode and not in UEFI, which is where I was primarily testing. Thanks to @ZiXia1 for giving me a starting point to dig into it.

antonym commented 1 year ago

Give this build a try, it solved the issue for me in Proxmox. I reverted some cross-compiling changes with iPXE and that seemed to fix the problem:

https://s3.amazonaws.com/dev.boot.netboot.xyz/79577f556c6f15af950d89d4ad842c51c81e3a3e/index.html

If that helps solve some issues, I'll go ahead and cut a new release.

antonym commented 1 year ago

2.0.72 released, should be building and be live in the next hour. It should help fix this particular issue. If there are still problems, feel free to reopen or comment here.