Open biergaizi opened 1 year ago
Does adding x2apic_phys=true to Xen options help?
I recently updated the BIOS of an asrock p570 pro4 and had the same problems you describe. The option x2apic_phys=true has removed the issue. Using SMT is also still possible.
The original BIOS version 3.20 was not affected by this issue.
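For anyone wanting to try this option, here is a minimal sketch of appending Xen options to the GRUB defaults without duplicating them. It assumes the Xen command line is set by a double-quoted GRUB_CMDLINE_XEN_DEFAULT="..." line in /etc/default/grub, which may not match every install; the helper name add_xen_options is made up for illustration and only transforms text, it does not touch any file.

```python
import re

# Assumption: Xen options are set by a line of the form
# GRUB_CMDLINE_XEN_DEFAULT="..." (as in /etc/default/grub on many installs).
_XEN_LINE = re.compile(r'^(GRUB_CMDLINE_XEN_DEFAULT=")([^"]*)(")\s*$')


def add_xen_options(grub_default_text, options):
    """Return the text with each option appended once to the Xen command line.

    Hypothetical helper: it works on a string only and leaves every other
    line unchanged.
    """
    out = []
    for line in grub_default_text.splitlines():
        match = _XEN_LINE.match(line)
        if match:
            current = match.group(2).split()
            for opt in options:
                # Skip options that are already present so reruns are harmless.
                if opt not in current:
                    current.append(opt)
            line = match.group(1) + " ".join(current) + match.group(3)
        out.append(line)
    return "\n".join(out) + "\n"
```

The same helper would cover the other workaround mentioned in this issue, e.g. add_xen_options(text, ["dom0_max_vcpus=1", "dom0_vcpus_pin"]). After changing the defaults file, the GRUB configuration still has to be regenerated; the exact command and output path depend on the install (legacy vs. UEFI), so they are left out here.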
So, IIUC, you were able to install the system, it just didn't boot?
I've just experienced a potentially related issue, but my story is a bit different. I haven't done a full installation, just changed the motherboard: same CPU (Ryzen 7 5800X), same SSD, even the same chipset (B550), but a different MoBo (ASRock B550 Phantom Gaming 4/AC to ASUS TUF Gaming B550-plus). After changing the MoBo, I needed to run efibootmgr to make the system bootable, so I tried to boot rescue from the USB installer and failed: after "Reached target Basic System", it hangs.
Inspired by this issue, I've disabled SMT and it booted quickly then.
I can do some further experiments with the current MoBo. I could theoretically also experiment with the old MoBo, but I don't wish to change MoBos back and forth.
EDIT: Also, the SMT off in BIOS prevents suspend, or at least the BIOS GUI mentions it.
IIUC, the system rescue doesn't use Xen (xentop just hangs), so it might not be Xen-related.
It does. But xenstored is not running, so most Xen tools won't work (but xl info and xl dmesg do, and this is what we care about in the installer).
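Since xl info works even in the installer, it can be used to confirm which Xen options actually took effect. A rough sketch, assuming the output contains a "xen_commandline : ..." line; the function names are made up for illustration:

```python
def xen_commandline(xl_info_text):
    """Return the Xen boot options listed in `xl info` output, as a list."""
    for line in xl_info_text.splitlines():
        # Split on the first colon only; option values such as
        # dom0_mem=min:1024M contain colons themselves.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "xen_commandline":
            return value.split()
    return []


def missing_options(xl_info_text, wanted):
    """Return the wanted options that are absent from the Xen command line."""
    active = xen_commandline(xl_info_text)
    return [opt for opt in wanted if opt not in active]
```

On a running system the input could come from subprocess.run(["xl", "info"], capture_output=True, text=True).stdout, and an empty result from missing_options(text, ["x2apic_phys=true"]) would mean the option is active.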
Qubes OS release
Qubes release 4.1.1 (R4.1)
Brief summary
Due to an upstream Xen issue [1] - currently with no documentation or even a proper upstream bug report - on some AMD Ryzen CPUs / motherboards, IOMMU malfunctions on Xen. One symptom of a broken IOMMU is a system hang during boot at initramfs's splash screen, with "nvme0: I/O 0 QID 0 timeout, completion polled" messages. Other users have also reported boot hanging when using the Qubes installation disc.
One workaround is disabling SMT (hyperthreading) in BIOS. This is harmless in Qubes since Qubes does not use SMT, but without documentation, it's extremely difficult to find this workaround. I spent half an hour searching for this error message before finding a forum post mentioning SMT. This question has also been raised on a Xen mailing list without any response, indicating that the problem should be worked on upstream first.
This is likely a duplicate of #7620, #7570 or other previously reported issues that I'm not familiar with. However, the disable-SMT workaround has not appeared in any of the existing bug reports that I'm aware of. All the existing reports were also hardware-specific, but now it's clear that it's a systematic issue. Thus, I propose that it should be treated as a separate lack-of-documentation bug report. Other workarounds like dom0_max_vcpus=1 dom0_vcpus_pin should also be documented, though.
Affected Hardware
Some examples include:
Steps to reproduce
Install QubesOS onto an NVMe SSD on an Intel motherboard.
Move QubesOS to an AMD AM4 motherboard with X399 or X570 chipset, with a Ryzen 5000 series CPU (Zen 3) installed.
Boot to NVMe. To allow seeing the error messages, now disable the plymouth splash screen as root via the commands:
Enable IOMMU in BIOS.
Reboot to NVMe.
OR
Expected behavior
Boot should continue without hanging, the LUKS passphrase prompt should appear, and one should be able to enter QubesOS after typing the passphrase.
Actual behavior
initramfs hangs at splash screen. If plymouth is disabled, after waiting for 3 to 5 minutes, NVMe timeout messages will appear in dmesg and be printed on the screen, similar to:
Workaround
Disable Simultaneous Multi-Threading (SMT) in firmware, via the UEFI BIOS setup screen (SMT is more commonly known by users as Intel's trademark "Hyperthreading", and it's worth mentioning it in the documentation).
Other workarounds include dom0_max_vcpus=1 dom0_vcpus_pin, previously described in other bug reports.
References
[1] Hang booting Dom0: nvme timeout, completion polled
https://lists.xenproject.org/archives/html/xen-users/2023-03/msg00001.html
[2] Installer does not boot - nvme timeout completion polled
https://forum.qubes-os.org/t/installer-does-not-boot-nvme-timeout-completion-polled/13639/2
[3] GPD Win Max 2 - Unable to boot installer
https://forum.qubes-os.org/t/gpd-win-max-2-unable-to-boot-installer/14466