Open marmarek opened 5 days ago
Qubes OS release: R4.3
When an MSI Z790-P desktop (NitroPC Pro 2) additionally has an Nvidia graphics card plugged in, the first boot after installation fails.
Installation works using the Intel GPU, where the monitor is connected.
The first stage of installation (anaconda) works. It seems to fail to initialize the Nvidia GPU at all, but since the monitor is connected to the Intel GPU, everything is fine:
[ 10.913057] nouveau 0000:01:00.0: vgaarb: deactivate vga console
[ 10.913148] Already setup the GSI :16
[ 10.914719] nouveau 0000:01:00.0: NVIDIA GA106 (b76000a1)
[ 11.006346] nouveau 0000:01:00.0: bios: version 94.06.25.00.50
[ 11.007410] nouveau 0000:01:00.0: fb: 12288 MiB GDDR6
[ 11.019880] nouveau 0000:01:00.0: sec2(acr): mbox 00000007 00000000
[ 11.019953] nouveau 0000:01:00.0: sec2(acr):AHESASC: boot failed: -5
[ 11.019956] nouveau 0000:01:00.0: acr: init failed, -5
[ 11.020056] nouveau 0000:01:00.0: init failed with -5
[ 11.020171] nouveau: DRM-master:00000000:00000080: init failed with -5
[ 11.020175] nouveau 0000:01:00.0: DRM-master: Device allocation failed: -5
[ 11.023440] nouveau: probe of 0000:01:00.0 failed with error -5
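The `-5` that recurs in these nouveau messages is a negated Linux errno value. A minimal Python sketch for mapping such codes to their symbolic names follows; the `LOG` excerpt is copied from the output above, and the helper name `errno_names` is made up for illustration.

```python
import errno
import re

# Excerpt of the nouveau messages quoted above.
LOG = """\
[ 11.019953] nouveau 0000:01:00.0: sec2(acr):AHESASC: boot failed: -5
[ 11.019956] nouveau 0000:01:00.0: acr: init failed, -5
[ 11.023440] nouveau: probe of 0000:01:00.0 failed with error -5
"""

def errno_names(log: str) -> set[str]:
    """Return symbolic errno names for every negative code that ends a log line."""
    codes = re.findall(r"-(\d+)\s*$", log, flags=re.MULTILINE)
    return {errno.errorcode.get(int(code), f"unknown({code})") for code in codes}

print(errno_names(LOG))
```

On Linux, errno 5 is `EIO` (I/O error), so every failure in this excerpt reports the same generic error rather than a more specific cause.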
After reboot, however, the system crashes on boot. The kernel log shows a crash around nouveau driver initialization:
[ 4.089340] BUG: kernel NULL pointer dereference, address: 00000000000000d0
[ 4.089344] #PF: supervisor read access in kernel mode
[ 4.089347] #PF: error_code(0x0000) - not-present page
[ 4.089349] PGD 0 P4D 0
[ 4.089352] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 4.089355] CPU: 5 PID: 563 Comm: (udev-worker) Not tainted 6.6.60-1.qubes.fc41.x86_64 #1
[ 4.089359] Hardware name: Micro-Star International Co., Ltd. MS-7E06/PRO Z790-P WIFI (MS-7E06), BIOS Dasharo (coreboot+UEFI) v0.9.1 01/17/2024
[ 4.089365] RIP: e030:device_del+0x3f/0x3f0
[ 4.089370] Code: 80 00 00 00 53 48 83 ec 20 4c 8b 67 40 65 48 8b 1c 25 28 00 00 00 48 89 5c 24 18 48 89 fb 48 89 ef e8 45 de 4e 00 48 8b 53 48 <0f> b6 82 d0 00 00 00 a8 01 75 09 83 c8 01 88 82 d0 00 00 00 48 89
[ 4.089377] RSP: e02b:ffffc9004111b758 EFLAGS: 00010246
[ 4.089380] RAX: 0000000000000000 RBX: ffff88810b3e1c10 RCX: 0000000000194005
[ 4.089389] RDX: 0000000000000000 RSI: ffffffffc0c548f9 RDI: ffff88810b3e1c90
[ 4.089392] RBP: ffff88810b3e1c90 R08: 0000000000000000 R09: 0000000000000000
[ 4.089395] R10: 0000000000039160 R11: 0000000000000000 R12: 0000000000000000
[ 4.089398] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888102689680
[ 4.089406] FS: 00007ff928052bc0(0000) GS:ffff8881b9940000(0000) knlGS:0000000000000000
[ 4.089529] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4.089532] CR2: 00000000000000d0 CR3: 000000010d6ca000 CR4: 0000000000050660
[ 4.089537] Call Trace:
[ 4.089539] <TASK>
[ 4.089541] ? __die+0x23/0x70
[ 4.089545] ? page_fault_oops+0x94/0x190
[ 4.089548] ? exc_page_fault+0x7f/0x180
[ 4.089552] ? asm_exc_page_fault+0x26/0x30
[ 4.089556] ? device_del+0x3f/0x3f0
[ 4.089558] ? device_del+0x3b/0x3f0
[ 4.089561] platform_device_del+0x25/0x90
[ 4.089564] platform_device_unregister+0x12/0x30
[ 4.089568] sysfb_disable+0x2f/0x80
[ 4.089572] aperture_remove_conflicting_pci_devices+0x8c/0xa0
[ 4.089576] nouveau_drm_probe+0xa4/0x280 [nouveau]
[ 4.089653] ? rpm_resume+0x31a/0x800
[ 4.089656] local_pci_probe+0x42/0xa0
[ 4.089660] pci_call_probe+0x52/0x160
[ 4.089663] pci_device_probe+0x81/0x150
[ 4.089666] ? driver_sysfs_add+0x57/0xc0
[ 4.089668] really_probe+0x19b/0x3e0
[ 4.089671] ? __pfx___driver_attach+0x10/0x10
[ 4.089674] __driver_probe_device+0x78/0x160
[ 4.089677] driver_probe_device+0x1f/0xa0
[ 4.089679] __driver_attach+0xba/0x1c0
[ 4.089681] bus_for_each_dev+0x8c/0xe0
[ 4.089685] bus_add_driver+0x112/0x240
[ 4.089688] driver_register+0x5c/0x100
[ 4.089691] ? __pfx_nouveau_drm_init+0x10/0x10 [nouveau]
[ 4.089748] do_one_initcall+0x5a/0x320
[ 4.089752] do_init_module+0x60/0x230
[ 4.089755] init_module_from_file+0x86/0xc0
[ 4.089759] idempotent_init_module+0x121/0x320
[ 4.089762] __x64_sys_finit_module+0x5e/0xb0
[ 4.089765] do_syscall_64+0x5a/0x80
[ 4.089800] ? vfs_read+0x271/0x340
[ 4.089803] ? vfs_read+0x271/0x340
[ 4.089805] ? ksys_read+0x6d/0xf0
[ 4.089808] ? syscall_exit_to_user_mode+0x22/0x40
[ 4.089811] ? clear_bhb_loop+0x55/0xb0
[ 4.089818] ? clear_bhb_loop+0x55/0xb0
[ 4.089820] ? clear_bhb_loop+0x55/0xb0
[ 4.089823] ? clear_bhb_loop+0x55/0xb0
[ 4.089825] ? clear_bhb_loop+0x55/0xb0
[ 4.089827] entry_SYSCALL_64_after_hwframe+0x78/0xe2
[ 4.089831] RIP: 0033:0x7ff9288f431d
[ 4.089834] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 5a 0f 00 f7 d8 64 89 01 48
[ 4.089841] RSP: 002b:00007fff3bf3b8a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 4.089845] RAX: ffffffffffffffda RBX: 000057305029dc20 RCX: 00007ff9288f431d
[ 4.089848] RDX: 0000000000000000 RSI: 00007ff92740e3bd RDI: 000000000000002b
[ 4.089851] RBP: 00007fff3bf3b960 R08: 0000000000000001 R09: 00007fff3bf3b8f0
[ 4.089854] R10: 0000000000000040 R11: 0000000000000246 R12: 00007ff92740e3bd
[ 4.089858] R13: 0000000000020000 R14: 0000573050290e70 R15: 00005730502a3b00
[ 4.089861] </TASK>
[ 4.089863] Modules linked in: crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic nouveau(+) i915(+) ghash_clmulni_intel drm_ttm_helper drm_exec sha512_ssse3 gpu_sched mxm_wmi xhci_pci sha256_ssse3 drm_buddy i2c_algo_bit video xhci_pci_renesas wmi sha1_ssse3 nvme ttm xhci_hcd nvme_core drm_display_helper nvme_common cec pinctrl_alderlake serio_raw xen_acpi_processor xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput
[ 4.089888] CR2: 00000000000000d0
[ 4.089895] ---[ end trace 0000000000000000 ]---
[ 4.089898] RIP: e030:device_del+0x3f/0x3f0
[ 4.089900] Code: 80 00 00 00 53 48 83 ec 20 4c 8b 67 40 65 48 8b 1c 25 28 00 00 00 48 89 5c 24 18 48 89 fb 48 89 ef e8 45 de 4e 00 48 8b 53 48 <0f> b6 82 d0 00 00 00 a8 01 75 09 83 c8 01 88 82 d0 00 00 00 48 89
[ 4.089907] RSP: e02b:ffffc9004111b758 EFLAGS: 00010246
[ 4.089910] RAX: 0000000000000000 RBX: ffff88810b3e1c10 RCX: 0000000000194005
[ 4.089913] RDX: 0000000000000000 RSI: ffffffffc0c548f9 RDI: ffff88810b3e1c90
[ 4.089917] RBP: ffff88810b3e1c90 R08: 0000000000000000 R09: 0000000000000000
[ 4.089920] R10: 0000000000039160 R11: 0000000000000000 R12: 0000000000000000
[ 4.089923] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888102689680
[ 4.089928] FS: 00007ff928052bc0(0000) GS:ffff8881b9940000(0000) knlGS:0000000000000000
[ 4.089932] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4.089935] CR2: 00000000000000d0 CR3: 000000010d6ca000 CR4: 0000000000050660
[ 4.089940] Kernel panic - not syncing: Fatal exception
[ 4.089956] Kernel Offset: disabled
(XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
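In a kernel call trace, entries prefixed with `?` are addresses found on the stack that the unwinder could not confirm as part of the call chain. The sketch below filters them out to show the confirmed path into the crash. It is only an illustration: `TRACE` is an excerpt of the oops above, and `FRAME` and `reliable_frames` are hypothetical names, not part of any kernel tooling.

```python
import re

# Excerpt of the Call Trace from the oops above, one entry per line.
TRACE = """\
[ 4.089556] ? device_del+0x3f/0x3f0
[ 4.089558] ? device_del+0x3b/0x3f0
[ 4.089561] platform_device_del+0x25/0x90
[ 4.089564] platform_device_unregister+0x12/0x30
[ 4.089568] sysfb_disable+0x2f/0x80
[ 4.089572] aperture_remove_conflicting_pci_devices+0x8c/0xa0
[ 4.089576] nouveau_drm_probe+0xa4/0x280 [nouveau]
[ 4.089653] ? rpm_resume+0x31a/0x800
[ 4.089656] local_pci_probe+0x42/0xa0
"""

# Timestamp, optional "? " marker, then symbol+offset/size.
FRAME = re.compile(r"^\[\s*[\d.]+\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")

def reliable_frames(trace: str) -> list[str]:
    """Return the function names of frames not marked '?', innermost first."""
    frames = []
    for line in trace.splitlines():
        match = FRAME.match(line)
        if match and match.group(1) is None:
            frames.append(match.group(2))
    return frames

print(" <- ".join(reliable_frames(TRACE)))
```

The faulting function itself, `device_del`, comes from the `RIP:` line rather than from the confirmed frames. Read together with the filtered chain, the oops occurs while `nouveau_drm_probe` removes the conflicting system framebuffer device via `sysfb_disable`.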
After a few reboots, it hangs in a similar place:
[ 37.338968] watchdog: BUG: soft lockup - CPU#11 stuck for 26s! [(udev-worker):1193]
[ 37.339044] Modules linked in: nvme_tcp nvme_fabrics nouveau(+) crct10dif_pclmul crc32_pclmul i915(+) crc32c_intel polyval_clmulni polyval_generic mxm_wmi drm_exec gpu_sched drm_buddy 8021q video ghash_clmulni_intel garp wmi i2c_algo_bit mrp sha512_ssse3 xhci_pci stp drm_display_helper llc xhci_pci_renesas sha256_ssse3 rfkill sha1_ssse3 igc nvme cec drm_ttm_helper xhci_hcd nvme_core ttm nvme_common pinctrl_alderlake serio_raw sunrpc dm_crypt dm_round_robin raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 iscsi_ibft scsi_dh_hp_sw squashfs be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi edd xen_acpi_processor xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua ip6_tables ip_tables dm_multipath
[ 37.339115] CPU: 11 PID: 1193 Comm: (udev-worker) Not tainted 6.6.60-1.qubes.fc41.x86_64 #1
[ 37.339120] Hardware name: Micro-Star International Co., Ltd. MS-7E06/PRO Z790-P WIFI (MS-7E06), BIOS Dasharo (coreboot+UEFI) v0.9.1 01/17/2024
[ 37.339126] RIP: e030:xen_hypercall_sched_op+0xa/0x20
[ 37.339133] Code: 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[ 37.339262] RSP: e02b:ffffc90040fbb928 EFLAGS: 00000202
[ 37.339265] RAX: 0000000000000000 RBX: ffff8881b9adff40 RCX: ffffffff812053aa
[ 37.339270] RDX: ffff888101297a28 RSI: ffffc90040fbb948 RDI: 0000000000000003
[ 37.339273] RBP: ffff888102084898 R08: 00000000000000c9 R09: ffff8881188d4000
[ 37.339277] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000000c9
[ 37.339281] R13: ffff8881b9af4ec0 R14: 0000000000300000 R15: 0000000000000000
[ 37.339288] FS: 00007c6e4a1cabc0(0000) GS:ffff8881b9ac0000(0000) knlGS:0000000000000000
[ 37.339293] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 37.339296] CR2: 00007c6e4a09f8df CR3: 000000011f95c000 CR4: 0000000000050660
[ 37.339302] Call Trace:
[ 37.339304] <IRQ>
[ 37.339306] ? watchdog_timer_fn+0x1b1/0x220
[ 37.339312] ? __pfx_watchdog_timer_fn+0x10/0x10
[ 37.339316] ? __hrtimer_run_queues+0x12f/0x2a0
[ 37.339321] ? hrtimer_interrupt+0xf8/0x230
[ 37.339325] ? xen_timer_interrupt+0x1f/0x30
[ 37.339329] ? __handle_irq_event_percpu+0x47/0x1a0
[ 37.339334] ? handle_irq_event_percpu+0x13/0x40
[ 37.339338] ? handle_percpu_irq+0x3b/0x60
[ 37.339341] ? generic_handle_irq+0x40/0x60
[ 37.339345] ? __evtchn_fifo_handle_events+0x1b4/0x1e0
[ 37.339350] ? xen_evtchn_do_upcall+0x6d/0xc0
[ 37.339355] ? __xen_pv_evtchn_do_upcall+0x21/0x30
[ 37.339358] ? xen_pv_evtchn_do_upcall+0x85/0xb0
[ 37.339362] </IRQ>
[ 37.339364] <TASK>
[ 37.339366] ? exc_xen_hypervisor_callback+0x8/0x20
[ 37.339370] ? xen_hypercall_sched_op+0xa/0x20
[ 37.339375] ? xen_hypercall_sched_op+0xa/0x20
[ 37.339378] ? xen_poll_irq+0x7d/0xc0
[ 37.339382] ? xen_qlock_wait+0x83/0x90
[ 37.339386] ? __pv_queued_spin_lock_slowpath+0x32d/0x360
[ 37.339390] ? _raw_spin_lock+0x29/0x30
[ 37.339393] ? __mutex_lock.constprop.0+0x130/0x750
[ 37.339398] ? acpi_pci_irq_lookup+0x3b/0x250
[ 37.339403] ? device_del+0x3b/0x3f0
[ 37.339406] ? platform_device_del+0x25/0x90
[ 37.339412] ? platform_device_unregister+0x12/0x30
[ 37.339416] ? sysfb_disable+0x2f/0x80
[ 37.339419] ? aperture_remove_conflicting_pci_devices+0x8c/0xa0
[ 37.339424] ? nouveau_drm_probe+0xa4/0x280 [nouveau]
[ 37.339561] ? rpm_resume+0x31a/0x800
[ 37.339564] ? local_pci_probe+0x42/0xa0
[ 37.339568] ? pci_call_probe+0x52/0x160
[ 37.339572] ? pci_device_probe+0x81/0x150
[ 37.339576] ? driver_sysfs_add+0x57/0xc0
[ 37.339578] ? really_probe+0x19b/0x3e0
[ 37.339581] ? __pfx___driver_attach+0x10/0x10
[ 37.339585] ? __driver_probe_device+0x78/0x160
[ 37.339588] ? driver_probe_device+0x1f/0xa0
[ 37.339591] ? __driver_attach+0xba/0x1c0
[ 37.339594] ? bus_for_each_dev+0x8c/0xe0
[ 37.339598] ? bus_add_driver+0x112/0x240
[ 37.339602] ? driver_register+0x5c/0x100
[ 37.339605] ? __pfx_nouveau_drm_init+0x10/0x10 [nouveau]
[ 37.339675] ? do_one_initcall+0x5a/0x320
[ 37.339679] ? do_init_module+0x60/0x230
[ 37.339683] ? init_module_from_file+0x86/0xc0
[ 37.339687] ? idempotent_init_module+0x121/0x320
[ 37.339691] ? __x64_sys_finit_module+0x5e/0xb0
[ 37.339695] ? do_syscall_64+0x5a/0x80
[ 37.339698] ? syscall_exit_to_user_mode+0x22/0x40
[ 37.339702] ? do_syscall_64+0x66/0x80
[ 37.339705] ? clear_bhb_loop+0x55/0xb0
[ 37.339708] ? clear_bhb_loop+0x55/0xb0
[ 37.339711] ? clear_bhb_loop+0x55/0xb0
[ 37.339714] ? clear_bhb_loop+0x55/0xb0
[ 37.339717] ? clear_bhb_loop+0x55/0xb0
[ 37.339720] ? entry_SYSCALL_64_after_hwframe+0x78/0xe2
[ 37.339724] </TASK>
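To make "a similar place" concrete, the two traces can be compared by function name. The following is a rough sketch only: `CRASH` and `LOCKUP` are excerpts of the two traces above with timestamps dropped for brevity, and `SYMBOL`, `symbols`, and `shared_frames` are hypothetical helper names.

```python
import re

# Excerpts of the two traces quoted above, flattened onto single lines.
CRASH = (
    "? device_del+0x3b/0x3f0 platform_device_del+0x25/0x90 "
    "platform_device_unregister+0x12/0x30 sysfb_disable+0x2f/0x80 "
    "aperture_remove_conflicting_pci_devices+0x8c/0xa0 "
    "nouveau_drm_probe+0xa4/0x280 [nouveau]"
)
LOCKUP = (
    "? _raw_spin_lock+0x29/0x30 ? __mutex_lock.constprop.0+0x130/0x750 "
    "? acpi_pci_irq_lookup+0x3b/0x250 ? device_del+0x3b/0x3f0 "
    "? platform_device_del+0x25/0x90 ? platform_device_unregister+0x12/0x30 "
    "? sysfb_disable+0x2f/0x80 "
    "? aperture_remove_conflicting_pci_devices+0x8c/0xa0 "
    "? nouveau_drm_probe+0xa4/0x280 [nouveau]"
)

# A trace entry has the form symbol+0xOFFSET/0xSIZE.
SYMBOL = re.compile(r"([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")

def symbols(trace: str) -> list[str]:
    """Return every function name in a trace, in order of appearance."""
    return SYMBOL.findall(trace)

def shared_frames(first: str, second: str) -> list[str]:
    """Return function names from `first` that also occur in `second`."""
    in_second = set(symbols(second))
    return [name for name in symbols(first) if name in in_second]

print(shared_frames(CRASH, LOCKUP))
```

Every frame from `device_del` up to `nouveau_drm_probe` appears in both excerpts. Because all entries in the lockup trace are marked `?`, this comparison is suggestive rather than conclusive.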
The same happens when booting Linux directly (without Xen).
Linux 6.11.6 does not have this issue.