QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/
sys-net cannot be started (shuts down very soon) after installation of 4.2.2 and 4.2.3 #9495

Open sjvudp opened 2 weeks ago

sjvudp commented 2 weeks ago

Qubes OS release

4.2.3

Brief summary

I installed Qubes OS 4.2.2 on an external NVME SSD connected via USB (I had used the same device successfully for an older Qubes OS on a different computer). After the reboot, an error was displayed.

[image]

Installation finished, but sys-net would shut itself down rather quickly, and without sys-net no qube can be started. Obviously, anything fixable through a downloadable update would not work.

Maybe it has to do with some disk that cannot be mounted, like this:

[image]

Maybe this message causes it, but I don't know:

[image]

I suspect that it might also be CPU-related, as the laptop (Lenovo ThinkPad P16 Gen2) has an Intel i9-13980HX CPU. Since another Linux shows 32 CPUs and the machine has 96GB RAM, it would be a nice machine to run Qubes OS, I guess.

Here are some more details:

[image]

After that I connected the NVME device to the other computer that ran Qubes OS before, but there also sys-net won't start, so maybe it's not really a CPU issue. The only difference was that I had to repartition the NVME for GPT (before it was MBR) as the new laptop does not support "legacy boot" any more.

After that I downloaded the latest 4.2.3 installation medium and installed Qubes OS again, but the result was the very same: sys-net terminates rather quickly.

Steps to reproduce

Install Qubes OS

Expected behavior

Qubes OS starts after installation

Actual behavior

Qubes OS is unusable after installation.

andrewdavidwong commented 2 weeks ago

Duplicate of #9440?

sjvudp commented 2 weeks ago

Duplicate of #9440?

Well, I read that, but an incorrect system time causing qubes to fail to boot would be a very odd bug. Besides that, the BIOS clock may be off by up to 2 hours (UTC vs. CEST).

marmarek commented 2 weeks ago

2h is not a problem; several weeks/months is

sjvudp commented 2 weeks ago

What type of log (if any) would help to diagnose (and more importantly: fix) the problem?

sjvudp commented 2 weeks ago

OK, I managed to mount the disk in my openSUSE system, so I can provide some more details.

Inspecting "rpm -qa --last", the oldest package was installed at "Mon Oct 7 20:33:40 202" (the last on "Mon Oct 7 20:39:43 2024"), so the time/date should not be an issue.

Inspecting the journal for unusual messages, I found these:

Oct 07 20:42:47 localhost kernel: pci 10000:e0:06.0: Failed to add - passthrough or MSI/MSI-X might fail!
Oct 07 20:42:47 localhost kernel: ucsi_acpi USBC000:00: UCSI_GET_PDOS failed (-95)
Oct 07 20:42:47 localhost kernel: nouveau 0000:01:00.0: unknown chipset (197000a1)

(I have an AD107GLM [RTX 2000 Ada Generation Laptop GPU], BTW)

Oct 07 20:42:48 localhost kernel: tmpfs: Unsupported parameter 'huge'
Oct 07 20:43:13 localhost kernel: Bluetooth: hci0: Malformed MSFT vendor event: 0x02
Oct 07 20:43:14 localhost fedora-dmraid-activation[4858]: /lib/systemd/fedora-dmraid-activation: line 5: /etc/init.d/functions: No such file or directory
Oct 07 20:43:14 localhost fedora-dmraid-activation[4860]: /lib/systemd/fedora-dmraid-activation: line 9: strstr: command not found
Oct 07 20:43:14 localhost fedora-dmraid-activation[4863]: Removed "/etc/systemd/system/sysinit.target.wants/dmraid-activation.service".

Oct 07 20:43:20 localhost anaconda[7346]: anaconda: ui.gui.hubs: Initialization controller for hub InitialSetupMainHub expected but missing.

Oct 07 20:47:49 dom0 sudo[8085]: PAM unable to dlopen(/usr/lib64/security/pam_sss.so): /usr/lib64/security/pam_sss.so: cannot open shared object file: No such file or directory
Oct 07 20:47:49 dom0 sudo[8085]: PAM adding faulty module: /usr/lib64/security/pam_sss.so
Oct 07 20:52:09 dom0 kernel: xen-blkback: backend/vbd/1/51712: using 2 queues, protocol 1 (x86_64-abi) persistent grants
Oct 07 20:52:09 dom0 kernel: xen-blkback: backend/vbd/1/51728: using 2 queues, protocol 1 (x86_64-abi) persistent grants
Oct 07 20:52:09 dom0 kernel: xen-blkback: backend/vbd/1/51744: using 2 queues, protocol 1 (x86_64-abi) persistent grants
Oct 07 20:52:09 dom0 kernel: xen-blkback: backend/vbd/1/51760: using 2 queues, protocol 1 (x86_64-abi) persistent grants
Oct 07 20:52:10 dom0 runuser[8953]: pam_unix(runuser:session): session closed for user master
Oct 07 20:52:10 dom0 audit[8953]: USER_END pid=8953 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:session_close grantors=pam_keyinit,pam_limits,pam_unix acct="master" exe="/usr/sbin/runuser" hostname=? addr=? terminal=? res=success'
Oct 07 20:52:10 dom0 audit[8953]: CRED_DISP pid=8953 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_rootok acct="master" exe="/usr/sbin/runuser" hostname=? addr=? terminal=? res=success'
Oct 07 20:52:10 dom0 qubesd[5000]: vm.debian-12-xfce: Start failed: Cannot connect to qrexec agent for 60 seconds, see /var/log/xen/console/guest-debian-12-xfce.log for details
Oct 07 21:11:35 dom0 kernel: xen-blkback: backend/vbd/5/51712: using 2 queues, protocol 1 (x86_64-abi) persistent grants
Oct 07 21:11:35 dom0 kernel: xen-blkback: backend/vbd/5/51728: using 2 queues, protocol 1 (x86_64-abi) persistent grants
Oct 07 21:11:35 dom0 kernel: xen-blkback: backend/vbd/5/51744: using 2 queues, protocol 1 (x86_64-abi) persistent grants
Oct 07 21:11:35 dom0 kernel: xen-blkback: backend/vbd/5/51760: using 2 queues, protocol 1 (x86_64-abi) persistent grants
Oct 07 21:11:36 dom0 libvirtd[5033]: libvirt version: 8.9.0, package: 7.fc37 (Unknown, 2024-05-01-19:12:11, )
Oct 07 21:11:36 dom0 libvirtd[5033]: hostname: localhost
Oct 07 21:11:36 dom0 libvirtd[5033]: internal error: libxenlight failed to create new domain 'sys-firewall'
Oct 07 21:11:36 dom0 qubesd[5000]: vm.sys-firewall: Start failed: internal error: libxenlight failed to create new domain 'sys-firewall'

Messages in /var/log/libvirt/libxl/libxl-driver.log are:

2024-10-07 19:11:36.011+0000: libxl: libxl_device.c:1200:device_backend_callback: Domain 7:unable to add device with path /local/domain/5/backend/vif/7/0
2024-10-07 19:11:36.011+0000: libxl: libxl_create.c:2000:domcreate_attach_devices: Domain 7:unable to add vif devices

In /var/log/xen/console/guest-sys-net-dm.log I found: ...

[2024-10-07 21:11:32] from-unix: {"QMP": {"version": {"qemu": {"micro": 2, "minor": 1, "major": 8}, "package": ""}, "capabilities": ["oob"]}}
[2024-10-07 21:11:32] 
[2024-10-07 21:11:32] wrote 110 bytes to vchan
[2024-10-07 21:11:32] from-vchan: ^A{"execute":"qmp_capabilities","id":2020372737}
[2024-10-07 21:11:32] 
[2024-10-07 21:11:32] from-unix: {"return": {}, "id": 2020372737}
[2024-10-07 21:11:32] 
[2024-10-07 21:11:32] wrote 34 bytes to vchan
[2024-10-07 21:11:32] from-vchan: {"execute":"device_add","id":2020372736,"arguments":{"driver":"xen-pci-passthrough","id":"pci-pt-00_14.3","hostaddr":"0000:00:14.3"}}
[2024-10-07 21:11:32] 
[2024-10-07 21:11:32] [00:06.0] xen_pt_realize: Assigning real physical device 00:14.3 to devfn 0x30
[2024-10-07 21:11:32] [00:06.0] xen_pt_register_regions: IO region 0 registered (size=0x00004000 base_addr=0x6256634000 type: 0x4)
[2024-10-07 21:11:32] [00:06.0] xen_pt_config_reg_init: Offset 0x0010 mismatch! Emulated=0x0000, host=0x56634004, syncing to 0x56634004.
[2024-10-07 21:11:32] [00:06.0] xen_pt_config_reg_init: Offset 0x0014 mismatch! Emulated=0x0000, host=0x0062, syncing to 0x0062.
[2024-10-07 21:11:32] [00:06.0] xen_pt_config_reg_init: Offset 0x00ca mismatch! Emulated=0x0000, host=0x0023, syncing to 0x0023.
[2024-10-07 21:11:32] [00:06.0] xen_pt_pm_ctrl_reg_init_off: PCI power management control passthrough is off
[2024-10-07 21:11:32] [00:06.0] xen_pt_config_reg_init: Offset 0x00d2 mismatch! Emulated=0x0000, host=0x0080, syncing to 0x0080.
[2024-10-07 21:11:32] [00:06.0] xen_pt_config_reg_init: Offset 0x0044 mismatch! Emulated=0x0000, host=0x10000ec0, syncing to 0x0ec0.
[2024-10-07 21:11:32] [00:06.0] xen_pt_config_reg_init: Offset 0x004a mismatch! Emulated=0x0000, host=0x0010, syncing to 0x0010.
[2024-10-07 21:11:32] [00:06.0] xen_pt_msix_init: get MSI-X table BAR base 0x6256634000
[2024-10-07 21:11:32] [00:06.0] xen_pt_config_reg_init: Offset 0x0082 mismatch! Emulated=0x0000, host=0x000f, syncing to 0x000f.
[2024-10-07 21:11:32] [00:06.0] xen_pt_pci_intx: intx=3
[2024-10-07 21:11:32] [00:06.0] xen_pt_realize: Real physical device 00:14.3 registered successfully
[2024-10-07 21:11:32] 
[2024-10-07 21:11:32] ==== Press enter for shell ====

...

In /var/log/xen/console/guest-sys-net.log I found: ...

[2024-10-07 21:11:35] [    0.960624] Run /init as init process
[2024-10-07 21:11:35] Qubes initramfs script here:
[2024-10-07 21:11:35] [    0.964147] Invalid max_queues (4), will use default max: 2.
[2024-10-07 21:11:35] [    0.970856] blkfront: xvda: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled
[2024-10-07 21:11:35] [    0.973361] blkfront: xvdb: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled
[2024-10-07 21:11:35] [    0.975636] blkfront: xvdc: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled
[2024-10-07 21:11:35] [    0.977760] blkfront: xvdd: barrier or flush: disabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled
[2024-10-07 21:11:35] Waiting for /dev/xvda* devices...
[2024-10-07 21:11:35] Qubes: Doing R/W setup for TemplateVM...
[2024-10-07 21:11:35] [    1.352305]  xvdc: xvdc1 xvdc3
[2024-10-07 21:11:35] Setting up swapspace version 1, size = 1073737728 bytes
[2024-10-07 21:11:35] UUID=28eee439-a52f-4676-aecb-6ff3a8eae73d
[2024-10-07 21:11:35] Qubes: done.
[2024-10-07 21:11:35] mount: mounting /dev/mapper/dmroot on /sysroot failed: Invalid argument
[2024-10-07 21:11:35] Waiting for /dev/xvdd device...
[2024-10-07 21:11:35] [    1.383884] EXT4-fs (xvdd): mounting ext3 file system using the ext4 subsystem
[2024-10-07 21:11:35] [    1.386592] EXT4-fs (xvdd): mounted filesystem 212ddd52-f4ea-4d93-9ad8-d88854e31cc4 ro with ordered data mode. Quota mode: none.
[2024-10-07 21:11:35] mount: mounting none on /sysroot/lib/modules failed: No such file or directory
[2024-10-07 21:11:35] [    1.397507] EXT4-fs (xvdd): unmounting filesystem 212ddd52-f4ea-4d93-9ad8-d88854e31cc4.
[2024-10-07 21:11:35] mount: can't read '/proc/mounts': No such file or directory
[2024-10-07 21:11:35] BusyBox v1.36.0 (2023-01-10 00:00:00 UTC) multi-call binary.
[2024-10-07 21:11:35] 
[2024-10-07 21:11:35] Usage: switch_root [-c CONSOLE_DEV] NEW_ROOT NEW_INIT [ARGS]
[2024-10-07 21:11:35] 
[2024-10-07 21:11:35] Free initramfs and switch to another root fs:
[2024-10-07 21:11:35] chroot to NEW_ROOT, delete all in /, move NEW_ROOT to /,
[2024-10-07 21:11:35] execute NEW_INIT. PID must be 1. NEW_ROOT must be a mountpoint.
[2024-10-07 21:11:35] 
[2024-10-07 21:11:35]   -c DEV  Reopen stdio to DEV after switch
[2024-10-07 21:11:35] [    1.404139] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
[2024-10-07 21:11:35] [    1.404160] CPU: 1 PID: 1 Comm: switch_root Not tainted 6.6.48-1.qubes.fc37.x86_64 #1
[2024-10-07 21:11:35] [    1.404179] Hardware name: Xen HVM domU, BIOS 4.17.4 07/20/2024
[2024-10-07 21:11:35] [    1.404195] Call Trace:
[2024-10-07 21:11:35] [    1.404205]  <TASK>
[2024-10-07 21:11:35] [    1.404213]  dump_stack_lvl+0x47/0x60
[2024-10-07 21:11:35] [    1.404229]  panic+0x345/0x360
[2024-10-07 21:11:35] [    1.404243]  do_exit+0x4ea/0x510
[2024-10-07 21:11:35] [    1.404255]  ? do_syscall_64+0x65/0x80
[2024-10-07 21:11:35] [    1.404268]  do_group_exit+0x31/0x80
[2024-10-07 21:11:35] [    1.404279]  __x64_sys_exit_group+0x18/0x20
[2024-10-07 21:11:35] [    1.404289]  do_syscall_64+0x59/0x80
[2024-10-07 21:11:35] [    1.404300]  ? syscall_exit_to_user_mode+0x22/0x40
[2024-10-07 21:11:35] [    1.404316]  ? do_syscall_64+0x65/0x80
[2024-10-07 21:11:35] [    1.404327]  ? do_user_addr_fault+0x323/0x630
[2024-10-07 21:11:35] [    1.404341]  ? exc_page_fault+0x77/0x170
[2024-10-07 21:11:35] [    1.404353]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[2024-10-07 21:11:35] [    1.404368] RIP: 0033:0x423f5e
[2024-10-07 21:11:35] [    1.404380] Code: c3 12 00 48 8b 35 6a c3 12 00 48 8b 05 7b c3 12 00 e9 7c fe ff ff 66 0f 1f 44 00 00 f3 0f 1e fa 48 63 ff b8 e7 00 00 00 0f 05 <ba> 3c 00 00 00 0f 1f 44 00 00 48 89 d0 0f 05 eb f9 90 41 56 41 55

...

/var/log/xen/qmp-proxy-sys-net-dm.log is empty.

The VG looks like this:

# vgs
  VG         #PV #LV #SN Attr   VSize   VFree 
  qubes_dom0   1  36   0 wz--n- 159.98g 20.00m

The LVs look like this:

# lvs
  LV                                               VG         Attr       LSize   Pool      Origin                                           Data%  Meta%  Move Log Cpy%Sync Convert
  root                                             qubes_dom0 Vwi-aotz--  20.00g root-pool                                                  23.08                                  
  root-pool                                        qubes_dom0 twi-aotz--  20.00g                                                            23.08  18.34                           
  swap                                             qubes_dom0 -wi-a-----   4.00g                                                                                                   
  vm                                               qubes_dom0 Vwi-a-tz-- 135.81g vm-pool                                                    0.75                                   
  vm-anon-whonix-private                           qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-debian-12-xfce-private                        qubes_dom0 Vwi-a-tz--   2.00g vm-pool   vm-debian-12-xfce-private-1728327130-back        0.00                                   
  vm-debian-12-xfce-private-1728327130-back        qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-debian-12-xfce-root                           qubes_dom0 Vwi-a-tz--  20.00g vm-pool   vm-debian-12-xfce-root-1728327130-back           25.50                                  
  vm-debian-12-xfce-root-1728327126-back           qubes_dom0 Vwi-a-tz--  10.00g vm-pool                                                    0.00                                   
  vm-debian-12-xfce-root-1728327130-back           qubes_dom0 Vwi-a-tz--  20.00g vm-pool                                                    25.50                                  
  vm-default-dvm-private                           qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-default-mgmt-dvm-private                      qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-fedora-40-xfce-private                        qubes_dom0 Vwi-a-tz--   2.00g vm-pool   vm-fedora-40-xfce-private-1728327631-back        0.00                                   
  vm-fedora-40-xfce-private-1728327631-back        qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-fedora-40-xfce-root                           qubes_dom0 Vwi-a-tz--  20.00g vm-pool   vm-fedora-40-xfce-root-1728327631-back           25.64                                  
  vm-fedora-40-xfce-root-1728327627-back           qubes_dom0 Vwi-a-tz--  10.00g vm-pool                                                    0.00                                   
  vm-fedora-40-xfce-root-1728327631-back           qubes_dom0 Vwi-a-tz--  20.00g vm-pool                                                    25.64                                  
  vm-personal-private                              qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-pool                                          qubes_dom0 twi-aotz-- 135.81g                                                            11.92  13.91                           
  vm-sys-net-private                               qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-sys-net-private-1728329588-back               qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-sys-whonix-private                            qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-untrusted-private                             qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-vault-private                                 qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-whonix-gateway-17-private                     qubes_dom0 Vwi-a-tz--   2.00g vm-pool   vm-whonix-gateway-17-private-1728327871-back     0.00                                   
  vm-whonix-gateway-17-private-1728327871-back     qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-whonix-gateway-17-root                        qubes_dom0 Vwi-a-tz--  20.00g vm-pool   vm-whonix-gateway-17-root-1728327871-back        9.82                                   
  vm-whonix-gateway-17-root-1728327867-back        qubes_dom0 Vwi-a-tz--  10.00g vm-pool                                                    0.00                                   
  vm-whonix-gateway-17-root-1728327871-back        qubes_dom0 Vwi-a-tz--  20.00g vm-pool                                                    9.82                                   
  vm-whonix-workstation-17-dvm-private             qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-whonix-workstation-17-private                 qubes_dom0 Vwi-a-tz--   2.00g vm-pool   vm-whonix-workstation-17-private-1728328200-back 0.00                                   
  vm-whonix-workstation-17-private-1728328200-back qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   
  vm-whonix-workstation-17-root                    qubes_dom0 Vwi-a-tz--  20.00g vm-pool   vm-whonix-workstation-17-root-1728328200-back    14.91                                  
  vm-whonix-workstation-17-root-1728328196-back    qubes_dom0 Vwi-a-tz--  10.00g vm-pool                                                    0.00                                   
  vm-whonix-workstation-17-root-1728328200-back    qubes_dom0 Vwi-a-tz--  20.00g vm-pool                                                    14.91                                  
  vm-work-private                                  qubes_dom0 Vwi-a-tz--   2.00g vm-pool                                                    0.00                                   

The failure messages for the Debian-12 were like this: ...

[2024-10-07 20:52:09] [    0.598369] xenbus_probe_frontend: Device with no driver: device/vbd/51712
[2024-10-07 20:52:09] [    0.598386] xenbus_probe_frontend: Device with no driver: device/vbd/51728
[2024-10-07 20:52:09] [    0.598400] xenbus_probe_frontend: Device with no driver: device/vbd/51744
[2024-10-07 20:52:09] [    0.598414] xenbus_probe_frontend: Device with no driver: device/vbd/51760
[2024-10-07 20:52:09] [    0.598662] hid_bpf: error while preloading HID BPF dispatcher: -22
[2024-10-07 20:52:09] [    0.598669] RAS: Correctable Errors collector initialized.
[2024-10-07 20:52:09] [    0.598719] clk: Disabling unused clocks
[2024-10-07 20:52:09] [    0.600247] Freeing unused decrypted memory: 2028K
[2024-10-07 20:52:09] [    0.601723] Freeing unused kernel image (initmem) memory: 5140K
[2024-10-07 20:52:09] [    0.601741] Write protecting the kernel read-only data: 28672k
[2024-10-07 20:52:09] [    0.602344] Freeing unused kernel image (rodata/data gap) memory: 1420K
[2024-10-07 20:52:09] [    0.602365] Run /init as init process
[2024-10-07 20:52:09] Qubes initramfs script here:
[2024-10-07 20:52:09] [    0.605546] Invalid max_queues (4), will use default max: 2.
[2024-10-07 20:52:09] [    0.611318] blkfront: xvda: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled
[2024-10-07 20:52:09] [    0.614226] blkfront: xvdb: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled
[2024-10-07 20:52:09] [    0.616127] blkfront: xvdc: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled
[2024-10-07 20:52:09] [    0.618229] blkfront: xvdd: barrier or flush: disabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled
[2024-10-07 20:52:09] Waiting for /dev/xvda* devices...
[2024-10-07 20:52:09] Qubes: Doing R/W setup for TemplateVM...
[2024-10-07 20:52:09] [    0.987498]  xvdc: xvdc1 xvdc3
[2024-10-07 20:52:09] Setting up swapspace version 1, size = 1073737728 bytes
[2024-10-07 20:52:09] UUID=4f21df50-5882-4562-b052-0967a24220b0
[2024-10-07 20:52:09] Qubes: done.
[2024-10-07 20:52:09] mount: mounting /dev/mapper/dmroot on /sysroot failed: Invalid argument
[2024-10-07 20:52:09] Waiting for /dev/xvdd device...
[2024-10-07 20:52:09] [    1.017700] EXT4-fs (xvdd): mounting ext3 file system using the ext4 subsystem
[2024-10-07 20:52:09] [    1.025302] EXT4-fs (xvdd): mounted filesystem 212ddd52-f4ea-4d93-9ad8-d88854e31cc4 ro with ordered data mode. Quota mode: none.
[2024-10-07 20:52:09] mount: mounting none on /sysroot/lib/modules failed: No such file or directory
[2024-10-07 20:52:09] [    1.034338] EXT4-fs (xvdd): unmounting filesystem 212ddd52-f4ea-4d93-9ad8-d88854e31cc4.
[2024-10-07 20:52:09] mount: can't read '/proc/mounts': No such file or directory
[2024-10-07 20:52:09] BusyBox v1.36.0 (2023-01-10 00:00:00 UTC) multi-call binary.
[2024-10-07 20:52:09] 
[2024-10-07 20:52:09] Usage: switch_root [-c CONSOLE_DEV] NEW_ROOT NEW_INIT [ARGS]
[2024-10-07 20:52:09] 
[2024-10-07 20:52:09] Free initramfs and switch to another root fs:
[2024-10-07 20:52:09] chroot to NEW_ROOT, delete all in /, move NEW_ROOT to /,
[2024-10-07 20:52:09] execute NEW_INIT. PID must be 1. NEW_ROOT must be a mountpoint.
[2024-10-07 20:52:09] 
[2024-10-07 20:52:09]   -c DEV  Reopen stdio to DEV after switch
[2024-10-07 20:52:09] [    1.042803] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
[2024-10-07 20:52:09] [    1.042829] CPU: 0 PID: 1 Comm: switch_root Not tainted 6.6.48-1.qubes.fc37.x86_64 #1
[2024-10-07 20:52:09] [    1.042856] Call Trace:
[2024-10-07 20:52:09] [    1.042871]  <TASK>
[2024-10-07 20:52:09] [    1.042882]  dump_stack_lvl+0x47/0x60
[2024-10-07 20:52:09] [    1.042902]  panic+0x345/0x360
[2024-10-07 20:52:09] [    1.042921]  do_exit+0x4ea/0x510
[2024-10-07 20:52:09] [    1.042937]  ? vfs_write+0x23b/0x420
[2024-10-07 20:52:09] [    1.042955]  do_group_exit+0x31/0x80
[2024-10-07 20:52:09] [    1.042970]  __x64_sys_exit_group+0x18/0x20
[2024-10-07 20:52:09] [    1.042985]  do_syscall_64+0x59/0x80
[2024-10-07 20:52:09] [    1.043002]  ? ksys_write+0x6f/0xf0
[2024-10-07 20:52:09] [    1.043017]  ? syscall_exit_to_user_mode+0x22/0x40
[2024-10-07 20:52:09] [    1.043038]  ? do_syscall_64+0x65/0x80
[2024-10-07 20:52:09] [    1.043053]  ? xen_sched_clock+0x15/0x30
[2024-10-07 20:52:09] [    1.043068]  ? sched_clock+0x10/0x30
[2024-10-07 20:52:09] [    1.043084]  ? sched_clock_cpu+0xf/0x190
[2024-10-07 20:52:09] [    1.043100]  ? irqtime_account_irq+0x40/0xc0
[2024-10-07 20:52:09] [    1.043119]  ? __irq_exit_rcu+0x4b/0xd0
[2024-10-07 20:52:09] [    1.043135]  ? sysvec_xen_hvm_callback+0x3e/0x90
[2024-10-07 20:52:09] [    1.043154]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[2024-10-07 20:52:09] [    1.043174] RIP: 0033:0x423f5e
[2024-10-07 20:52:09] [    1.043189] Code: c3 12 00 48 8b 35 6a c3 12 00 48 8b 05 7b c3 12 00 e9 7c fe ff ff 66 0f 1f 44 00 00 f3 0f 1e fa 48 63 ff b8 e7 00 00 00 0f 05 <ba> 3c 00 00 00 0f 1f 44 00 00 48 89 d0 0f 05 eb f9 90 41 56 41 55

...
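The two console logs above fail at the same points: the root mount on /sysroot, then the kernel panic once switch_root exits. As a rough sketch (not a Qubes tool; the function name and the two patterns are illustrative and only cover the messages seen in these logs), such lines can be pulled out of a guest console log like this:

```python
import re

# Patterns taken from the messages in the logs above; other failure modes
# would need additional patterns.
MOUNT_FAIL = re.compile(r"mount: mounting (\S+) on (\S+) failed: (.+)$")
PANIC = re.compile(r"Kernel panic - not syncing: (.+)$")


def summarize_console_log(text):
    """Return a list of (kind, detail) tuples for known failure lines."""
    findings = []
    for line in text.splitlines():
        match = MOUNT_FAIL.search(line)
        if match:
            source, target, reason = match.groups()
            findings.append(("mount-failed", f"{source} on {target}: {reason}"))
            continue
        match = PANIC.search(line)
        if match:
            findings.append(("kernel-panic", match.group(1)))
    return findings
```

Fed the contents of a file such as /var/log/xen/console/guest-sys-net.log, it returns the failed mounts and the panic reason in the order they occurred.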

marmarek commented 2 weeks ago

[2024-10-07 20:52:09] mount: mounting /dev/mapper/dmroot on /sysroot failed: Invalid argument

Interesting, this looks like the template root filesystem is broken/empty. But on lvs I see:

vm-fedora-40-xfce-root qubes_dom0 Vwi-a-tz-- 20.00g vm-pool vm-fedora-40-xfce-root-1728327631-back 25.64

so, it isn't empty. Could it maybe be https://github.com/QubesOS/qubes-issues/issues/4974? What does fdisk -l on the dom0 disk say regarding sector size?
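Besides fdisk -l, the sector sizes can also be read from sysfs, where Linux exposes them per block device under /sys/block/<name>/queue/. A minimal sketch follows; the function names and the sysfs_root parameter are illustrative and not part of any Qubes tool:

```python
from pathlib import Path
from typing import Tuple


def read_sector_sizes(device: str, sysfs_root: str = "/sys/block") -> Tuple[int, int]:
    """Return (logical, physical) sector size in bytes for a block device.

    `device` is a kernel name such as "sda" or "dm-3".
    """
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text().strip())
    physical = int((queue / "physical_block_size").read_text().strip())
    return logical, physical


def logical_sizes_differ(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True when two devices disagree on the logical sector size."""
    return a[0] != b[0]
```

Calling read_sector_sizes("sda") on the affected machine would be one way to compare the physical disk against a device-mapper node that backs a template volume.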

Oct 07 20:42:47 localhost kernel: pci 10000:e0:06.0: Failed to add - passthrough or MSI/MSI-X might fail!

The message about MSI/MSI-X is not the interesting part. The device BDF is: 0x10000 is not a valid PCI domain. Is it maybe some built-in Intel RAID? It isn't necessarily the source of the problems, but I'd still recommend disabling it in the firmware setup menu.
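To illustrate the point about the domain: a Linux PCI address has the form domain:bus:device.function in hexadecimal. Assuming the domain (segment) number has to fit in 16 bits, which is what the remark above implies, 0x10000 is one past the largest value. The helper names below are hypothetical:

```python
import re

# Linux prints PCI addresses as domain:bus:device.function, all hexadecimal.
BDF_RE = re.compile(
    r"^([0-9a-fA-F]+):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)

# Assumption: the domain (segment) number is a 16-bit quantity.
MAX_DOMAIN = 0xFFFF


def parse_bdf(address):
    """Split a PCI address into (domain, bus, device, function) integers."""
    match = BDF_RE.match(address)
    if match is None:
        raise ValueError(f"not a PCI address: {address!r}")
    return tuple(int(part, 16) for part in match.groups())


def domain_fits_16_bits(address):
    """True when the domain part does not exceed 0xFFFF."""
    return parse_bdf(address)[0] <= MAX_DOMAIN
```

With the addresses from this thread, "10000:e0:06.0" fails the check while "0000:00:14.3" (the device passed through to sys-net) passes.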

sjvudp commented 1 week ago

[2024-10-07 20:52:09] mount: mounting /dev/mapper/dmroot on /sysroot failed: Invalid argument

What does fdisk -l on the dom0 disk say regarding sector size?

First smartctl outputs: ...

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

...

Fdisk reports for sda (NVME):

Disk /dev/sda: 238.47 GiB, 256060514304 bytes, 500118192 sectors
Disk model: 6GB SSD         
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 9CB9F2ED-E3FF-46F5-AFE9-C126267C20F6

Device         Start       End   Sectors  Size Type
/dev/sda1       2048   1050623   1048576  512M EFI System
/dev/sda2    1050624   3147775   2097152    1G Linux filesystem
/dev/sda3    3147776 338692095 335544320  160G Linux reserved
/dev/sda4  338692096 500117503 161425408   77G Linux reserved

And for the Fedora LV:

GPT PMBR size mismatch (41943039 != 5242879) will be corrected by write.
Disk /dev/qubes_dom0/vm-fedora-40-xfce-root: 20 GiB, 21474836480 bytes, 5242880 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 131072 bytes / 131072 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Device                                  Boot Start     End Sectors Size Id Type
/dev/qubes_dom0/vm-fedora-40-xfce-root1          1 5242879 5242879  20G ee GPT

Partition 1 does not start on physical sector boundary.

Apart from the smartctl output, the above is from dom0, and here is what the openSUSE kernel (6.11.1-lp155.2.g3bf25fe-default) says for sda:

Oct 10 21:21:13 localhost kernel: sd 0:0:0:0: [sda] Spinning up disk...
Oct 10 21:21:14 localhost kernel: sd 0:0:0:0: [sda] 500118192 512-byte logical blocks: (256 GB/238 GiB)
Oct 10 21:21:14 localhost kernel: sd 0:0:0:0: [sda] 4096-byte physical blocks
Oct 10 21:21:14 localhost kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 10 21:21:14 localhost kernel: sd 0:0:0:0: [sda] Mode Sense: 5f 00 00 08
Oct 10 21:21:14 localhost kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 10 21:21:14 localhost kernel: sd 0:0:0:0: [sda] Preferred minimum I/O size 4096 bytes
Oct 10 21:21:14 localhost kernel: sd 0:0:0:0: [sda] Optimal transfer size 33553920 bytes not a multiple of preferred minimum block size (4096 bytes)

And just in case: The NVME module is in an Icy Box "IB-1817Ma-C31" using a Jmicron USB bridge (ID 152d:0562).

marmarek commented 1 week ago

GPT PMBR size mismatch (41943039 != 5242879) will be corrected by write.
Disk /dev/qubes_dom0/vm-fedora-40-xfce-root: 20 GiB, 21474836480 bytes, 5242880 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes

Yes, this looks exactly like https://github.com/QubesOS/qubes-issues/issues/4974
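The arithmetic behind that fdisk warning can be checked directly. Reading each number in the warning as "sector count minus one" (an interpretation, not something fdisk states), the first value matches a 20 GiB volume counted in 512-byte sectors and the second matches the same volume counted in 4096-byte sectors. A small sketch, using only figures from the output above:

```python
DISK_BYTES = 21474836480   # size of the template LV as reported by fdisk
PMBR_VALUE = 41943039      # first number in the "size mismatch" warning
DEVICE_VALUE = 5242879     # second number in the warning


def implied_sector_size(total_bytes, last_lba):
    """Sector size that makes (last_lba + 1) sectors cover total_bytes."""
    sectors = last_lba + 1
    if total_bytes % sectors != 0:
        raise ValueError("byte count is not a whole number of sectors")
    return total_bytes // sectors


print(implied_sector_size(DISK_BYTES, PMBR_VALUE))    # 512
print(implied_sector_size(DISK_BYTES, DEVICE_VALUE))  # 4096
```

So the partition table inside the template image appears to have been written for 512-byte sectors, while the volume is presented with 4096-byte sectors.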

sjvudp commented 1 week ago

From the "distributor from hell" point of view: Let the user install the system for an hour or so, and when it's finally done, it just doesn't work. We fooled the user!

From my point of view: If it's known that 4kB block sizes do not work, why can't the system say so before installing? And even if it can't, why can't it say so when trying to start a VM (qube)?

The state as it is now is most frustrating (for most prospective users it fails for no obvious reason).

What also puzzles me: I had installed Qubes OS 4.0 (AFAIR) on the very same device, but using MBR partitioning at that time. For some reason that worked.

Most of all: Is there a way to fix the system without reinstalling? Can the external NVME be used at all with GPT?