QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/

debian-12 fails to start with in-VM kernel #8505

Closed 3hhh closed 4 months ago

3hhh commented 1 year ago


Qubes OS release

4.1

Brief summary

[2023-09-10 11:21:33] .[30m.[47mWelcome to GRUB!
[2023-09-10 11:21:33] 
[2023-09-10 11:21:33] .[37m.[40m.[37m.[40m.[37m.[40m.[3;34H      [ grub-xen.cfg  424B  100%  11.50KiB/s ].[3;1Herror: no such device: /boot/xen/pvboot-x86_64.elf.
[2023-09-10 11:21:34] Reading (xen/xvda,gpt3/boot/grub/grub.cfg
[2023-09-10 11:21:34] .[H.[J.[1;1Herror: file `/boot/grub/fonts/unicode.pf2' not found.
[2023-09-10 11:21:34] error: no suitable video mode found.
[2023-09-10 11:21:34] error: no video mode activated.
[2023-09-10 11:21:34] .[4;34H      [ grub.cfg  15.44KiB  100%  19.21KiB/s ].[4;1H.[H.[J.[1;1H  Booting `Debian GNU/Linux'
[2023-09-10 11:21:34] 
[2023-09-10 11:21:34] Loading Li
pci_unplug: Xen Platform PCI: unrecognised magic value
[2023-09-10 11:21:36] [    0.234185] ACPI: No IOAPIC entries present
[2023-09-10 11:21:36] [    0.295817] PCI: Fatal: No config space access function found
[2023-09-10 11:21:36] [    0.326782] ACPI: OSL: SCI (ACPI GSI 9) not registered
[2023-09-10 11:21:36] [    0.328729] ACPI Error: No handler or method for GPE 00, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:36] [    0.328771] ACPI Error: No handler or method for GPE 01, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:36] [    0.328818] ACPI Error: No handler or method for GPE 03, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:36] [    0.328853] ACPI Error: No handler or method for GPE 04, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:36] [    0.328889] ACPI Error: No handler or method for GPE 05, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:36] [    0.328924] ACPI Error: No handler or method for GPE 06, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:36] [    0.328959] ACPI Error: No handler or method for GPE 07, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:42] .[2J.[3J.[-1;-1fSetting up swapspace version 1, size = 1073737728 bytes
[2023-09-10 11:21:44] UUID=307ee45f-d167-4674-bcbb-2a6eb51098bf
[2023-09-10 11:21:44] /dev/xvda3: clean, 195540/643376 files, 1643068/2569216 blocks
[2023-09-10 11:21:44] mount: mounting /dev/mapper/dmroot on /root failed: No such device
[2023-09-10 11:21:45] Failed to mount /dev/mapper/dmroot as root file system.
[2023-09-10 11:21:45] 
[2023-09-10 11:21:45] 
[2023-09-10 11:21:45] BusyBox v1.35.0 (Debian 1:1.35.0-4+b3) built-in shell (ash)
[2023-09-10 11:21:45] Enter 'help' for a list of built-in commands.

Steps to reproduce

  1. Install debian-12 template via qvm-template.
  2. Switch kernel to pvgrub2-pvh.
  3. Start the template.

Expected behavior

Starts.

Actual behavior

Fails to start.

Notes

My old debian-11 template works just fine that way.

andrewdavidwong commented 1 year ago

Might be related to #8493.

marmarek commented 1 year ago

This should be already fixed by https://github.com/QubesOS/qubes-builder-debian/commit/83fbd33187783994bf3ea982d96fecebc87af188. Which template version do you have, and can you check if you have /etc/initramfs-tools/conf.d/99-template-build.conf file there? If you have it, remove it and regenerate initramfs (update-initramfs).

3hhh commented 1 year ago

On 9/10/23 14:03, Marek Marczykowski-Górecki wrote:

This should be already fixed by https://github.com/QubesOS/qubes-builder-debian/commit/83fbd33187783994bf3ea982d96fecebc87af188. Which template version do you have, and can you check if you have /etc/initramfs-tools/conf.d/99-template-build.conf file there? If you have it, remove it and regenerate initramfs (update-initramfs).

Yes, I had that, updated initramfs via update-initramfs -u, but unfortunately ran into the very same issue afterwards.

marmarek commented 1 year ago

You need to update the initramfs for the kernel version in /boot, not currently running one (from dom0). So, add -k all or something like this.
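A minimal sketch of the two steps suggested so far, meant to be run inside the debian-12 template (not in dom0). The function name is made up for illustration; the config path and the -k all flag are the ones mentioned in this thread.

```shell
# Hypothetical helper: remove the leftover template-build config (if present)
# and regenerate the initramfs for every kernel installed in /boot.
regen_all_initramfs() {
    conf=/etc/initramfs-tools/conf.d/99-template-build.conf
    if [ -e "$conf" ]; then
        sudo rm -- "$conf"
    fi
    # -u updates existing images, -k all covers all installed kernel versions
    sudo update-initramfs -u -k all
}
```

Calling regen_all_initramfs once in the template should be enough; the in-VM kernel only picks up the new images on the next start of the template.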

3hhh commented 1 year ago

On 9/10/23 17:00, Marek Marczykowski-Górecki wrote:

You need to update the initramfs for the kernel version in /boot, not currently running one (from dom0). So, add -k all or something like this.

-k all indeed refreshes all three available initramfs versions and not just the newest one. However it doesn't fix the issue. The error message remains exactly the same as initially reported.

3hhh commented 11 months ago

Still the same with current updates.

I guess the relevant error is error: no such device: /boot/xen/pvboot-x86_64.elf. There is indeed no such file, but debian-11 doesn't have it either, and that works just fine.

grnklod commented 11 months ago

You can try this in debian-12 template:

sudo apt --reinstall install linux-image*
sudo apt install grub2 qubes-kernel-vm-support
sudo grub-install /dev/xvda
sudo update-grub

https://forum.qubes-os.org/t/cannot-boot-to-native-fedora-37-minimal-kernel/15761/6

3hhh commented 11 months ago

No, that unfortunately doesn't help.

Did you get it working?

Maybe I'll try to upgrade the working debian-11 template.

grnklod commented 11 months ago

Maybe you have this issue: https://github.com/QubesOS/qubes-issues/issues/4974 https://github.com/QubesOS/qubes-issues/issues/8465

marmarek commented 11 months ago

Try removing the "quiet" option from /etc/default/grub (and regenerate the grub config after that). Hopefully you'll get more details then.
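One possible way to do that from a shell in the template, as a sketch only. The helper name strip_quiet is hypothetical, it assumes GNU sed (as shipped in Debian), and it assumes "quiet" sits on a GRUB_CMDLINE_LINUX* line of the given file.

```shell
# Hypothetical helper: drop the word "quiet" from the GRUB_CMDLINE_LINUX*
# lines of a grub defaults file, leaving a .bak copy of the original next to it.
strip_quiet() {
    file=${1:-/etc/default/grub}
    sed -i.bak '/^GRUB_CMDLINE_LINUX/ s/\<quiet\>//' "$file"
}
```

Run it as root (the file is not writable otherwise), then regenerate the config with sudo update-grub as suggested further up in this thread.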

3hhh commented 11 months ago

That indeed gets me some more logs, but I don't see anything obvious nonetheless:

[2023-10-18 23:02:32] Logfile Opened
[2023-10-18 23:02:32] .[30m.[47mWelcome to GRUB!
[2023-10-18 23:02:32]
[2023-10-18 23:02:32] .[37m.[40m.[37m.[40m.[37m.[40m.[3;34H [ grub-xen.cfg 424B 100% 10.35KiB/s ].[3;1Herror: no such device: /boot/xen/pvboot-x86_64.elf
[2023-10-18 23:02:37] [ 0.000000] platform_pci_unplug: Xen Platform PCI: unrecognised magic value
[2023-10-18 23:02:37] [ 0.208999] tsc: Fast TSC calibration failed
[2023-10-18 23:02:37] [ 0.209004] tsc: Detected 2594.106 MHz processor
[2023-10-18 23:02:37] [ 0.209276] last_pfn = 0xfa000 max_arch_pfn = 0x400000000
[2023-10-18 23:02:37] [ 0.209353] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[2023-10-18 23:02:37] [ 0.227857] RAMDISK: [mem 0x2e701000-0x33377fff]
[2023-10-18 23:02:37] [ 0.227877] ACPI: Early table checksum verification disabled
[2023-10-18 23:02:37] [ 0.227890] ACPI: RSDP 0x00000000FC008000 000024 (v02 Xen )
[2023-10-18 23:02:37] [ 0.227895] ACPI: XSDT 0x00000000FC007F60 000034 (v01 Xen HVM 00000000 HVML 00000000)
[2023-10-18 23:02:37] [ 0.227902] ACPI: FACP 0x00000000FC007D60 00010C (v05 Xen HVM 00000000 HVML 00000000)
[2023-10-18 23:02:37] [ 0.227908] ACPI: DSDT 0x00000000FC001040 006C9B (v05 Xen HVM 00000000 INTL 20190509)
[2023-10-18 23:02:37] [ 0.227912] ACPI: FACS 0x00000000FC001000 000040
[2023-10-18 23:02:37] [ 0.227915] ACPI: FACS 0x00000000FC001000 000040
[2023-10-18 23:02:37] [ 0.227919] ACPI: APIC 0x00000000FC007E70 00003C (v02 Xen HVM 00000000 HVML 00000000)
[2023-10-18 23:02:37] [ 0.227921] ACPI: Reserving FACP table memory at [mem 0xfc007d60-0xfc007e6b]
[2023-10-18 23:02:37] [ 0.227923] ACPI: Reserving DSDT table memory at [mem 0xfc001040-0xfc007cda]
[2023-10-18 23:02:37] [ 0.227924] ACPI: Reserving FACS table memory at [mem 0xfc001000-0xfc00103f]
[2023-10-18 23:02:37] [ 0.227924] ACPI: Reserving FACS table memory at [mem 0xfc001000-0xfc00103f]
[2023-10-18 23:02:37] [ 0.227925] ACPI: Reserving APIC table memory at [mem 0xfc007e70-0xfc007eab]
[2023-10-18 23:02:37] [ 0.228781] No NUMA configuration found
[2023-10-18 23:02:37] [ 0.228782] Faking a node at [mem 0x0000000000000000-0x00000000f9ffffff]
[2023-10-18 23:02:37] [ 0.228791] NODE_DATA(0) allocated [mem 0xf9fd3000-0xf9ffdfff]
[2023-10-18 23:02:37] [ 0.229090] Zone ranges:
[2023-10-18 23:02:37] [ 0.229095] DMA [mem 0x0000000000001000-0x0000000000ffffff]
[2023-10-18 23:02:37]
[2023-10-18 23:02:37] [ 0.230258] Booting paravirtualized kernel on Xen PVH
[2023-10-18 23:02:37] [ 0.230330] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[2023-10-18 23:02:37] [ 0.239752] setup_percpu: NR_CPUS:8192 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1
[2023-10-18 23:02:37] [ 0.241571] percpu: Embedded 61 pages/cpu s212992 r8192 d28672 u1048576
[2023-10-18 23:02:37] [ 0.241619] PV qspinlock hash table entries: 256 (order: 0, 4096 bytes, linear)
[2023-10-18 23:02:37] [ 0.241638] Fallback order for Node 0: 0
[2023-10-18 23:02:37] [ 0.241642] Built 1 zonelists, mobility grouping on. Total pages: 1007744
[2023-10-18 23:02:37] [ 0.241643] Policy zone: DMA32
[2023-10-18 23:02:37] [ 0.241645] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.1.0-13-amd64 root=/dev/mapper/dmroot ro xen_scrub_pages=0 root=/dev/mapper/dmroot console=tty0 console=hvc0 swiotlb=8192 noresume
[2023-10-18 23:02:37] [ 0.241724] Unknown kernel command line parameters "BOOT_IMAGE=/boot/vmlinuz-6.1.0-13-amd64", will be passed to user space.
[2023-10-18 23:02:37] [ 0.241906] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes, linear)
[2023-10-18 23:02:37] [ 0.241998] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes, linear)
[2023-10-18 23:02:37] [ 0.242080] mem auto-init: stack:all(zero), heap alloc:on, heap free:off
[2023-10-18 23:02:37] [ 0.246618] Memory: 260860K/4095612K available (14342K kernel code, 2329K rwdata, 9132K rodata, 2772K init, 17416K bss, 204668K reserved, 0K cma-reserved)
[2023-10-18 23:02:37] [ 0.247360] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[2023-10-18 23:02:37] [ 0.247415] Kernel/User page tables isolation: enabled
[2023-10-18 23:02:37] [ 0.247518] ftrace: allocating 40153 entries in 157 pages
[2023-10-18 23:02:37] [ 0.254896] ftrace: allocated 157 pages with 5 groups
[2023-10-18 23:02:37] [ 0.255495] Dynamic Preempt: voluntary
[2023-10-18 23:02:37] [ 0.255531] rcu: Preemptible hierarchical RCU implementation.
[2023-10-18 23:02:37] [ 0.255537] rcu: RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=2.
[2023-10-18 23:02:37] [ 0.255538] Trampoline variant of Tasks RCU enabled.
[2023-10-18 23:02:37] [ 0.255539] Rude variant of Tasks RCU enabled.
[2023-10-18 23:02:37] [ 0.255539] Tracing variant of Tasks RCU enabled.
[2023-10-18 23:02:37] [ 0.255544] rcu: RCU calculated value of scheduler-enlistment on
[2023-10-18 23:02:37] [ 0.264436] Spectre V2 : Mitigation: Retpolines
[2023-10-18 23:02:37] [ 0.264451] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[2023-10-18 23:02:37] [ 0.264472] Spectre V2 : Spectre v2 / SpectreRSB : Filling RSB on VMEXIT
[2023-10-18 23:02:37] [ 0.264490] Spectre V2 : Enabling Restricted Speculation for firmware calls
[2023-10-18 23:02:37] [ 0.264509] Spectre V2 : mitigation: Enabling conditional Indirect Branch Prediction Barrier
[2023-10-18 23:02:37] [ 0.264535] Spectre V2 : User space: Mitigation: STIBP via prctl
[2023-10-18 23:02:37] [ 0.264553] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl
[2023-10-18 23:02:37] [ 0.264580] MDS: Mitigation: Clear CPU buffers
[2023-10-18 23:02:37] [ 0.264595] MMIO Stale Data: Unknown: No mitigations
[2023-10-18 23:02:37] [ 0.264630] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[2023-10-18 23:02:37] [ 0.264653] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[2023-10-18 23:02:37] [ 0.264672] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[2023-10-18 23:02:37] [ 0.264709] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[2023-10-18 23:02:37] [ 0.268176] Freeing SMP alternatives memory: 36K
[2023-10-18 23:02:37] [ 0.268176] pid_max: default: 32768 minimum: 301
[2023-10-18 23:02:37] [ 0.268176] LSM: Security Framework initializing
[2023-10-18 23:02:37] [ 0.268176] landlock: Up and running.
[2023-10-18 23:02:37] [ 0.268176] Yama: disabled by default; enable with sysctl kernel.yama.*
[2023-10-18 23:02:37] [ 0.268176] AppArmor: AppArmor initialized
[2023-10-18 23:02:37] [ 0.268176] TOMOYO Linux initialized
[2023-10-18 23:02:37] [ 0.268176] LSM support for eBPF active
[2023-10-18 23:02:37] [ 0.268176] Mount-cache hash table entries: 8192 (order: 4, 65536 bytes, linear)
[2023-10-18 23:02:37] [ 0.268176] Mountpoint-cache hash table entries: 8192 (order: 4, 65536 bytes, linear)
[2023-10-18 23:02:37] [ 0.268176] clocksource: xen: mask: 0xfffffff
[2023-10-18 23:02:38] [ 0.268338] smp: Brought up 1 node, 2 CPUs
[2023-10-18 23:02:38] [ 0.268338] smpboot: Max logical packages: 1
[2023-10-18 23:02:38] [ 0.268338] smpboot: Total of 2 processors activated (10376.42 BogoMIPS)
[2023-10-18 23:02:38] [ 0.281214] node 0 deferred pages initialised in 12ms
[2023-10-18 23:02:38] [ 0.281226] devtmpfs: initialized
[2023-10-18 23:02:38] [ 0.281226] x86/mm: Memory block size: 128MB
[2023-10-18 23:02:38] [ 0.281226] memmap_init_zone_device initialised 32768 pages in 0ms
[2023-10-18 23:02:38] [ 0.281226] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[2023-10-18 23:02:38] [ 0.281226] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
[2023-10-18 23:02:38] [ 0.281226] pinctrl core: initialized pinctrl subsystem
[2023-10-18 23:02:38] [ 0.287287] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[2023-10-18 23:02:38] [ 0.287335] xen:grant_table: Grant tables using version 1 layout
[2023-10-18 23:02:38] [ 0.287501] Grant table initialized
[2023-10-18 23:02:38] [ 0.288341] DMA: preallocated 512 KiB GFP_KERNEL pool for atomic allocations
[2023-10-18 23:02:38] [ 0.288475] DMA: preallocated 512 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
[2023-10-18 23:02:38] [ 0.289244] DMA: preallocated 512 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[2023-10-18 23:02:38] [ 0.289307] audit: initializing netlink subsys (disabled)
[2023-10-18 23:02:38] [ 0.289419] audit: type=2000 audit(1697662957.850:1): state=initialized audit_enabled=0 res=1
[2023-10-18 23:02:38] [ 0.289419] thermal_sys: Registered thermal governor 'fair_share'
[2023-10-18 23:02:38] [ 0.292206] thermal_sys: Registered thermal governor 'bang_bang'
[2023-10-18 23:02:38] [ 0.292228] thermal_sys: Registered thermal governor 'step_wise'
[2023-10-18 23:02:38] [ 0.292247] thermal_sys: Registered thermal governor 'user_space'
[2023-10-18 23:02:38] [ 0.292265] thermal_sys: Registered thermal governor 'power_allocator'
[2023-10-18 23:02:38] [ 0.292307] cpuidle: using governor ladder
[2023-10-18 23:02:38] [ 0.292341] cpuidle: using governor menu
[2023-10-18 23:02:38] [ 0.292497] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[202 9)
[2023-10-18 23:02:38] [ 0.325306] ACPI Error: No handler or method for GPE 06, disabling event (20220331/evgpe-839)
[2023-10-18 23:02:38] [ 0.325341] ACPI Error: No handler or method for GPE 07, disabling event (20220331/evgpe-839)
[2023-10-18 23:02:38] [ 0.339373] xen:balloon: Initialising balloon driver
[2023-10-18 23:02:38] [ 0.339451] iommu: Default domain type: Translated
[2023-10-18 23:02:38] [ 0.339451] iommu: DMA domain TLB invalidation policy: lazy mode
[2023-10-18 23:02:38] [ 0.339451] pps_core: LinuxPPS API ver. 1 registered
[2023-10-18 23:02:38] [ 0.339451] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti @.***>
[2023-10-18 23:02:38] [ 0.339451] PTP clock support registered
[2023-10-18 23:02:38] [ 0.339451] EDAC MC: Ver: 3.0.0
[2023-10-18 23:02:38] [ 0.340370] NetLabel: Initializing
[2023-10-18 23:02:38] [ 0.340386] NetLabel: domain hash size = 128
[2023-10-18 23:02:38] [ 0.340401] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
[2023-10-18 23:02:38] [ 0.340448] NetLabel: unlabeled traffic allowed by default
[2023-10-18 23:02:38] [ 0.340464] PCI: Using ACPI for IRQ routing
[2023-10-18 23:02:38] [ 0.340477] PCI: System does not support PCI
[2023-10-18 23:02:38] [ 0.340509] vgaarb: loaded
[2023-10-18 23:02:38] [ 0.341470] clocksource: Switched to clocksource xen
[2023-10-18 23:02:38] [ 0.359099] VFS: Disk quotas dquot_6.6.0
[2023-10-18 23:02:38] [ 0.359149] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[2023-10-18 23:02:38] [ 0.359430] AppArmor: AppArmor Filesystem Enabled
[2023-10-18 23:02:38] [ 0.359466] pnp: PnP ACPI init
[2023-10-18 23:02:38] [ 0.359530] pnp: PnP ACPI: found 0 devices
[2023-10-18 23:02:38] [ 0.365354] NET: Registered PF_INET protocol family
[2023-10-18 23:02:38] [ 0.365619] IP idents hash table entries: 65536 (order: 7, 524288 bytes, linear)
[2023-10-18 23:02:38] [ 0.376748] tcp_listen_portaddr_hash hash table entries: 2048 (order: 3, 32768 bytes, linear)
[2023-10-18 23:02:38] [ 0.376796] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes, linear)
[2023-10-18 23:02:38] [ 0.376833] TCP established hash table entries: 32768 (order: 6, 262144 bytes, linear)
[2023-10-18 23:02:38] [ 0.376953] TCP bind hash table entries: 32768 (order: 8, 1048576 bytes, linear)
[2023-10-18 23:02:38] [ 0.377082] TCP: Hash tables configured (established 32768 bind 32768)
[2023-10-18 23:02:38] [ 0.377185] MPTCP token hash table entries: 4096 (order: 4, 98304 bytes, linear)
[2023-10-18 23:02:38] [ 0.377238] UDP hash table entries: 2048 (order: 4, 65536 bytes, linear)
[2023-10-18 23:02:38] [ 0.377269] UDP-Lite hash table entries: 2048 (order: 4, 65536 bytes, linear)
[2023-10-18 23:02:38] [ 0.377343] NET: Registered PF_UNIX/PF_LOCAL protocol family
[2023-10-18 23:02:38] [ 0.377370] NET: Registered PF_XDP protocol family
[2023-10-18 23:02:38] [ 0.377387] PCI: CLS 0 bytes, default 64
[2023-10-18 23:02:38] [ 0.377449] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x25647bfab01, max_idle_ns: 440795211785 ns
[2023-10-18 23:02:38] [ 0.377700] Trying to unpack rootfs image as initramfs...
[2023-10-18 23:02:38] [ 0.381096] Initialise system trusted keyrings
[2023-10-18 23:02:38] [ 0.381138] Key type blacklist registered
[2023-10-18 23:02:38] [ 0.381231] workingset: timestamp_bits=36 max_order=20 bucket_order=0
[2023-10-18 23:02:38] [ 0.383672] zbud: loaded
[2023-10-18 23:02:38] [ 0.385650] integrity: Platform Keyring initialized
[2023-10-18 23:02:38] [ 0.385678] integrity: Machine keyring initialized
[2023-10-18 23:02:38] [ 0.385698] Key type asymmetric registered
[2023-10-18 23:02:38] [ 0.385714] Asymmetric key parser 'x509' registered

(it ends there)

marmarek commented 11 months ago

Can you check xl dmesg (or /var/log/xen/console/hypervisor.log) around that time? Maybe the VM was killed by Xen for some reason.
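A small sketch of how that check could look in dom0. The function name is invented for illustration; it simply keeps the hypervisor log lines that mention a crash, whichever of the two sources they come from.

```shell
# Hypothetical filter: read a Xen hypervisor log on stdin and print only
# the lines that mention a crash (case-insensitive).
xen_crash_lines() {
    grep -i 'crash'
}
```

Typical use would be xl dmesg | xen_crash_lines, or xen_crash_lines < /var/log/xen/console/hypervisor.log; both probably need root in dom0.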

3hhh commented 11 months ago

Looks like you got me on the right lead:

(XEN) Domain 27 (vcpu#1) crashed on cpu#2:
(XEN) ----[ Xen-4.14.6  x86_64  debug=n   Not tainted ]----
(XEN) CPU:    2
(XEN) RIP:    0010:[<ffffffff95bd88b7>]
(XEN) RFLAGS: 0000000000010246   CONTEXT: hvm guest (d27v1)
(XEN) rax: 0000000000000000   rbx: 0000000000000001   rcx: 0000000000001000
(XEN) rdx: ffffda63c03fa340   rsi: ffffda63c03fa380   rdi: ffff89394fe8d000
(XEN) rbp: ffff9c38400e39b8   rsp: ffff9c38400e38b0   r8:  0000000000000000
(XEN) r9:  00000000000dd61d   r10: ffff893a35d37740   r11: 0000000000000018
(XEN) r12: 0000000000000000   r13: ffff893a35d37740   r14: ffff893a39fd3600
(XEN) r15: 0000000000000282   cr0: 0000000080050033   cr4: 00000000001706e0
(XEN) cr3: 000000006a010001   cr2: 0000000000000000
(XEN) fsb: 0000000000000000   gsb: ffff893a35d00000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: 0010
(XEN) p2m_pod_demand_populate: Dom27 out of PoD memory! (tot=102416 ents=921600 dom27)
(XEN) domain_crash called from p2m-pod.c:1254
(XEN) p2m_pod_demand_populate: Dom27 out of PoD memory! (tot=102416 ents=921600 dom27)
(XEN) domain_crash called from p2m-pod.c:1254
(XEN) p2m_pod_demand_populate: Dom27 out of PoD memory! (tot=102416 ents=921600 dom27)
(XEN) domain_crash called from p2m-pod.c:1254

So this is an instance of #7023.

The debian-12 VM was configured to use the default 400MB - 4GB memory balancing and 2 vcpus. The in-VM kernel is 6.1.0-13.
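The counters in the Xen message line up with those settings. A quick back-of-the-envelope check, assuming 4 KiB pages and assuming tot is the number of pages the domain currently holds while ents is the number of still-unpopulated PoD entries (my reading of the message, not confirmed in this thread):

```shell
# Values copied from the "out of PoD memory" line above.
tot=102416      # pages currently held by the domain (assumed meaning)
ents=921600     # outstanding populate-on-demand entries (assumed meaning)
page_kib=4      # assumed page size in KiB

tot_mib=$(( tot * page_kib / 1024 ))
ents_mib=$(( ents * page_kib / 1024 ))
sum_mib=$(( (tot + ents) * page_kib / 1024 ))

echo "populated: ${tot_mib} MiB, outstanding PoD entries: ${ents_mib} MiB, sum: ${sum_mib} MiB"
# prints "populated: 400 MiB, outstanding PoD entries: 3600 MiB, sum: 4000 MiB"
```

So the domain had roughly its 400MB initial allocation populated, with the rest of the 4GB maximum still pending, when the guest touched more memory than Xen could back.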

3hhh commented 11 months ago

Starting it with 1GB fixed RAM works.
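For reference, a sketch of how such a fixed allocation could be set from dom0 with qvm-prefs. The helper name is made up, and it assumes that setting maxmem to 0 disables memory balancing for the qube; please check qvm-prefs --help on your release before relying on that.

```shell
# Hypothetical helper: give a qube a fixed amount of RAM (in MiB).
set_fixed_memory() {
    vm=$1
    mib=$2
    qvm-prefs "$vm" maxmem 0        # assumption: 0 turns memory balancing off
    qvm-prefs "$vm" memory "$mib"   # initial (and now fixed) memory
}
```

Example: set_fixed_memory debian-12 1024, where 1024 is just one reading of "1GB".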

3hhh commented 11 months ago

So a debian upstream issue apparently...

adrelanos commented 10 months ago

(XEN) p2m_pod_demand_populate: Dom27 out of PoD memory! (tot=102416 ents=921600 dom27)

I had a similar issue and could fix it by increasing the initial memory setting in Qubes VM Manager. See: https://github.com/QubesOS/qubes-issues/issues/8649

grnklod commented 10 months ago

I've stumbled upon this myself and traced it to a problem with the debian-12 template and the max memory value. If I create a qube based on the debian-12 template with the pvgrub2-pvh kernel, PVH mode, memory balancing enabled, and max memory set to 3069-4031 MB, the qube fails to start. If I set max memory to any other value, it works. If I change the qube's template to debian-12-xfce, it works with any max memory value.

3hhh commented 9 months ago

adrelanos wrote:

(XEN) p2m_pod_demand_populate: Dom27 out of PoD memory! (tot=102416 ents=921600 dom27)

I had a similar issue and could fix it by increasing the initial memory setting in Qubes VM Manager. See: #8649

Yes, that works, too.

The bug still exists with the newest debian-12 kernel btw.

I also wonder why debian-12 has so much higher memory requirements than debian-11. A memory footprint difference of more than 100MB per VM isn't nice when it comes to 20-50 VMs.

marmarek commented 4 months ago

With memory hotplug in R4.2, populate-on-demand (PoD) is not used anymore, so a crash on start due to too little memory (the way it happened here) shouldn't occur anymore. The feature isn't in R4.1, but since support for R4.1 ends soon and there is a simple workaround, I don't think it's worth fixing in another way.