oxidecomputer / propolis

VMM userspace for illumos bhyve
Mozilla Public License 2.0

helios guest panicked booting from a crucible-backed nvme disk #300

Closed jordanhendricks closed 1 year ago

jordanhendricks commented 1 year ago

I attempted to boot a helios guest from a crucible boot disk using the nvme driver, and both propolis and the helios guest panicked. The guest panic looked very similar to #207.

I didn't debug this much, but both panics went away when I switched the driver from nvme to virtio, so if I were debugging this I would start with the nvme emulation.

Capturing some notes about what I saw below.

Propolis panic

The panic was an unhandled VM exit for instruction emulation:

Jan 10 06:21:06.283 INFO wrmsr, value: 70368744177664, msr: 3221291039, vcpu: 0, component: vcpu_tasks
thread 'vcpu-1' panicked at 'vCPU 1: Unhandled VM exit: InstEmul(InstEmul { inst_data: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], len: 0 })', bin/propolis-server/src/lib/vm/mod.rs:450:9
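The `value` and `msr` fields in the wrmsr log line above are printed in decimal, which makes them hard to recognize. A minimal sketch that converts them to hex and lists which bits of the value are set; the helper name `describe_wrmsr` is hypothetical and not part of propolis:

```python
def describe_wrmsr(msr: int, value: int) -> dict:
    """Render a logged wrmsr in hex and list the set bits of the value.

    `describe_wrmsr` is an illustrative helper, not a propolis API.
    """
    # Collect the index of every bit that is set in the 64-bit value.
    set_bits = [bit for bit in range(64) if (value >> bit) & 1]
    return {
        "msr_hex": f"{msr:#010x}",      # 0x + 8 hex digits
        "value_hex": f"{value:#018x}",  # 0x + 16 hex digits
        "set_bits": set_bits,
    }


if __name__ == "__main__":
    # Decimal values copied from the log line above.
    print(describe_wrmsr(msr=3221291039, value=70368744177664))
```

For the logged write this gives MSR 0xc001001f, which falls in the 0xC001_xxxx range used for AMD-specific MSRs, with only bit 46 of the value set.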

This was the state of vcpu 1:

bhyvectl output

```
$ pfexec bhyvectl --cpu=1 --get-all --vm=acc93371-742f-4f56-878e-f2ce202abcd7 > vcpu1.out
ID  Length  Name
0   1024MB  sysmem
1   2048KB  bootrom

Address   Length  Segment  Offset  Prot  Flags
0         1024MB  sysmem   0       RWX
FFE00000  2048KB  bootrom  0       R-X

efer[1] 0x0000000000001800
cr0[1] 0x0000000000010033
cr2[1] 0x0000000000000000
cr3[1] 0x000000003fc01000
cr4[1] 0x0000000000000640
dr0[1] 0x0000000000000000
dr1[1] 0x0000000000000000
dr2[1] 0x0000000000000000
dr3[1] 0x0000000000000000
dr6[1] 0x00000000ffff0ff0
dr7[1] 0x0000000000000400
rsp[1] 0x000000003fb6df5e
rip[1] 0x000000000000e01a
rax[1] 0x0000000000002186
rbx[1] 0x0000000000000000
rcx[1] 0x0000000000000000
rdx[1] 0x000000000000fffe
rsi[1] 0x000000003fb6e01a
rdi[1] 0x000000003f4a56e8
rbp[1] 0x000000003f4a4004
r8[1] 0x0000000000000010
r9[1] 0x000000003fb6df80
r10[1] 0x0000000000000000
r11[1] 0x0000000000000000
r12[1] 0x0000000000000000
r13[1] 0x0000000000000001
r14[1] 0x0000000000000001
r15[1] 0x000000003ffc4f78
rflags[1] 0x0000000000000092
ds desc[1] 0x0000000000000000/0xffffffff/0x0000c093
es desc[1] 0x0000000000000000/0x00000000/0x00010000
fs desc[1] 0x0000000000000000/0xffffffff/0x0000c093
gs desc[1] 0x0000000000000000/0xffffffff/0x0000c093
ss desc[1] 0x0000000000000000/0xffffffff/0x0000c093
cs desc[1] 0x0000000000000000/0xffffffff/0x0000209b
tr desc[1] 0x0000000000000000/0x0000ffff/0x0000008b
ldtr desc[1] 0x0000000000000000/0x0000ffff/0x00000082
gdtr[1] 0x000000003f9eea98/0x00000047
idtr[1] 0x000000003f4a4250/0x00000fff
cs[1] 0x0038
ds[1] 0x0030
es[1] 0x0000
fs[1] 0x0030
gs[1] 0x0030
ss[1] 0x0030
tr[1] 0x0000
ldtr[1] 0x0000
fpu_fcw[1] 0x037f
fpu_fsw[1] 0x2802
fpu_ftw[1] 0x00e0
fpu_fop[1] 0x0000
fpu_rip[1] 0x0000000000000000
fpu_rdp[1] 0x0000000000000000
fpu_mxcsr[1] 0x00001f80
fpu_mxcsr_mask[1] 0x0002ffff
fpu_st1[0] 0x0000000087fc00000000400c00000000
fpu_st1[1] 0x00000000a60000000000400600000000
fpu_st1[2] 0x00530000bf72000000003bfc00000000
fpu_st1[3] 0x00000000000000000000000000000000
fpu_st1[4] 0x00000000000000000000000000000000
fpu_st1[5] 0x00000000000000000000000000000000
fpu_st1[6] 0x00000000000000000000000000000000
fpu_st1[7] 0x00000000000000000000000000000000
fpu_xmm0[1] 0x00000000000000000000000000000000
fpu_xmm1[1] 0x00000000000000000000000000000000
fpu_xmm2[1] 0x00000000000000000000000000000000
fpu_xmm3[1] 0x00000000000000000000000000000000
fpu_xmm4[1] 0x00000000000000000000000000000000
fpu_xmm5[1] 0x00000000000000000000000000000000
fpu_xmm6[1] 0x00000000000000000000000000000000
fpu_xmm7[1] 0x00000000000000000000000000000000
fpu_xmm8[1] 0x00000000000000000000000000000000
fpu_xmm9[1] 0x00000000000000000000000000000000
fpu_xmm10[1] 0x00000000000000000000000000000000
fpu_xmm11[1] 0x00000000000000000000000000000000
fpu_xmm12[1] 0x00000000000000000000000000000000
fpu_xmm13[1] 0x00000000000000000000000000000000
fpu_xmm14[1] 0x00000000000000000000000000000000
fpu_xmm15[1] 0x00000000000000000000000000000000
fpu_ymm0[1] 0x00000000000000000000000000000000
fpu_ymm1[1] 0x00000000000000000000000000000000
fpu_ymm2[1] 0x00000000000000000000000000000000
fpu_ymm3[1] 0x00000000000000000000000000000000
fpu_ymm4[1] 0x00000000000000000000000000000000
fpu_ymm5[1] 0x00000000000000000000000000000000
fpu_ymm6[1] 0x00000000000000000000000000000000
fpu_ymm7[1] 0x00000000000000000000000000000000
fpu_ymm8[1] 0x00000000000000000000000000000000
fpu_ymm9[1] 0x00000000000000000000000000000000
fpu_ymm10[1] 0x00000000000000000000000000000000
fpu_ymm11[1] 0x00000000000000000000000000000000
fpu_ymm12[1] 0x00000000000000000000000000000000
fpu_ymm13[1] 0x00000000000000000000000000000000
fpu_ymm14[1] 0x00000000000000000000000000000000
fpu_ymm15[1] 0x00000000000000000000000000000000
cr_intercept[1] 0x00000000
dr_intercept[1] 0x00000000
exc_intercept[1] 0x00000000
inst1_intercept[1] 0x00000000
inst2_intercept[1] 0x00000000
TLB ctrl[1] 0x0000000000000000
exitinfo1[1] 0x0000000000000000
exitinfo2[1] 0x0000000000000000
exitintinfo[1] 0x0000000000000000
v_irq/tpr[1] 0x0000000000000000
AVIC apic_bar[1] 0x0000000000000000
AVIC backing page[1] 0x0000000000000000
AVIC logical table[1] 0x0000000000000000
AVIC physical table[1] 0x0000000000000000
x2apic_state[1] 0
rvi/npt[1] 0x0000000000000000
exception_bitmap[1] 0
io_bitmap[1] 0
tsc_offset[1] 0x0000000000000000
msr_bitmap[1] 0
asid[1] 0x0000
msr[MSR_EFER] = 1800
msr[MSR_KGSBASE] = 0
msr[MSR_STAR] = 0
msr[MSR_LSTAR] = 0
msr[MSR_CSTAR] = 0
msr[MSR_SF_MASK] = 0
msr[MSR_SYSENTER_CS_MSR] = 0
msr[MSR_SYSENTER_ESP_MSR] = 0
msr[MSR_SYSENTER_EIP_MSR] = 0
msr[MSR_PAT] = 70406
msr[MSR_TSC (offset from system boot)] = 0
msr[MSR_MTRRcap] = 50a
msr[MSR_MTRRdefType] = c06
msr[00000268] = 0
msr[00000269] = 0
msr[0000026a] = 0
msr[0000026b] = 0
msr[0000026c] = 0
msr[0000026d] = 0
msr[0000026e] = 0
msr[0000026f] = 0
msr[00000258] = 6060606
msr[00000259] = 0
msr[00000250] = 6060606
msr[00000200] = 80000000
msr[00000201] = 80000800
msr[00000202] = 0
msr[00000203] = 800
msr[00000204] = 0
msr[00000205] = 0
msr[00000206] = 0
msr[00000207] = 0
msr[00000208] = 0
msr[00000209] = 0
msr[0000020a] = 0
msr[0000020b] = 0
msr[0000020c] = 0
msr[0000020d] = 0
msr[0000020e] = 0
msr[0000020f] = 0
msr[00000210] = 0
msr[00000211] = 0
msr[00000212] = 0
msr[00000213] = 0
exit_reason[1] 0
rtc nvram[000]: 0x18
rtc time 0x63c09f8a: Fri Jan 13 00:02:18 2023
Capability "hlt_exit" is set on vcpu 1
Capability "mtrap_exit" is not available
Capability "pause_exit" is not set on vcpu 1
Capability "enable_invpcid" is not available
Capability "bpt_exit" is not available
active cpus: 0, 1, 2, 3
suspended cpus: (none)
pending: n/a
current: n/a
vcpu1 stats:
number of NMIs delivered to vcpu    1
number of ExtINTs delivered to vcpu    0
Resident memory    0
vcpu migration across host cpus    1
total number of vm exits    935
vm exits due to external interrupt    0
number of times hlt was intercepted    6
number of times %cr access was intercepted    0
number of times rdmsr was intercepted    288
number of times wrmsr was intercepted    109
number of monitor trap exits    0
number of times pause was intercepted    0
vm exits due to interrupt window opening    0
vm exits due to nmi window opening    0
number of times in/out was intercepted    1
number of times cpuid was intercepted    363
vm exits due to nested page fault    0
vm exits for mmio emulation    121
number of vm exits for unknown reason    0
number of times astpending at exit    0
number of times idle requested at exit    0
number of vm exits due to exceptions    0
number of vm exits due to run_state change    6
EOI without any in-service interrupt    0
error interrupts generated by vlapic    0
timer interrupts generated by vlapic    0
corrected machine check interrupts generated by vlapic    0
lvts triggered[0]    0
lvts triggered[1]    0
lvts triggered[2]    0
lvts triggered[3]    0
lvts triggered[4]    0
lvts triggered[5]    0
ipis sent from vcpu    0
ipis received by vcpu    0
cpu_topology: sockets=1, cores=1, threads=1, maxcpus=32
```
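The raw `cr0` and `efer` values in the dump above are easier to reason about as named flags. A rough sketch that pulls `name[N] 0x...` pairs out of the text and decodes the two registers using the standard x86-64 bit positions; `parse_registers` and `decode_flags` are hypothetical helper names, and the regex is an assumption that only suits the simple single-value lines (not the multi-word `desc` or `AVIC` entries):

```python
import re

# Architectural bit positions for CR0 and EFER (AMD64).
CR0_BITS = {0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
            16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG"}
EFER_BITS = {0: "SCE", 8: "LME", 10: "LMA", 11: "NXE", 12: "SVME",
             13: "LMSLE", 14: "FFXSR", 15: "TCE"}


def parse_registers(text: str) -> dict:
    """Extract `name[N] 0x<hex>` pairs; later duplicates overwrite earlier ones."""
    pairs = re.findall(r"(\w+)\[\d+\]\s+(0x[0-9a-fA-F]+)", text)
    return {name: int(value, 16) for name, value in pairs}


def decode_flags(value: int, names: dict) -> list:
    """Return the names of all set bits, falling back to `bitN` if unnamed."""
    return [names.get(bit, f"bit{bit}")
            for bit in range(64) if (value >> bit) & 1]


if __name__ == "__main__":
    # Excerpt copied from the bhyvectl output above.
    sample = "efer[1] 0x0000000000001800 cr0[1] 0x0000000000010033"
    regs = parse_registers(sample)
    print("cr0 :", decode_flags(regs["cr0"], CR0_BITS))
    print("efer:", decode_flags(regs["efer"], EFER_BITS))
```

This only names the bits; it does not attempt to explain why the vCPU was in that state when the exit occurred.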

Helios guest panic

This was the serial console output from the guest panic:

Loading /platform/i86pc/amd64/boot_archive...
Loading /platform/i86pc/amd64/boot_archive.hash...
Booting...
Oxide Helios Version helios-1.0.21408 64-bit
NOTICE: Performing full ZFS device scan!
NOTICE: Cannot read the pool label from '/pseudo/lofi@1:b'
NOTICE: spa_import_rootpool: error 5
Cannot mount root on /pseudo/lofi@1:b fstype zfs

panic[cpu0]/thread=fffffffffbc4a060: vfs_mountroot: cannot mount root

Warning - stack not written to the dump buffer
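The `spa_import_rootpool: error 5` line in the console output above reports a numeric error code. Assuming it is a standard errno value (an assumption, though illumos kernel routines commonly return errnos), it can be mapped to its symbolic name with a small sketch like this one; `errno_name` is a hypothetical helper:

```python
import errno
import re


def errno_name(line: str):
    """Return the symbolic errno name for a 'spa_import_rootpool: error N' line.

    Returns None when the line does not match or the number is unknown.
    """
    match = re.search(r"spa_import_rootpool: error (\d+)", line)
    if match is None:
        return None
    return errno.errorcode.get(int(match.group(1)))


if __name__ == "__main__":
    # Line copied from the guest serial console output above.
    print(errno_name("NOTICE: spa_import_rootpool: error 5"))
```

Under that assumption, error 5 is EIO, an I/O error, which fits the guest failing to read the pool label from the boot disk.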

It seems like zfs couldn't find the disk, which is odd because it's the same disk that it's booting from. I poked around in kmdb (by escaping to the boot loader after a reboot, then running boot -k) and observed the following:

jordanhendricks commented 1 year ago

It seems likely that the guest panic is related to https://github.com/oxidecomputer/stlouis/issues/417.

jordanhendricks commented 1 year ago

With the above stlouis issue resolved (as well as several other bugs found by @citrus-it in getting helios to boot on propolis + nvme), I am pretty sure we can close this.

I successfully booted a helios-2.0.22094 image today using a crucible boot disk (nvme driver) on a gimlet to confirm.