oxidecomputer / propolis

VMM userspace for illumos bhyve
Mozilla Public License 2.0

panicked at 'vCPU 3: Unhandled VM exit: Paging(4294967292, 2)' #340

Open jmpesp opened 1 year ago

jmpesp commented 1 year ago

While testing for https://github.com/oxidecomputer/propolis/issues/333, I built a new Helios image using the tools in helios-engvm and booted it this morning. Propolis hit a different panic:

Mar 10 14:43:33.848 INFO rdmsr, msr: 3221291039, vcpu: 0, component: vcpu_tasks
Mar 10 14:43:33.849 INFO wrmsr, value: 70368744177664, msr: 3221291039, vcpu: 0, component: vcpu_tasks
thread 'vcpu-3' panicked at 'vCPU 3: Unhandled VM exit: Paging(4294967292, 2)', bin/propolis-server/src/lib/vm/mod.rs:436:9
stack backtrace:
   0:          0x1cddb9c - std::backtrace_rs::backtrace::libunwind::trace::h9b534fc6095f520f
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:          0x1cddb9c - std::backtrace_rs::backtrace::trace_unsynchronized::hfcd8a890f991a10d
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:          0x1cddb9c - std::sys_common::backtrace::_print_fmt::hdf432e5f0940fa9c
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/sys_common/backtrace.rs:65:5
   3:          0x1cddb9c - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h5693a803500ade80
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/sys_common/backtrace.rs:44:22
   4:          0x1d352ca - core::fmt::write::h94f582bdfeeeb1a6
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/core/src/fmt/mod.rs:1209:17
   5:          0x1cce984 - std::io::Write::write_fmt::hd3f00507505f204d
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/io/mod.rs:1682:15
   6:          0x1cdd970 - std::sys_common::backtrace::_print::h21d778336866aef0
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/sys_common/backtrace.rs:47:5
   7:          0x1cdd970 - std::sys_common::backtrace::print::h8238ed32ff302292
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/sys_common/backtrace.rs:34:9
   8:          0x1ce06d6 - std::panicking::default_hook::{{closure}}::h933e4cf16faea6e2
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/panicking.rs:267:22
   9:          0x1ce036c - std::panicking::default_hook::h9576f50ee054cc3c
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/panicking.rs:286:9
  10:          0x1ce0ffc - std::panicking::rust_panic_with_hook::hb09aca5ad1ad2161
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/panicking.rs:688:13
  11:          0x1ce0d75 - std::panicking::begin_panic_handler::{{closure}}::h189c984e260f383e
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/panicking.rs:579:13
  12:          0x1cde040 - std::sys_common::backtrace::__rust_end_short_backtrace::hcab55adb02d49332
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/sys_common/backtrace.rs:137:18
  13:          0x1ce0aa1 - rust_begin_unwind
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/panicking.rs:575:5
  14:          0x1d31c53 - core::panicking::panic_fmt::hc653fbe903263e77
                               at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/core/src/panicking.rs:65:14
  15:           0xf3c3f8 - propolis_server::vm::SharedVmState::unhandled_vm_exit::h4a61df6e3da484d8
[ Mar 10 06:43:37 Stopping because all processes in service exited. ]
[ Mar 10 06:43:37 Executing stop method (:kill). ]
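
The log prints the MSR number, the written value, and the first `Paging` field in decimal, which makes them hard to recognize. Below is a minimal sketch that converts them to hex. The constant names and the `set_bits` helper are hypothetical, and reading the first `Paging` field as a guest-physical address is an assumption, not something the log confirms.

```rust
// Values copied verbatim from the log lines above.
// The constant names are illustrative only.
const MSR: u32 = 3221291039;
const WRMSR_VALUE: u64 = 70368744177664;
// First field of `Paging(4294967292, 2)`; assumed to be an address.
const PAGING_FIELD: u64 = 4294967292;

/// Returns the positions of all bits set in `v`, lowest first.
fn set_bits(v: u64) -> Vec<u32> {
    (0..64).filter(|b| v & (1u64 << b) != 0).collect()
}

fn main() {
    println!("msr          = {:#x}", MSR);
    println!(
        "wrmsr value  = {:#x} (bits set: {:?})",
        WRMSR_VALUE,
        set_bits(WRMSR_VALUE)
    );
    println!("paging field = {:#x}", PAGING_FIELD);
}
```

In hex, the MSR is `0xc001001f`, the written value has only bit 46 set, and the `Paging` field is `0xfffffffc`, four bytes below the 4 GiB boundary.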

The serial console showed:

Loading unix...
Loading /platform/i86pc/amd64/boot_archive...
Loading /platform/i86pc/amd64/boot_archive.hash...
Booting...
Oxide Helios Version helios-1.0.21472 64-bit
NOTICE: Performing full ZFS device scan!
NOTICE: Cannot read the pool label from '/pseudo/lofi@1:b'
NOTICE: spa_import_rootpool: error 5
Cannot mount root on /pseudo/lofi@1:b fstype zfs

panic[cpu0]/thread=fffffffffbc4a060: vfs_mountroot: cannot mount root

Warning - stack not written to the dump buffer
fffffffffbc8a2d0 fffffffffbae5897 ()
fffffffffbc8a310 genunix:main+137 ()
fffffffffbc8a320 unix:_locore_start+88 ()

skipping system dump - no dump device configured
rebooting...

leftwo commented 1 year ago

I saw this same panic many months ago when using Crucible as an NVMe device. What is in your Propolis config TOML files for this setup, or are you seeing this through Omicron?

jmpesp commented 1 year ago

This is through Omicron, so it's using an NVMe device.

gjcolombo commented 1 year ago

This is possibly the same as #300, except that this issue has a different unhandled exit type.