machyve / xhyve

xhyve, a lightweight OS X virtualization solution

Running OpenBSD on xhyve does not work #179

Closed mikroskeem closed 3 years ago

mikroskeem commented 4 years ago

How to reproduce:

1) Build xhyve from git (I built from 1f46a3d0bbeb6c90883f302425844fcc3800a776)
2) Grab OpenBSD miniroot66.fs and bhyve UEFI from FreeBSD
3) Use this script to boot xhyve
4) Connect to VNC
5) See that VNC is dark and won't display anything
6) See xhyve spit out output to stdout:

probing: pc0 com0 com1 mem[640K 1513M 16M 4M 64K]
disk: hd0 hd1*
>> OpenBSD/amd64 BOOTX64 3.46
boot>
cannot open hd0a:/etc/random.seed: No such file or directory
booting hd0a:/bsd: 3732171+1537024+3885432+0+598016=0x94f360
entry point at 0x1001000
rdmsr to register 0xc80 on vcpu 0
XHYVE: vlapic callout at 0x2522.0x4652049175f684e9, expected at 0x2522.#4670e59cbdadf577

7) xhyve exits

Am I doing something wrong?

mikroskeem commented 4 years ago

Also, I'm running macOS Catalina 10.15 on a 2018 MacBook Pro, if that info is relevant.

trevormeier commented 4 years ago

Were you able to get this working @mikroskeem ?

mikroskeem commented 4 years ago

No.

adaugherity commented 4 years ago

At first I thought it was this upstream bhyve issue -- before that bugfix, it didn't properly emulate an instruction sequence generated by LLVM 8 (the compiler OpenBSD uses for 6.6), but I don't think that's actually the problem here, since I don't see a "Failed to emulate instruction" message. Perhaps once the MSR 0xc80 is implemented in xhyve so -w isn't required, it will go further down that codepath and that fix will be required also.

Without -w, OpenBSD panics:

cpu0: 256KB 64b/line 8-way L2 cache
rdmsr to register 0xc80 on vcpu 0
kernel: protection fault trap, code=0
Stopped at      identifycpu+0x9f2:      rdmsr
ddb>

I haven't been able to get VNC working properly with any xhyve guest -- what VNC clients are recommended? The built-in Screen Sharing fails to connect, although it does enough to kick off bootup (which waits for a VNC connection), and TigerVNC has major problems: the keyboard doesn't work properly, and the display is shown at 1x in the lower quarter of the VNC window (probably a HiDPI issue).

Using the serial console (no fbuf device, and remember to set tty com0 at the OpenBSD bootloader prompt), e1000, and the aforementioned -w, both 6.5 and 6.6 work fine for me under macOS 10.14.6, although OpenBSD only finds one CPU, even with bsd.mp.

mikroskeem commented 4 years ago

So you got it that far, interesting. Can you provide an exact command line so I could try it out myself?

adaugherity commented 4 years ago

This is my command line for installation (xhyve 1dd9a51):

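UUID=DFCF1091-7EB7-...        # uuidgen(1)
FIRMWARE=~/BHYVE_UEFI.fd

# -w: ignore accesses to unimplemented MSRs (works around the rdmsr 0xc80 failure)
# $HOME is used in the disk path because ~ is not expanded in the middle of a word
sudo xhyve \
    -HP -U "$UUID" -w \
    -m 1G \
    -c 2 \
    -s 0:0,amd_hostbridge \
    -s 2:0,virtio-blk,"$HOME/Downloads/BSD/miniroot66.fs" \
    -s 2:1,virtio-blk,obsd.img \
    -s 4,e1000 \
    -s 31,lpc \
    -l com1,stdio \
    -l bootrom,"$FIRMWARE"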

Setting the serial console in the OpenBSD boot loader (set tty com0) is required; without that, xhyve exits shortly after boot, as in your report.

For normal booting, I replace both virtio-blk lines with -s 2:0,virtio-blk,obsd.img and have the serial console configured in /etc/boot.conf.
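The switch between the installer and a normal boot could be scripted along these lines. This is only a sketch: the function name disk_args and the MINIROOT/DISK variables are made up for illustration, and the paths are the ones from the command above.

```shell
#!/bin/sh
# Sketch: print the xhyve disk options for an install boot vs. a normal boot.
# MINIROOT and DISK are illustrative names; override them to match your files.
MINIROOT=${MINIROOT:-$HOME/Downloads/BSD/miniroot66.fs}
DISK=${DISK:-obsd.img}

disk_args() {
    case "${1:-run}" in
        # Installer image first, target disk second
        install) echo "-s 2:0,virtio-blk,$MINIROOT -s 2:1,virtio-blk,$DISK" ;;
        # Only the installed disk
        run)     echo "-s 2:0,virtio-blk,$DISK" ;;
        *)       echo "usage: disk_args install|run" >&2; return 1 ;;
    esac
}
```

Usage would be something like DISKS=$(disk_args install), then passing $DISKS unquoted to xhyve so it splits into separate arguments. That only works if the paths contain no spaces.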

virtio-net crashes immediately (#164) so I use e1000 instead.

Interestingly the -A option seems to make no difference, as the acpidump output in OpenBSD is identical with or without it.

mikroskeem commented 4 years ago

Networking does not seem to work for me at all. Any ideas why?

adaugherity commented 4 years ago

No idea... does it work in other xhyve guests (e.g. Linux)?

I just submitted #182 which fixes the rdmsr error and allows running without -w.

I also discovered that with a VNC framebuffer enabled, setting OpenBSD to a lower resolution via machine gop N avoids xhyve crashing after loading the installer kernel. OpenBSD's efifb defaults to the highest possible resolution (1920x1200 for me); I set it to 1024x768 instead and that helped. In fact, every resolution up to 1280x720 worked, but 1280x1024 and above did not.

The serial console works much better for me though.
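To avoid typing these at the boot> prompt on every boot, the same commands can go into /etc/boot.conf inside the guest. A rough sketch follows; the helper name boot_conf_lines is hypothetical, and the GOP mode number has to be taken from whatever the boot loader lists on your setup.

```shell
#!/bin/sh
# Sketch: emit boot.conf lines for the two setups discussed in this thread.
#   boot_conf_lines serial      -> serial console on com0
#   boot_conf_lines gop MODE    -> framebuffer at GOP mode number MODE
boot_conf_lines() {
    case "${1:-}" in
        serial)
            echo "set tty com0" ;;
        gop)
            # MODE is whichever mode number maps to the resolution you want
            if [ -z "${2:-}" ]; then
                echo "usage: boot_conf_lines gop MODE" >&2
                return 1
            fi
            echo "machine gop $2" ;;
        *)
            echo "usage: boot_conf_lines serial | gop MODE" >&2
            return 1 ;;
    esac
}
```

Inside the guest, as root, this would be used as boot_conf_lines serial >> /etc/boot.conf. It assumes boot.conf accepts the same commands as the interactive prompt.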

adaugherity commented 4 years ago

> At first I thought it was this upstream bhyve issue -- before that bugfix, it didn't properly emulate an instruction sequence generated by LLVM 8 (the compiler OpenBSD uses for 6.6), but I don't think that's actually the problem here, since I don't see a "Failed to emulate instruction" message. Perhaps once the MSR 0xc80 is implemented in xhyve so -w isn't required, it will go further down that codepath and that fix will be required also.

That issue does indeed affect xhyve, and it's in fact unrelated to the rdmsr issue. It didn't appear at first, because in the default state, OpenBSD only detects a single CPU (even when using bsd.mp), which apparently does not hit this code path.

However, if I disable ACPI support in OpenBSD, it uses the mpbios tables and successfully finds all virtual CPUs. (I'm not yet sure whether this is a bug in xhyve or in OpenBSD.) OpenBSD 6.5 works properly with multiple CPUs and ACPI disabled, but 6.6 hits this bug:

>> OpenBSD/amd64 BOOTX64 3.46
boot> -c
booting hd0a:/bsd: 12744008+2941968+340000+0+708608 [989400+128+1013016+740763]=0x1296010
entry point at 0x1001000
[ using 2744336 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2019 OpenBSD. All rights reserved.  https://www.OpenBSD.org

OpenBSD 6.6 (GENERIC.MP) #372: Sat Oct 12 10:56:27 MDT 2019
    deraadt@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 1056280576 (1007MB)
avail mem = 1011605504 (964MB)
User Kernel Config
UKC> disable acpi
445 acpi0 disabled
UKC> quit
Continuing...
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0x3fb5a000 (10 entries)
bios0: vendor BHYVE version "1.00" date 03/14/2014
bios0: bhyve BHYVE
acpi at bios0 not configured
mpbios0 at bios0: Intel MP Specification 1.4
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, 4008.70 MHz, 06-5e-03
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,PBE,SSE3,PCLMUL,DTES64,DS-CPL,SSSE3,SDBG,FMA3,CX16,xTPR,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,ITSC,FSGSBASE,BMI1,HLE,AVX2,BMI2,ERMS,RTM,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: CPU supports MTRRs but not enabled by BIOS
cpu0: apic clock running at 24MHz
cpu1 at mainbus0: apid 1 (application processor)
Failed to emulate instruction [0xf7 0x04 0x25 0x00 0x33 0xf0 0x81 0x00 0x10 0x00 0x00] at 0xffffffff8192fc00
./xhyverun-obsd66.sh: line 55: 29275 Abort trap: 6           sudo $XHYVE -AHP -U $UUID -m 1G -c 2 -s 0:0,amd_hostbridge $DISKS -s 4:0,e1000 $VNC -s 31,lpc -l com1,stdio -l bootrom,$FIRMWARE

The addresses are slightly different than in the bhyve bug but it's the same basic pattern:

ffffffff8192fc00:       f7 04 25 00 33 f0 81    testl  $0x1000,0xffffffff81f03300
ffffffff8192fc07:       00 10 00 00
ffffffff8192fc0b:       74 08                   je     ffffffff8192fc15 <x86_ipi_init+0xa5>
ffffffff8192fc0d:       f3 90                   pause

(Found via objdump -d /bsd > objdump.txt and searching for the address.) Presumably, importing the fix from bhyve would fix this.
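For anyone repeating the lookup, the search step could look roughly like this. The helper name find_insn is made up; it assumes the disassembly has already been saved with objdump -d /bsd > objdump.txt and that each instruction line starts with the address followed by a colon.

```shell
#!/bin/sh
# Sketch: show the instruction at a given address in a saved objdump listing.
#   find_insn FILE ADDRESS [LINES_AFTER]
find_insn() {
    dump=$1
    # Accept the address with or without a leading 0x
    addr=${2#0x}
    # Allow leading whitespace, since some objdump builds indent the address
    grep -E -A "${3:-3}" "^[[:space:]]*${addr}:" "$dump"
}
```

For example, find_insn objdump.txt 0xffffffff8192fc00 prints the matching line plus the three lines after it.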

adaugherity commented 3 years ago

> At first I thought it was this upstream bhyve issue -- before that bugfix, it didn't properly emulate an instruction sequence generated by LLVM 8 (the compiler OpenBSD uses for 6.6), but I don't think that's actually the problem here, since I don't see a "Failed to emulate instruction" message. Perhaps once the MSR 0xc80 is implemented in xhyve so -w isn't required, it will go further down that codepath and that fix will be required also.

> That issue does indeed affect xhyve, and it's in fact unrelated to the rdmsr issue. It didn't appear at first, because in the default state, OpenBSD only detects a single CPU (even when using bsd.mp), which apparently does not hit this code path.

> However, if I disable ACPI support in OpenBSD, it uses the mpbios tables and successfully finds all virtual CPUs. (I'm not yet sure whether this is a bug in xhyve or in OpenBSD.) OpenBSD 6.5 works properly with multiple CPUs and ACPI disabled, but 6.6 hits this bug.

FYI, I have imported the bhyve fix for this in PR #214.