cloudius-systems / osv

OSv, a new operating system for the cloud.
osv.io

Support boot on hyperkit/xhyve on Mac OSX #948

Closed wkozaczuk closed 4 years ago

wkozaczuk commented 6 years ago

Currently there are two hypervisors/emulators OSv can run on Mac OSX - QEMU and VirtualBox. Neither of them is ideal: QEMU is very slow as there is no KVM support on Mac, and VirtualBox has its own problems and is pretty heavy.

So it would be nice to run OSv on HyperKit which is a successor of xhyve/bhyve (QEMU/KVM-like hypervisor created for FreeBSD and ported to OSX as xhyve) and is part of Docker for Mac. Here are some links about HyperKit/xhyve/bhyve:

Based on my research it seems that it should be possible to run OSv as hyperkit/xhyve supports virtio devices, hpet, etc (read https://github.com/moby/hyperkit/blob/master/README.xhyve.md#what-is-bhyve). Also it seems there are two possible ways to run OSv (https://github.com/moby/hyperkit/blob/master/src/hyperkit.c#L784-L792) - kexec (as Linux) and fbsd (as FreeBSD).

In order to run OSv as Linux we would need to make OSv image look like Linux vmlinuz executable (please see details in here https://github.com/moby/hyperkit/blob/2df4efa17c0fba3831025ca58d67b42ffe4648fb/src/lib/firmware/kexec.c#L242-L290).

Alternatively we might run it as FreeBSD (OSv provides a Linux-like ABI to apps, but as far as booting goes is it more like Linux or FreeBSD?). In order to run it as FreeBSD we would need to expose a symbol "loader_main" and make hyperkit call it (not sure if it would work, as I am not sure if FreeBSD would like OSv loader.elf). Please see more details here - https://github.com/moby/hyperkit/blob/2df4efa17c0fba3831025ca58d67b42ffe4648fb/src/lib/firmware/fbsd.c#L944-L1003 - where you can see how it calls dlopen() on an image file and then calls "loader_main".

Obviously what I wrote is based on my limited understanding of OSv boot process and its internals so I might be wrong.

wkozaczuk commented 6 years ago

I have been experimenting more and here are some results.

The "fbsd" (aka FreeBSD) executable option looks like a dead end. When I tried to boot using loader-stripped.elf, hyperkit would complain with this:

dlopen(loader-stripped.elf, 4): no suitable image found. Did find: loader-stripped.elf: unknown file type, first eight bytes: 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x03

Based on what I read about FreeBSD it comes with Linux compatibility layer and is able to execute Linux executables. But OSX (derived from some early version of FreeBSD) does not support this functionality most likely. Unless there is a way to link loader.elf to make it FreeBSD-like. I tried to force loader.elf to be FreeBSD ABI using brandelf but it also did not work and I got the same error. Even if dlopen() succeeded and executed loader_main it is not clear it would work as the layout of code in memory might not be where OSv expects things to be if I make any sense.
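For reference, the eight bytes printed in the dlopen() error are the start of the ELF identification block (e_ident). The sketch below decodes them using the standard ELF field meanings; the struct and function names are made up for illustration and are not part of OSv or hyperkit. It shows the file is a 64-bit little-endian ELF with OS/ABI 3 (Linux/GNU). Since dlopen() on macOS loads Mach-O images, rebranding the OS/ABI byte to FreeBSD (9) would not be expected to change the outcome, which matches the brandelf result above.

```cpp
#include <array>
#include <cstdint>
#include <string>

// Decoded view of the first eight e_ident bytes of an ELF file.
// (Illustrative type, not taken from OSv or hyperkit.)
struct ElfIdent {
    bool is_elf;          // bytes 0-3 are 0x7F 'E' 'L' 'F'
    bool is_64bit;        // EI_CLASS == 2 (ELFCLASS64)
    bool little_endian;   // EI_DATA == 1 (ELFDATA2LSB)
    std::uint8_t version; // EI_VERSION, 1 for current ELF
    std::uint8_t os_abi;  // EI_OSABI
};

inline ElfIdent decode_elf_ident(const std::array<std::uint8_t, 8>& b) {
    ElfIdent id{};
    id.is_elf = b[0] == 0x7F && b[1] == 'E' && b[2] == 'L' && b[3] == 'F';
    id.is_64bit = b[4] == 2;
    id.little_endian = b[5] == 1;
    id.version = b[6];
    id.os_abi = b[7];
    return id;
}

// Names for the few EI_OSABI values relevant to this discussion.
inline std::string os_abi_name(std::uint8_t abi) {
    switch (abi) {
    case 0: return "System V";
    case 3: return "Linux/GNU";
    case 9: return "FreeBSD";
    default: return "other";
    }
}
```

Feeding it the bytes 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x03 from the error message yields a valid 64-bit little-endian ELF whose OS/ABI name is "Linux/GNU".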

The "kexec" option (how Linux is run on xhyve) requires an OSv image that looks like vmlinuz (aka bzImage, described here). Based on this code, it opens the 'vmlinuz' image file, reads the header information at offset 0x01F1, verifies various fields and calculates the kernel base, which is what it assigns to RIP, and finally the VM gets started.
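To make the header check concrete, here is a minimal sketch of what "looks like a bzImage" means at the byte level. The field offsets, the 0xAA55 boot flag, the "HdrS" signature and the rule that a setup_sects value of 0 means 4 come from the Linux x86 boot protocol document. The function names are hypothetical, and hyperkit's kexec.c validates more fields than shown here, so treat this as an illustration of the protocol rather than a copy of hyperkit's logic.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Field offsets from the Linux x86 boot protocol (Documentation/x86/boot.txt).
constexpr std::size_t OFF_SETUP_SECTS = 0x1F1; // 1 byte: size of setup code in 512-byte sectors
constexpr std::size_t OFF_BOOT_FLAG   = 0x1FE; // 2 bytes: must be 0xAA55
constexpr std::size_t OFF_HEADER      = 0x202; // 4 bytes: "HdrS"
constexpr std::size_t OFF_VERSION     = 0x206; // 2 bytes: boot protocol version

// Read an n-byte (n <= 4) little-endian value at the given offset.
inline std::uint32_t read_le(const std::vector<std::uint8_t>& img,
                             std::size_t off, std::size_t n) {
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i)
        v |= static_cast<std::uint32_t>(img[off + i]) << (8 * i);
    return v;
}

// Returns the file offset of the protected-mode kernel code,
// or 0 if the image does not carry a valid boot header.
inline std::size_t protected_mode_offset(const std::vector<std::uint8_t>& img) {
    if (img.size() < OFF_VERSION + 2) return 0;
    if (read_le(img, OFF_BOOT_FLAG, 2) != 0xAA55) return 0;
    if (read_le(img, OFF_HEADER, 4) != 0x53726448) return 0; // "HdrS" in little-endian
    std::size_t setup_sects = img[OFF_SETUP_SECTS];
    if (setup_sects == 0) setup_sects = 4; // per the boot protocol, 0 means 4
    return (setup_sects + 1) * 512;        // boot sector + setup sectors
}
```

A wrapper around loader.elf would therefore need, at minimum, these fields filled in so that the computed offset lands on the code the hypervisor should jump to.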

So I was thinking that in order to make OSv bootable as Linux on hyperkit/xhyve we would have to wrap loader.elf (or lzloader.elf) into something that has a Linux-like header (as described here) that in essence would tell hyperkit where the OSv kernel starts - either the lzentry or loader offsets per the arch/x64/boot16.S file.

Hyperkit also expects an initrd file (Linux ramdisk), but we could provide a dummy one as OSv does not need to read any data from it.

nyh commented 6 years ago

If VirtualBox works on Mac, I would use that, but obviously there is nothing wrong with getting another hypervisor to work too. I'm not familiar with the xhyve/whatever hypervisor. I'm surprised it can't take a standard disk image with PC partition table and boot from it - it only works for Linux, not Windows guests? By the way, qemu also has an option to use a Linux bzImage and initrd (the "-kernel" option) but I never tried to use it, or make OSv fit it. I have no idea how difficult it would be to do, but it sounds fairly easy and like you're on the right track. It can be even tested on qemu (with the -kernel option), I guess.

wkozaczuk commented 6 years ago

Indeed VirtualBox works on Mac and is faster than Qemu and after I fixed the annoying #917 issue it behaves quite well. But there are many annoying issues with capstan.

So am I on the right track to think that to make OSv boot using bzImage and initrd (the "-kernel" option) I simply need to build an OSv image like bzImage that would bypass arch/x64/boot16.S and jump to lzentry or loader? This new 'bzImage' image would also need to have the special Linux kernel image header that describes where the kernel starts - the lzentry/loader function - as explained in https://www.kernel.org/doc/Documentation/x86/boot.txt, right?

wkozaczuk commented 6 years ago

I have made pretty good headway making OSv boot on hyperkit.

First off, I abandoned "kexec" and switched to the multiboot approach, which hyperkit supports as well. OSv does not support multiboot yet, but I managed to hack something together that for now is enough to make hyperkit start OSv - https://github.com/wkozaczuk/osv/blob/multiboot/arch/x64/multiboot_header.asm and https://github.com/wkozaczuk/osv/blob/multiboot/arch/x64/multiboot.S where I hard-coded the low and high memory ranges. Eventually I will create a separate issue to add multiboot support to OSv.
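For readers unfamiliar with Multiboot, the mandatory part of a Multiboot 1 header is three 32-bit words: the magic 0x1BADB002, a flags word, and a checksum chosen so that all three sum to zero modulo 2^32. The sketch below illustrates that rule in C++; it is not the contents of the linked multiboot_header.asm, and the struct and function names are invented for the example.

```cpp
#include <cstdint>

// Constants from the Multiboot 1 specification.
constexpr std::uint32_t MULTIBOOT_HEADER_MAGIC = 0x1BADB002;
constexpr std::uint32_t MULTIBOOT_PAGE_ALIGN   = 1u << 0; // align loaded modules on 4 KiB
constexpr std::uint32_t MULTIBOOT_MEMORY_INFO  = 1u << 1; // ask the loader for memory info

// The three mandatory header words (illustrative type name).
struct MultibootHeader {
    std::uint32_t magic;
    std::uint32_t flags;
    std::uint32_t checksum;
};

// Build a header whose three fields sum to zero modulo 2^32.
constexpr MultibootHeader make_multiboot_header(std::uint32_t flags) {
    return MultibootHeader{
        MULTIBOOT_HEADER_MAGIC, flags,
        static_cast<std::uint32_t>(0u - (MULTIBOOT_HEADER_MAGIC + flags))};
}

// The check a Multiboot-compliant loader applies to a candidate header.
constexpr bool is_valid(const MultibootHeader& h) {
    return h.magic == MULTIBOOT_HEADER_MAGIC &&
           static_cast<std::uint32_t>(h.magic + h.flags + h.checksum) == 0;
}

static_assert(is_valid(make_multiboot_header(0)), "checksum rule must hold");
```

With the MULTIBOOT_MEMORY_INFO flag set, a compliant loader passes lower and upper memory sizes in the boot information structure, which in principle could replace hard-coded memory ranges; whether hyperkit fills those fields in is not established in this thread.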

My first impression with hyperkit is that it is pretty light - it takes around 100ms to boot OSv along with hypervisor runtime itself.

Even with multiboot support there are at least two issues that need to be addressed in hyperkit or corresponding workarounds created in OSv.

First off, this assert in hpet.cc fails - https://github.com/wkozaczuk/osv/blob/multiboot/drivers/hpet.cc#L52

    auto cap = mmio_getl(_addr + HPET_CAP);
    assert(cap & HPET_CAP_COUNT_SIZE);

I think in essence OSv requires 64-bit counters and hyperkit supports only 32-bit ones. Given that Linux can boot on hyperkit, there must be a workaround for it. Would it be as simple as adding a 32/64 flag to hpet.cc and changing the logic to something like this?

    s64 hpetclock::time()
    {
        // _is_64_bit is a placeholder flag, set when the HPET_CAP_COUNT_SIZE bit is present
        if (_is_64_bit)
            return _wall + (mmio_getq(_addr + HPET_COUNTER) * _period);
        else
            return _wall + ((u64)mmio_getl(_addr + HPET_COUNTER) * _period);
    }
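One thing the 32-bit branch above does not handle is wrap-around: a 32-bit main counter returns to zero after 2^32 ticks, so the computed time would jump backwards. A possible way to deal with that, sketched below under stated assumptions, is to extend the counter to 64 bits in software by counting wraps. The class name and the injected reader function are hypothetical and this is not OSv's implementation. The sketch is not thread-safe and only works if read() is called at least once per wrap period.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Extends a free-running 32-bit counter to 64 bits by counting wrap-arounds.
// Assumptions: single-threaded use, and read() is called at least once
// between any two consecutive wraps of the underlying counter.
class Counter32Extender {
public:
    // read_counter stands in for something like mmio_getl(_addr + HPET_COUNTER).
    explicit Counter32Extender(std::function<std::uint32_t()> read_counter)
        : _read(std::move(read_counter)), _last(0), _high(0) {}

    std::uint64_t read() {
        std::uint32_t now = _read();
        if (now < _last) {
            ++_high; // the hardware counter wrapped since the previous read
        }
        _last = now;
        return (_high << 32) | now;
    }

private:
    std::function<std::uint32_t()> _read;
    std::uint32_t _last;  // last raw 32-bit value observed
    std::uint64_t _high;  // number of wraps seen so far
};
```

The extended value could then be multiplied by the period exactly as in the 64-bit branch.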

Lastly, the most important issue is with the APIC. Hyperkit/bhyve claims to support both xAPIC and x2APIC, but only in some limited way, as the logic to enable it (wrmsr in apic.cc) causes hyperkit to exit:

    void xapic::enable()
    {
        wrmsr(msr::IA32_APIC_BASE, _apic_base | APIC_BASE_GLOBAL_ENABLE); // this causes exit
        debug_early("### enable -> after wrmsr\n");
        software_enable();
    }

or

    void x2apic::enable()
    {
        debug_early("### In x2apic::enable()\n");
        wrmsr(msr::IA32_APIC_BASE, _apic_base | APIC_BASE_GLOBAL_ENABLE | (1 << 10)); // this causes exit
        software_enable();
    }

Hyperkit output:

    ### In xapic
    ### Before enable
    wrmsr to register 0x1b(0xfee00800) on vcpu 0 vm exit[0] reason VMX
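The logged write can be decoded by hand: MSR 0x1b is IA32_APIC_BASE, and in that register bit 8 is the BSP flag, bit 10 enables x2APIC mode, bit 11 is the global enable, and the upper bits hold the APIC base address. The sketch below applies that layout (from the Intel SDM) to the logged value; the struct and function names are illustrative only. Note that 0xfee00800 has the BSP flag cleared. Whether that difference from the hypervisor's initial value is what hyperkit rejects is an assumption that this thread does not confirm.

```cpp
#include <cstdint>

// IA32_APIC_BASE (MSR 0x1b) bit layout, per the Intel SDM.
constexpr std::uint64_t APIC_BASE_BSP           = 1ull << 8;  // bootstrap processor flag
constexpr std::uint64_t APIC_BASE_X2APIC_ENABLE = 1ull << 10; // x2APIC mode enable
constexpr std::uint64_t APIC_BASE_GLOBAL_ENABLE = 1ull << 11; // xAPIC global enable
// Simplification: everything above bit 11 is treated as the base address;
// real hardware reserves the bits above MAXPHYADDR.
constexpr std::uint64_t APIC_BASE_ADDR_MASK     = ~0xFFFull;

// Decoded fields (illustrative type name).
struct ApicBase {
    std::uint64_t base_address;
    bool bsp;
    bool x2apic_enabled;
    bool globally_enabled;
};

constexpr ApicBase decode_apic_base(std::uint64_t msr_value) {
    return ApicBase{msr_value & APIC_BASE_ADDR_MASK,
                    (msr_value & APIC_BASE_BSP) != 0,
                    (msr_value & APIC_BASE_X2APIC_ENABLE) != 0,
                    (msr_value & APIC_BASE_GLOBAL_ENABLE) != 0};
}
```

For the logged value 0xfee00800 this gives base address 0xfee00000, global enable set, x2APIC clear and BSP clear, which corresponds to the xapic::enable() path rather than the x2apic one.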

I do not understand this topic well enough to reason about it. I was going to open tickets with hyperkit, but first I would like to get as much understanding of the issue as possible.

I also think this info from the hyperkit/bhyve wiki page may be helpful - https://github.com/moby/hyperkit/blob/master/README.xhyve.md:

TODO

vmm:
- enable APIC access page to speed up APIC emulation (performance)
- enable x2APIC MSRs (even faster) (performance)

and

bhyve is the FreeBSD hypervisor, roughly analogous to KVM + QEMU on Linux. It has a focus on simplicity and being legacy free.

It exposes the following peripherals to virtual machines:

- Local x(2)APIC
- IO-APIC
- 8259A PIC
- 8253/8254 PIT
- HPET
- PM Timer
- RTC
- PCI
  - host bridge
  - passthrough
  - UART
  - AHCI (i.e. HDD and CD)
  - VirtIO block device
  - VirtIO networking
  - VirtIO RNG

Notably absent are sound, USB, HID and any kind of graphics support. With a focus on server virtualization this is not strictly a requirement. bhyve may gain desktop virtualization capabilities in the future but this doesn't seem to be a priority.

Unlike QEMU, bhyve also currently lacks any kind of guest-side firmware (QEMU uses the GPL3 SeaBIOS), but aims to provide a compatible OVMF EFI in the near future. It does however provide ACPI, SMBIOS and MP Tables.

wkozaczuk commented 4 years ago

This issue can be closed as now it is possible to boot OSv on hyperkit even using capstan.

Following commits made it possible: