moby / hyperkit

A toolkit for embedding hypervisor capabilities in your application
BSD 2-Clause "Simplified" License

hyperkit fails to boot 4.17 and 4.18 kernel (based on Linuxkit) #226

Closed rn closed 5 years ago

rn commented 6 years ago

Booting a LinuxKit image built with the 4.18.1 kernel (to be added to LinuxKit with https://github.com/linuxkit/linuxkit/pull/3163), and in fact any 4.17 kernel, via kernel+initrd, I get an unhandled VM exit with exit reason 33.

I'm using this linuxkit YAML file (minimal.yml):

kernel:
  image: linuxkit/kernel:4.18.1
  cmdline: "console=tty0 console=ttyS0 console=ttyAMA0"
init:
  - linuxkit/init:v0.6
  - linuxkit/runc:v0.6
  - linuxkit/containerd:v0.6
onboot:
  - name: dhcpcd
    image: linuxkit/dhcpcd:v0.6
    command: ["/sbin/dhcpcd", "--nobackground", "-f", "/dhcpcd.conf", "-1"]
services:
  - name: getty
    image: linuxkit/getty:v0.6
    env:
     - INSECURE=true
trust:
  org:
    - linuxkit

I build a kernel+initrd with linuxkit build minimal.yml and then boot it with:

hyperkit -A -u -c 1 -m 1024M -s 0:0,hostbridge -s 31,lpc -s 3,virtio-rnd -l com1,stdio -f kexec,minimal-kernel,minimal-initrd.img,"earlyprintk=serial console=ttyS0"

The error output is:

vm exit[0]
    reason      VMX
    rip     0x000000000009e019
    inst_length 3
    status      0
    exit_reason 33
    qualification   0x0000000000000000
    inst_type       0
    inst_error      0
VMCS_PIN_BASED_CTLS:           0x000000000000003f
VMCS_PRI_PROC_BASED_CTLS:      0x00000000b5186dfa
VMCS_SEC_PROC_BASED_CTLS:      0x00000000000000aa
VMCS_ENTRY_CTLS:               0x00000000000093ff
VMCS_EXCEPTION_BITMAP:         0x0000000000040000
VMCS_CR0_MASK:                 0x00000000e0000031
VMCS_CR0_SHADOW:               0x0000000000000001
VMCS_CR4_MASK:                 0x0000000000002000
VMCS_CR4_SHADOW:               0x0000000000000000
VMCS_GUEST_PHYSICAL_ADDRESS:   0x000000000009e000
VMCS_GUEST_LINEAR_ADDRESS:     0x000000000009e000
VMCS_GUEST_CS_SELECTOR:        0x0000000000000008
VMCS_GUEST_CS_LIMIT:           0x00000000ffffffff
VMCS_GUEST_CS_ACCESS_RIGHTS:   0x000000000000c09b
VMCS_GUEST_CS_BASE:            0x0000000000000000
VMCS_GUEST_DS_SELECTOR:        0x0000000000000018
VMCS_GUEST_DS_LIMIT:           0x00000000ffffffff
VMCS_GUEST_DS_ACCESS_RIGHTS:   0x000000000000c093
VMCS_GUEST_DS_BASE:            0x0000000000000000
VMCS_GUEST_ES_SELECTOR:        0x0000000000000000
VMCS_GUEST_ES_LIMIT:           0x00000000ffffffff
VMCS_GUEST_ES_ACCESS_RIGHTS:   0x000000000001c000
VMCS_GUEST_ES_BASE:            0x0000000000000000
VMCS_GUEST_FS_SELECTOR:        0x0000000000000000
VMCS_GUEST_FS_LIMIT:           0x00000000ffffffff
VMCS_GUEST_FS_ACCESS_RIGHTS:   0x000000000001c000
VMCS_GUEST_FS_BASE:            0x0000000000000000
VMCS_GUEST_GS_SELECTOR:        0x0000000000000000
VMCS_GUEST_GS_LIMIT:           0x00000000ffffffff
VMCS_GUEST_GS_ACCESS_RIGHTS:   0x000000000001c000
VMCS_GUEST_GS_BASE:            0x0000000000000000
VMCS_GUEST_SS_SELECTOR:        0x0000000000000018
VMCS_GUEST_SS_LIMIT:           0x00000000ffffffff
VMCS_GUEST_SS_ACCESS_RIGHTS:   0x000000000000c093
VMCS_GUEST_SS_BASE:            0x0000000000000000
VMCS_GUEST_LDTR_SELECTOR:      0x0000000000000000
VMCS_GUEST_LDTR_LIMIT:         0x00000000ffffffff
VMCS_GUEST_LDTR_ACCESS_RIGHTS: 0x000000000001c000
VMCS_GUEST_LDTR_BASE:          0x0000000000000000
VMCS_GUEST_TR_SELECTOR:        0x0000000000000020
VMCS_GUEST_TR_LIMIT:           0x0000000000000fff
VMCS_GUEST_TR_ACCESS_RIGHTS:   0x000000000000808b
VMCS_GUEST_TR_BASE:            0x0000000000000000
VMCS_GUEST_GDTR_LIMIT:         0x0000000000000030
VMCS_GUEST_GDTR_BASE:          0x0000000001777618
VMCS_GUEST_IDTR_LIMIT:         0x00000000ffffffff
VMCS_GUEST_IDTR_BASE:          0x0000000000000000
VMCS_GUEST_CR0:                0x0000000000000031
VMCS_GUEST_CR3:                0x0000000002d31000
VMCS_GUEST_CR4:                0x0000000000002020
VMCS_GUEST_IA32_EFER:          0x0000000000000500

rip: 0x000000000009e019 rfl: 0x0000000000000047 cr2: 0x0000000000000000
rax: 0x0000000000000001 rbx: 0x00000000025a2000 rcx: 0x000000000009d000 rdx: 0x0000000000000000
rsi: 0x0000000000003000 rdi: 0x0000000001000294 rbp: 0x0000000001000000 rsp: 0x000000000009f000
r8:  0x000000000009d000 r9:  0x0000000000000000 r10: 0x0000000000000000 r11: 0x0000000000000000
r12: 0x0000000000000000 r13: 0x0000000000000000 r14: 0x0000000000000000 r15: 0x0000000000000000

I've tried a few 4.17 kernels (4.17.15, 4.17.14, 4.17.1) and get the same error.

Booting the efi-iso with the 4.18 kernel I get the same error.

The 4.14.63 kernel from the same PR boots fine, and so does the last 4.16 kernel (4.16.18).

The 4.18 and 4.17 kernels boot fine with qemu.

hyperkit version: v0.20180403-17-g3e954c
linuxkit version: v0.6+ c0aecf8f26a7abcd7f6c5b1dddce06ded061e1ff
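As an aid to reading the dump above, the segment access-rights fields can be unpacked with a few shifts. This is only a sketch, assuming the VMCS access-rights layout documented in the Intel SDM (type in bits 3:0, S in bit 4, DPL in bits 6:5, P in bit 7, AVL in bit 12, L in bit 13, D/B in bit 14, G in bit 15, "unusable" in bit 16). The function name `decode_access_rights` is made up for illustration and is not part of hyperkit.

```python
def decode_access_rights(ar):
    """Split a VMCS segment access-rights value into its fields.

    Bit positions are assumed from the Intel SDM description of the
    guest segment access-rights fields.
    """
    return {
        "type": ar & 0xF,              # segment type
        "s": (ar >> 4) & 1,            # 1 = code/data, 0 = system
        "dpl": (ar >> 5) & 3,          # descriptor privilege level
        "present": (ar >> 7) & 1,
        "avl": (ar >> 12) & 1,
        "long": (ar >> 13) & 1,        # L: 64-bit code segment
        "db": (ar >> 14) & 1,          # D/B: default operation size
        "granularity": (ar >> 15) & 1,
        "unusable": (ar >> 16) & 1,
    }


# Values copied from the dump in this report.
cs = decode_access_rights(0xC09B)   # VMCS_GUEST_CS_ACCESS_RIGHTS
es = decode_access_rights(0x1C000)  # VMCS_GUEST_ES_ACCESS_RIGHTS
tr = decode_access_rights(0x808B)   # VMCS_GUEST_TR_ACCESS_RIGHTS

for name, fields in (("CS", cs), ("ES", es), ("TR", tr)):
    print(name, fields)
```

Under these assumptions CS decodes with L=0 and D/B=1, which would mean the guest was executing from a 32-bit code segment at the time of the exit, and ES is marked unusable. That narrows where to look but does not by itself identify the failing check.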

zhsj commented 6 years ago

Hi rn, I see the PR in linuxkit has been merged. Have you figured out what happens with kernels >= 4.17? I don't use hyperkit, but I have trouble with xhyve and can't boot kernels >= 4.17.

rn commented 6 years ago

xhyve would have the same issue. Something in the newer kernels is causing the above VMEXIT, which is then not handled by hyperkit. I haven't had a chance to investigate, and likely won't for a while. Maybe @ijc has an idea?

ijc commented 6 years ago

IIRC from when I glanced at this before, the exit reason is just the catch-all "invalid VMCS" one. Someone needs to sit down with the SDM and go through the checklist of required invariants, compare it to the VMCS state, and figure out which one is being violated. It'd also be worth figuring out what the last few exits prior to the problem were about (I guess the dtrace stuff can help with that), since that might make it easier to narrow in on a section of the (long and boring) checklist.

I won't have cycles for this for a while either, I expect.
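The checklist approach described above can be partly mechanised. The sketch below encodes a small subset of the guest-state checks on control registers and IA32_EFER as I understand them from the Intel SDM, and runs them against the values in the dump from this issue. The bit positions (CR0.PG = bit 31, CR4.PAE = bit 5, EFER.LME = bit 8, EFER.LMA = bit 10, "IA-32e mode guest" entry control = bit 9, "load IA32_EFER" entry control = bit 15) are assumptions taken from the SDM, and the function name `check_guest_state` is hypothetical. It is not hyperkit code, covers only a handful of the required invariants, and its result has not been confirmed by anyone in this thread.

```python
CR0_PG = 1 << 31            # paging enabled
CR4_PAE = 1 << 5            # physical address extension
EFER_LME = 1 << 8           # long mode enable
EFER_LMA = 1 << 10          # long mode active
ENTRY_IA32E_GUEST = 1 << 9  # VM-entry control: "IA-32e mode guest"
ENTRY_LOAD_EFER = 1 << 15   # VM-entry control: "load IA32_EFER"


def check_guest_state(cr0, cr4, efer, entry_ctls):
    """Return descriptions of violated invariants (subset only)."""
    problems = []
    ia32e = bool(entry_ctls & ENTRY_IA32E_GUEST)
    pg = bool(cr0 & CR0_PG)

    # With "IA-32e mode guest" set, CR0.PG and CR4.PAE must both be 1.
    if ia32e and not pg:
        problems.append("IA-32e mode guest is set but CR0.PG is clear")
    if ia32e and not (cr4 & CR4_PAE):
        problems.append("IA-32e mode guest is set but CR4.PAE is clear")

    # The EFER checks apply only when EFER is loaded on VM entry.
    if entry_ctls & ENTRY_LOAD_EFER:
        lma = bool(efer & EFER_LMA)
        lme = bool(efer & EFER_LME)
        if lma != ia32e:
            problems.append("EFER.LMA differs from IA-32e mode guest")
        if pg and lme != lma:
            problems.append("CR0.PG is set but EFER.LME != EFER.LMA")
    return problems


# Values copied from the dump in this issue.
problems = check_guest_state(
    cr0=0x31,           # VMCS_GUEST_CR0
    cr4=0x2020,         # VMCS_GUEST_CR4
    efer=0x500,         # VMCS_GUEST_IA32_EFER
    entry_ctls=0x93FF,  # VMCS_ENTRY_CTLS
)
for p in problems:
    print(p)
```

With these inputs the only check that fires is the first one: the entry controls still request an IA-32e mode guest while CR0.PG is clear. That is one candidate to verify against the SDM and the exit trace, not a confirmed diagnosis.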

rn commented 6 years ago

Thanks, I pretty much got as far as you did.

@zhsj have you tried a recent version of qemu? It now also uses the Hypervisor framework (just like hyperkit and xhyve) for acceleration (called hvf), so it is as fast as hyperkit/xhyve on macOS and boots the recent kernels fine.

zhsj commented 6 years ago

@rn I just tried qemu-3.0; it works in hvf mode with a 4.17 kernel, launched with the -accel hvf -cpu host args.

frezbo commented 5 years ago

Any progress on this? I'm trying to boot Fedora 29. xhyve is able to boot it, so the hyperkit code base must be missing some patch.

rn commented 5 years ago

As far as I know, no one has looked into this so far.

grehan-freebsd commented 5 years ago

This should fix it: https://github.com/machyve/xhyve/commit/cc672d5363766e7c2bf10e02ca12efbeda74c487

rn commented 5 years ago

@grehan-freebsd thanks for the pointer. I will take a look. If it fixes it, would you like to prepare a PR, or are you happy for me to do it (and credit the original patch, of course)?

grehan-freebsd commented 5 years ago

Happy for you to do it :)