intel / haxm

Intel® Hardware Accelerated Execution Manager (Intel® HAXM)
BSD 3-Clause "New" or "Revised" License

Solaris 10 #GP #173

Open polprog opened 5 years ago

polprog commented 5 years ago

Describe the Bug: Solaris 10 fails to boot in QEMU/HAXM on NetBSD/amd64.

Summary: Solaris 10 install CD bootloader fails to boot, triggers a GPF (code 0)

Host Environment

Guest Environment

To Reproduce

  1. Boot in QEMU with default options, a little more RAM, and HAX enabled: qemu-system-i386 -accel hax -cdrom [solaris install CD iso] -m 4G

Expected Behavior: Solaris boots as it would on a physical machine.

Reproducibility: Always

Diagnostic Information

Host crash dump: n/a

HAXM log: No useful information. Even after recompiling HAXM with the noisiest log level, it shows only the startup version info, HAX_LOWMEM_4G ignored, and hax_teardown_vm.

Screenshots: Boot panic. Solaris prints a very verbose panic with a stack dump.

Additional context: This install medium uses GNU GRUB for boot option selection. GRUB works fine, then a sequence of dots appears (indicating loading of some sort), followed by the panic (screenshot below).

Host and env same as in #172

raphaelning commented 5 years ago

Thanks. This looks similar to #172, except that the guest exception is #GP (General Protection Fault). But still, the first step is to disassemble the guest code and locate the faulting instruction (%cs : %eip = 0x0010:0x0100f494).
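As a side note on reading %cs : %eip, here is a minimal Python sketch that splits a segment selector such as 0x0010 into its architectural fields (descriptor index, table indicator, requested privilege level). The function name decode_selector is purely illustrative and is not part of HAXM or QEMU.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit x86 segment selector into its fields."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("selector must fit in 16 bits")
    return {
        # Bits 15:3 select the descriptor within the table.
        "index": selector >> 3,
        # Bit 2 is the table indicator: 0 = GDT, 1 = LDT.
        "table": "LDT" if selector & 0x4 else "GDT",
        # Bits 1:0 are the requested privilege level.
        "rpl": selector & 0x3,
    }


if __name__ == "__main__":
    # %cs value reported in the guest panic.
    print(decode_selector(0x0010))
```

For 0x0010 this yields descriptor index 2 in the GDT with RPL 0, which suggests the fault was raised in ring-0 guest code.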

polprog commented 5 years ago

I've extracted the offending instruction. It's related to an MSR, as in #172.

   0x0100f486:  nop
   0x0100f487:  nop
   0x0100f488:  nop
   0x0100f489:  nop
   0x0100f48a:  nop
   0x0100f48b:  nop
   0x0100f48c:  nop
   0x0100f48d:  nop
   0x0100f48e:  nop
   0x0100f48f:  nop
   0x0100f490:  mov    0x4(%esp),%ecx
   0x0100f494:  rdmsr  
   0x0100f496:  mov    0x8(%esp),%ecx
   0x0100f49a:  mov    %eax,(%ecx)
   0x0100f49c:  mov    %edx,0x4(%ecx)
   0x0100f49f:  ret    
   0x0100f4a0:  mov    0x8(%esp),%ecx
   0x0100f4a4:  mov    (%ecx),%eax
   0x0100f4a6:  mov    0x4(%ecx),%edx
   0x0100f4a9:  mov    0x4(%esp),%ecx
   0x0100f4ad:  wrmsr  
   0x0100f4af:  ret  

I don't know which way to read the stack dump, but there is a 0xc0000080 (leftmost column, second from the top) that looks like an MSR number. I'll set a breakpoint at 0x0100f490 and peek at the stack.
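For reference, 0xc0000080 is the architectural index of IA32_EFER. The Python sketch below models what the two routines in the disassembly do with a 64-bit MSR value (RDMSR returns it in EDX:EAX, WRMSR takes it from EDX:EAX) and decodes the EFER bits defined by Intel. All helper names are hypothetical, and the MSR table is a deliberately small subset rather than a complete list.

```python
# Small subset of architectural MSR indices; not exhaustive.
MSR_NAMES = {
    0x0000001B: "IA32_APIC_BASE",
    0xC0000080: "IA32_EFER",
    0xC0000081: "IA32_STAR",
    0xC0000082: "IA32_LSTAR",
}

# IA32_EFER bit positions defined by Intel.
EFER_BITS = {0: "SCE", 8: "LME", 10: "LMA", 11: "NXE"}


def msr_name(index: int) -> str:
    """Return the architectural name of an MSR index, if known here."""
    return MSR_NAMES.get(index, f"unknown MSR {index:#010x}")


def join_edx_eax(edx: int, eax: int) -> int:
    """Combine the RDMSR result halves into one 64-bit value."""
    return ((edx & 0xFFFFFFFF) << 32) | (eax & 0xFFFFFFFF)


def split_edx_eax(value: int) -> tuple:
    """Split a 64-bit value into the (EDX, EAX) pair WRMSR expects."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


def decode_efer(value: int) -> list:
    """List the names of the EFER bits that are set in value."""
    return [name for bit, name in sorted(EFER_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    print(msr_name(0xC0000080))
    print(decode_efer(join_edx_eax(0x0, 0xD01)))
```

The first routine at 0x0100f490 loads the MSR index from its first stack argument into %ecx, executes rdmsr, and stores %eax and %edx through the pointer in its second argument, which is the join shown above done in memory.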

krytarowski commented 5 years ago

It looks like %ecx holds IA32_EFER and %eip points at rdmsr, so this is the same issue.

Does it boot for you with this patch: https://github.com/intel/haxm/issues/172#issuecomment-464296030

polprog commented 5 years ago

Very good! So we have the bug reproduced on two different systems. I'll test the patch.

polprog commented 5 years ago

Unfortunately, this patch crashes the host system for me before GRUB even appears. (Screenshots: sol-msr-crash1, sol-msr-crash2)

polprog commented 5 years ago

My fault for running an old HAXM. After updating to HEAD and applying @krytarowski's patch, it boots up to a later stage.

The kernel still crashes, though, as this turned out to be a 64-bit OS. The 32-bit bootloader definitely works.

krytarowski commented 5 years ago

> The kernel still crashes, though, as this turned out to be a 64-bit OS. The 32-bit bootloader definitely works.

Please test with qemu-system-x86_64

krytarowski commented 5 years ago

https://github.com/intel/haxm/issues/172#issuecomment-465471085 Please test the newest HAXM with this patch applied... and use qemu-system-x86_64.

sskras commented 5 years ago

Any news about Sol 10? BTW, which particular Solaris release do you use for testing?

polprog commented 5 years ago

@sskras Solaris 10u10. I have not yet tested it with the newest patches, such as #185.

krytarowski commented 5 years ago

Solaris needs CR8 support.
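For context, CR8 is the control register that 64-bit code uses to access the local APIC task priority: CR8[3:0] corresponds to bits 7:4 of the APIC TPR. A minimal Python sketch of that mapping follows. The helper names are hypothetical and this is not HAXM's implementation, only an illustration of the architectural relationship.

```python
def cr8_to_tpr(cr8: int) -> int:
    """Map CR8[3:0] to the task-priority class in APIC TPR[7:4]."""
    return (cr8 & 0xF) << 4


def tpr_to_cr8(tpr: int) -> int:
    """Map APIC TPR[7:4] back to the 4-bit value visible in CR8."""
    return (tpr >> 4) & 0xF


if __name__ == "__main__":
    print(hex(cr8_to_tpr(0xF)))
    print(tpr_to_cr8(0x2F))
```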