bochs-emu / Bochs

Bochs - Cross Platform x86 Emulator Project
https://bochs.sourceforge.io/
GNU Lesser General Public License v2.1
875 stars 102 forks source link

If the SVM GUEST is in protected mode(HOST CR4_PSE=1,NPT enable), there may be a problem here. #234

Closed lgj1107 closed 8 months ago

lgj1107 commented 8 months ago

https://github.com/bochs-emu/Bochs/blob/7ad1fd00881544ef7ef9f51abc1bf9122d369424/bochs/cpu/paging.cc#L1178

lgj1107 commented 8 months ago

00508250000i[CPU0 ] CPU is in protected mode (active) 00508250000i[CPU0 ] CS.mode = 32 bit 00508250000i[CPU0 ] SS.mode = 32 bit 00508250000i[CPU0 ] EFER = 0x00001000 00508250000i[CPU0 ] | EAX=00000233 EBX=00000000 ECX=00000000 EDX=80000001 00508250000i[CPU0 ] | ESP=00007c00 EBP=00000000 ESI=00102000 EDI=000b83e2 00508250000i[CPU0 ] | IOPL=0 id vip vif ac vm rf nt of df if tf SF zf AF PF CF 00508250000i[CPU0 ] | SEG sltr(index|ti|rpl) base limit G D 00508250000i[CPU0 ] | CS:0008( 0001| 0| 0) 00000000 ffffffff 1 1 00508250000i[CPU0 ] | DS:0010( 0002| 0| 0) 00000000 ffffffff 1 1 00508250000i[CPU0 ] | SS:0010( 0002| 0| 0) 00000000 ffffffff 1 1 00508250000i[CPU0 ] | ES:0010( 0002| 0| 0) 00000000 ffffffff 1 1 00508250000i[CPU0 ] | FS:0000( 0005| 0| 0) 00000000 0000ffff 0 0 00508250000i[CPU0 ] | GS:0000( 0005| 0| 0) 00000000 0000ffff 0 0 00508250000i[CPU0 ] | EIP=000006b1 (000006b1) 00508250000i[CPU0 ] | CR0=0x80000031 CR2=0x00000000 00508250000i[CPU0 ] | CR3=0x00005000 CR4=0x00000000 00508250000i[CPU0 ] 0x00000000000006b1: (instruction unavailable) page not present

stlintel commented 8 months ago

I don't understand what kind of problem you mean.

BX_CPU_THIS_PTR cr4.get_PSE() meaning guest PSE because under SVM cr4 has guests's CR4 nested walk is done though separate function:

bx_phy_address BX_CPU_C::nested_walk(bx_phy_address guest_paddr, unsigned rw, bool is_page_walk)
{
  SVM_HOST_STATE *host_state = &BX_CPU_THIS_PTR vmcb->host_state;

  BX_DEBUG(("Nested walk for guest paddr 0x" FMT_PHY_ADDRX, guest_paddr));

  if (host_state->efer.get_LMA())
    return nested_walk_long_mode(guest_paddr, rw, is_page_walk);
  else if (host_state->cr4.get_PAE())
    return nested_walk_PAE(guest_paddr, rw, is_page_walk);
  else
    return nested_walk_legacy(guest_paddr, rw, is_page_walk);
}

and inside it is host PSE:

    if ((curr_entry & 0x80) != 0 && host_state->cr4.get_PSE()) {
      // 4M paging, only if CR4.PSE enabled, ignore PDE.PS otherwise
      if (curr_entry & PAGING_PDE4M_RESERVED_BITS) {

Please explain what is the problem

lgj1107 commented 8 months ago

This does not take into account the situation where NPT=1 and NPT also uses PSE pages, with the guest PG=1 and PSE=0. Therefore, it is not possible to disassemble the instruction.

00508250000i[CPU0 ] 0x00000000000006b1: (instruction unavailable) page not present

Let me think about how to explain this problem

stlintel commented 8 months ago

This is solely disasm problem, not execution. Disasm uses dbg_xlate_linear2phy to translate addresses. With NPT it is:

#if BX_SUPPORT_SVM
bool BX_CPU_C::dbg_translate_guest_physical_npt(bx_phy_address guest_paddr, bx_phy_address *phy, bool verbose)
{
  // Nested page table walk works in the same manner as the standard page walk.
  return dbg_xlate_linear2phy(guest_paddr, phy, NULL, verbose, true);
}
#endif

But the function dbg_xlate_linear2phy has all the issues you mention: it refers to guest CR4 everywhere as well as guest PDPTR.

#if BX_CPU_LEVEL >= 6
    if (BX_CPU_THIS_PTR cr4.get_PAE()) {
      int level = BX_LEVEL_PDE;
      if (! long_mode()) {
        offset_mask = 0x3fffffff;
        pt_address = **_BX_CPU_THIS_PTR PDPTR_CACHE.entry[(laddr >> 30) & 3];_**
        if (! (pt_address & 0x1))
           goto page_fault;
        pt_address &= BX_CONST64(0x000ffffffffff000);
      }
#if BX_SUPPORT_X86_64
      else {
        level = **_BX_CPU_THIS_PTR cr4.get_LA57() ? BX_LEVEL_PML5 : BX_LEVEL_PML4;_**
        offset_mask = ((BX_CONST64(1) << BX_CPU_THIS_PTR linaddr_width) - 1);
      }
#endif

and also

**_if (level == BX_LEVEL_PDE && (pte & 0x80) != 0 && BX_CPU_THIS_PTR cr4.get_PSE()) {_**

lgj1107 commented 8 months ago

Yes, there is no problem in execution, but there is a problem in disassembly.

stlintel commented 8 months ago

The solution would be to read CR4 in the beginning of function. Please try this patch:

diff --git a/bochs/cpu/paging.cc b/bochs/cpu/paging.cc
index a65af0af4..fc2ef8f3a 100644
--- a/bochs/cpu/paging.cc
+++ b/bochs/cpu/paging.cc
@@ -2314,14 +2314,16 @@ bool BX_CPU_C::dbg_xlate_linear2phy(bx_address laddr, bx_phy_address *phy, bx_ad
   }
   else {
     bx_phy_address pt_address = BX_CPU_THIS_PTR cr3 & BX_CR3_PAGING_MASK;
+    bx_cr4_t the_cr4 = BX_CPU_THIS_PTR cr4;
 #if BX_SUPPORT_SVM
     if (nested_walk) {
       pt_address = LPFOf(BX_CPU_THIS_PTR vmcb->ctrls.ncr3);
+      the_cr4 = BX_CPU_THIS_PTR vmcb->host_state.cr4;
     }
 #endif

 #if BX_CPU_LEVEL >= 6
-    if (BX_CPU_THIS_PTR cr4.get_PAE()) {
+    if (the_cr4.get_PAE()) {
       int level = BX_LEVEL_PDE;
       if (! long_mode()) {
         offset_mask = 0x3fffffff;
@@ -2332,7 +2334,7 @@ bool BX_CPU_C::dbg_xlate_linear2phy(bx_address laddr, bx_phy_address *phy, bx_ad
       }
 #if BX_SUPPORT_X86_64
       else {
-        level = BX_CPU_THIS_PTR cr4.get_LA57() ? BX_LEVEL_PML5 : BX_LEVEL_PML4;
+        level = the_cr4.get_LA57() ? BX_LEVEL_PML5 : BX_LEVEL_PML4;
         offset_mask = ((BX_CONST64(1) << BX_CPU_THIS_PTR linaddr_width) - 1);
       }
 #endif
@@ -2408,7 +2410,7 @@ bool BX_CPU_C::dbg_xlate_linear2phy(bx_address laddr, bx_phy_address *phy, bx_ad
           goto page_fault;
         pt_address = pte & 0xfffff000;
 #if BX_CPU_LEVEL >= 6
-        if (level == BX_LEVEL_PDE && (pte & 0x80) != 0 && BX_CPU_THIS_PTR cr4.get_PSE()) {
+        if (level == BX_LEVEL_PDE && (pte & 0x80) != 0 && the_cr4.get_PSE()) {
           offset_mask = 0x3fffff;
           pt_address = pte & 0xffc00000;
 #if BX_PHY_ADDRESS_WIDTH > 32
stlintel commented 8 months ago

Can you check the patch suggested and if works - will promote it to master

stlintel commented 8 months ago

Still not complete fix as it doesn't handle yet :

      int level = BX_LEVEL_PDE;
      if (! long_mode()) {
        offset_mask = 0x3fffffff;
        pt_address = BX_CPU_THIS_PTR PDPTR_CACHE.entry[(laddr >> 30) & 3];
        if (! (pt_address & 0x1))
           goto page_fault;
        pt_address &= BX_CONST64(0x000ffffffffff000);
      }

under nested walk instead of long_mode should check host_state->efer.get_LMA() instead of EFER.LMA() also under nested walks need to load PDPTR and not read PDPTR cache. but I need a test case to play with those.

lgj1107 commented 8 months ago

test1.zip Try this

stlintel commented 8 months ago

test1.zip Try this

Where should I break it to start looking for disasm ? Do you have magic break in the code for example ?

Also: does the patch above works at least ?

lgj1107 commented 8 months ago

The patch is invalid.

I have tested it on a real machine and it works normally. Try here, I added xchg bx,bx test1.zip

stlintel commented 8 months ago

I pushed the fix. The disasm is working fine in your case now as well as 'page' command in the debugger