intel / MultiArchUefiPkg

Multi-Architecture UEFI Environment Driver
GNU Lesser General Public License v2.1
51 stars 11 forks source link

MUA crash in unrelated driver after a previous x64 driver failed to load #49

Closed andreiw closed 6 months ago

andreiw commented 6 months ago

The twitter crash had Synchronous Exception at 0xf14fc7e8, which with image base 0xf14f5000 corresponds to offset 0x77E8. This looks like data (no int3 after ret). The region is protected from execution, but MUA is not claiming it.

The strange thing is how anything in the image could have been invoked if it failed to start. Here's a theory: a different image was loaded after the the NVMe driver failed to load, but MUA didn't clean up the protection mappings.

    7792:       75 4c                   jne    0x77e0
    7794:       4c 89 44 24 30          mov    %r8,0x30(%rsp)
    7799:       4c 8d 0d b8 0e 00 00    lea    0xeb8(%rip),%r9        # 0x8658
    77a0:       48 89 54 24 28          mov    %rdx,0x28(%rsp)
    77a5:       4c 8d 05 fc 2c 00 00    lea    0x2cfc(%rip),%r8        # 0xa4a8
    77ac:       48 89 4c 24 20          mov    %rcx,0x20(%rsp)
    77b1:       ba 00 02 00 00          mov    $0x200,%edx
    77b6:       48 8d 4c 24 40          lea    0x40(%rsp),%rcx
    77bb:       e8 a8 ef ff ff          call   0x6768
    77c0:       48 8b 05 39 37 00 00    mov    0x3739(%rip),%rax        # 0xaf00
    77c7:       48 85 c0                test   %rax,%rax
    77ca:       74 14                   je     0x77e0
    77cc:       48 8b 40 40             mov    0x40(%rax),%rax
    77d0:       48 85 c0                test   %rax,%rax
    77d3:       74 0b                   je     0x77e0
    77d5:       48 8d 54 24 40          lea    0x40(%rsp),%rdx
    77da:       48 8b c8                mov    %rax,%rcx
    77dd:       ff 50 08                call   *0x8(%rax)
    77e0:       48 81 c4 48 02 00 00    add    $0x248,%rsp
    77e7:       c3                      ret
    77e8:       c6 05 b9 36 00 00 01    movb   $0x1,0x36b9(%rip)        # 0xaea8
    77ef:       c3                      ret

Originally posted by @andreiw in https://github.com/intel/MultiArchUefiPkg/issues/48#issuecomment-2093165553

andreiw commented 6 months ago

GLt0n6CaoAEiFyd

andreiw commented 6 months ago

Pretty sure the issue is with ImageProtocolUnregister calling gCpu->SetMemoryAttributes instead of gCpu->ClearMemoryAttributes. Whoops. @bcran

andreiw commented 6 months ago

Agh, ClearMemoryAttributes is not in gCpu, it's part of the memory attributes protocol.

andreiw commented 6 months ago

Use gCpu->SetMemoryAttributes with 0 is bad regardless. Either use the memory attributes protocol, which just modifies the underlying attrs. Or go via gDS->SetMemorySpaceAttributes.

andreiw commented 6 months ago

...of course only Arm implements EFI_MEMORY_ATTRIBUTE_PROTOCOL (at this time). Moving on...

andreiw commented 6 months ago

Needs more experimentation in an Arm environment. It should be enough to load an emulated app, have it execute, unload and then load a native app in the same space.

andreiw commented 6 months ago

Weird, I can't seem to repro. @bcran, dumb question, but you're using the regular ArmPkg CpuDxe right?

andreiw commented 6 months ago

I've tried forcing the ProtectUefiImageCommon failure path due to section alignment.... no dice. I still see ArmSetMemoryAttributes getting invoked.

Can you instrument the the ArmSetMemoryAttributes path and get a boot log?

--- a/ArmPkg/Drivers/CpuDxe/CpuMmuCommon.c
+++ b/ArmPkg/Drivers/CpuDxe/CpuMmuCommon.c
@@ -217,8 +217,12 @@ CpuSetMemoryAttributes (
   if (EFI_ERROR (Status) || (RegionArmAttributes != ArmAttributes) ||
       ((BaseAddress + Length) > (RegionBaseAddress + RegionLength)))
   {
+    DEBUG ((DEBUG_ERROR, "really applying attrs 0x%lx over 0x%lx - 0x%lx\n",
+            EfiAttributes, BaseAddress, Length));
     return ArmSetMemoryAttributes (BaseAddress, Length, EfiAttributes, 0);
   } else {
+    DEBUG ((DEBUG_ERROR, "skip applying attrs 0x%lx over 0x%lx - 0x%lx\n",
+            EfiAttributes, BaseAddress, Length));
     return EFI_SUCCESS;
   }
 }
andreiw commented 6 months ago

Ah... I misread this.

It wasn't failing at the first instruction in the entry point. It did actually register an event, which leaked when the image crashed. This is trivial to detect if building with MAU_WRAPPED_ENTRY_POINTS. Perhaps this is a good reason to do so.

andreiw commented 6 months ago

Closing this issue.