So it fails straight at the EntryPoint of the image due to an unsupported (by Unicorn) instruction: rdrand.
If I recall correctly, this wasn't the end of the trouble... but you had an actual crash during ExitBootServices. I'd want to fix that first, as that sounds like a MUA bug due to a failed image start.
00000000000002e0 <.text>:
2e0: 0f c7 f0 rdrand %eax
2e3: 72 04 jb 0x2e9
2e5: 48 31 c0 xor %rax,%rax
2e8: c3 ret
2e9: 66 89 01 mov %ax,(%rcx)
2ec: b8 01 00 00 00 mov $0x1,%eax
2f1: c3 ret
2f2: 0f c7 f0 rdrand %eax
2f5: 72 04 jb 0x2fb
2f7: 48 31 c0 xor %rax,%rax
2fa: c3 ret
2fb: 89 01 mov %eax,(%rcx)
2fd: b8 01 00 00 00 mov $0x1,%eax
302: c3 ret
303: 48 0f c7 f0 rdrand %rax
307: 72 04 jb 0x30d
309: 48 31 c0 xor %rax,%rax
30c: c3 ret
30d: 48 89 01 mov %rax,(%rcx)
310: b8 01 00 00 00 mov $0x1,%eax
The twitter crash had Synchronous Exception at 0xf14fc7e8, which with image base 0xf14f5000 corresponds to offset 0x77E8. This looks like data (no int3 after ret). The region is protected from execution, but MUA is not claiming it.
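For anyone following along, the offset is just the faulting address minus the image base. A minimal sketch of that arithmetic in C; the helper name `image_offset` is made up for illustration and is not part of MUA:

```c
#include <stdint.h>

/* Hypothetical helper (not from MUA): translate a faulting address into an
 * image-relative offset. Returns UINT64_MAX if the address lies below the
 * image base, i.e. it cannot belong to this image. */
uint64_t image_offset(uint64_t fault_addr, uint64_t image_base)
{
    if (fault_addr < image_base) {
        return UINT64_MAX;
    }
    return fault_addr - image_base;
}
```

Calling `image_offset(0xf14fc7e8, 0xf14f5000)` yields 0x77e8, which is the offset to look up in the disassembly.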
The strange thing is how anything in the image could have been invoked if it failed to start. Here's a theory: a different image was loaded after the NVMe driver failed to load, but MUA didn't clean up the protection mappings.
7792: 75 4c jne 0x77e0
7794: 4c 89 44 24 30 mov %r8,0x30(%rsp)
7799: 4c 8d 0d b8 0e 00 00 lea 0xeb8(%rip),%r9 # 0x8658
77a0: 48 89 54 24 28 mov %rdx,0x28(%rsp)
77a5: 4c 8d 05 fc 2c 00 00 lea 0x2cfc(%rip),%r8 # 0xa4a8
77ac: 48 89 4c 24 20 mov %rcx,0x20(%rsp)
77b1: ba 00 02 00 00 mov $0x200,%edx
77b6: 48 8d 4c 24 40 lea 0x40(%rsp),%rcx
77bb: e8 a8 ef ff ff call 0x6768
77c0: 48 8b 05 39 37 00 00 mov 0x3739(%rip),%rax # 0xaf00
77c7: 48 85 c0 test %rax,%rax
77ca: 74 14 je 0x77e0
77cc: 48 8b 40 40 mov 0x40(%rax),%rax
77d0: 48 85 c0 test %rax,%rax
77d3: 74 0b je 0x77e0
77d5: 48 8d 54 24 40 lea 0x40(%rsp),%rdx
77da: 48 8b c8 mov %rax,%rcx
77dd: ff 50 08 call *0x8(%rax)
77e0: 48 81 c4 48 02 00 00 add $0x248,%rsp
77e7: c3 ret
77e8: c6 05 b9 36 00 00 01 movb $0x1,0x36b9(%rip) # 0xaea8
77ef: c3 ret
Ah, I misread this... 0x48C is the entry point, and the driver did register an ExitBootServices handler.
Okay, back to the rdrand issue.
This is so weird.
If the 'rdrand' insn succeeds, the NVMe driver writes the value to an address formed by taking ImageHandle & 0xffff. How the hell is this supposed to work?
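To make the behaviour concrete, here is a rough C model of the 64-bit routine at offset 0x303, as I read the disassembly. Everything named here is hypothetical: `rdrand_fn` stands in for the instruction itself (returning the carry flag), and the two fake implementations exist only so the success and failure paths can be compared. It assumes rcx carries the destination pointer, as the first argument does in the UEFI x64 calling convention.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the rdrand instruction: returns the carry flag
 * (true = success) and stores the random value in *out on success. */
typedef bool (*rdrand_fn)(uint64_t *out);

/* Models an rdrand that always reports failure (CF clear). */
bool rdrand_always_fails(uint64_t *out)
{
    (void)out;
    return false;
}

/* Models an rdrand that succeeds with a fixed value, for illustration. */
bool rdrand_fixed_value(uint64_t *out)
{
    *out = 0x1122334455667788ULL;
    return true;
}

/* Rough model of the routine at 0x303; dest plays the role of rcx. */
uint64_t get_random64(rdrand_fn hw_rdrand, uint64_t *dest)
{
    uint64_t value;

    if (!hw_rdrand(&value)) {
        return 0;      /* xor %rax,%rax; ret */
    }
    *dest = value;     /* mov %rax,(%rcx) */
    return 1;          /* mov $0x1,%eax */
}

/* The destination address as described in this thread: the low 16 bits
 * of ImageHandle, which always lands in the first 64k. */
uint64_t low_64k_target(uint64_t image_handle)
{
    return image_handle & 0xffffULL;
}
```

With the failing variant the destination is never touched, which is why an rdrand that reports failure sidesteps the stray write entirely.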
I loaded up the driver in Ghidra.
With the following patch, the driver loads. The rdrand insn has to report failure so that it doesn't trigger a random crash in the driver. Presumably, on a real x86 system, scribbling within the first 64k is somehow OK? Ugh.
diff --git a/qemu/target/i386/int_helper.c b/qemu/target/i386/int_helper.c
index 5dea08ab..3e6bce5f 100644
--- a/qemu/target/i386/int_helper.c
+++ b/qemu/target/i386/int_helper.c
@@ -476,6 +476,9 @@ target_ulong HELPER(rdrand)(CPUX86State *env)
{
target_ulong ret;
+ env->cc_src = 0;
+ return 0;
+
if (qemu_guest_getrandom(&ret, sizeof(ret)) < 0) {
// qemu_log_mask(LOG_UNIMP, "rdrand: Crypto failure: %s",
// error_get_pretty(err));
diff --git a/qemu/target/i386/unicorn.c b/qemu/target/i386/unicorn.c
index f10b70e2..ca755edf 100644
--- a/qemu/target/i386/unicorn.c
+++ b/qemu/target/i386/unicorn.c
@@ -73,7 +73,7 @@ void x86_reg_reset(struct uc_struct *uc)
CPUID_FXSR | CPUID_SSE | CPUID_CLFLUSH;
env->features[FEAT_1_ECX] = CPUID_EXT_SSSE3 | CPUID_EXT_SSE41 |
CPUID_EXT_SSE42 | CPUID_EXT_AES |
- CPUID_EXT_CX16;
+ CPUID_EXT_CX16 | CPUID_EXT_RDRAND;
env->features[FEAT_8000_0001_EDX] = CPUID_EXT2_3DNOW | CPUID_EXT2_RDTSCP;
env->features[FEAT_8000_0001_ECX] = CPUID_EXT3_LAHF_LM | CPUID_EXT3_ABM |
CPUID_EXT3_SKINIT | CPUID_EXT3_CR8LEG;
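A note on the second hunk: RDRAND support is reported in CPUID leaf 1, ECX bit 30, so with that flag set a guest that probes for the feature will see it. A minimal sketch of such a probe, given the ECX value from leaf 1; the macro and function names below are my own for illustration, not QEMU's or the driver's:

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX feature bits (names are illustrative). */
#define LEAF1_ECX_CX16   (1U << 13)
#define LEAF1_ECX_RDRAND (1U << 30)

/* Returns true when the leaf 1 ECX value advertises RDRAND. */
bool cpuid_ecx_has_rdrand(uint32_t leaf1_ecx)
{
    return (leaf1_ecx & LEAF1_ECX_RDRAND) != 0;
}
```

Whether this particular driver checks CPUID before executing rdrand is not something I have confirmed.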
Well, after more poking around the driver I still can't tell what it meant to accomplish with its rdrand usage, but I'm no Ghidra/efiseek whiz.
I'll check in a "fix" of sorts that should help, by implementing an rdrand insn that always fails.
Another mechanism could be to simply ignore reads/writes to the bottom 64k, going on the theory that this isn't the first or the last bit of code that accidentally scribbles something around address 0. Something like that could be opt-in, but enabled by default for running x86 code.
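The policy part of that idea could look roughly like the sketch below. This is only the decision logic, under my own assumptions: the function name, the `ignore_low_64k` opt-in flag, and the rule that the whole access must fit inside the low region are all hypothetical, and nothing here is wired into MUA or Unicorn.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOW_REGION_SIZE 0x10000ULL  /* bottom 64k */

/* Hypothetical policy check: returns true when the access
 * [addr, addr + size) lies entirely within the bottom 64k and the
 * opt-in flag is set, meaning the emulator could drop a write or
 * satisfy a read with zeros instead of faulting. */
bool should_ignore_access(uint64_t addr, uint64_t size, bool ignore_low_64k)
{
    if (!ignore_low_64k || size == 0) {
        return false;
    }
    if (addr >= LOW_REGION_SIZE) {
        return false;
    }
    /* Written this way to avoid overflow in addr + size. */
    return size <= LOW_REGION_SIZE - addr;
}
```

Requiring the entire access to fit keeps an access that straddles the 64k boundary from being silently swallowed.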
Thanks! I've verified it works now.
When loading the PCIe OptROM for a Micron 7300 MAX U.2 drive on AArch64 (an ADLINK Ampere Altra Dev Kit) I get an emulation failure, as shown in the attached screenshot. I'm using MultiArchUefiPkg commit 7972cdf844b4a4c22bb1c4f4b8d13e427bc9a2e0 from Feb 6th, 2024.
I extracted the optrom image using PciRom and attached it. The information PciRom shows about it is:
00-UEFI.rom.gz