unicorn-engine / unicorn

Unicorn CPU emulator framework (ARM, AArch64, M68K, Mips, Sparc, PowerPC, RiscV, S390x, TriCore, X86)
http://www.unicorn-engine.org
GNU General Public License v2.0

Memory hooks cause incorrect emulation of the carry flag for the SAR instruction on x86_64 #1933

Closed: epuni closed this issue 2 months ago

epuni commented 2 months ago

While emulating obfuscated code, I encountered this snippet (manually reduced to a minimal test case):

mov rsi, 0x32
mov rcx, 0x3d
push rcx
mov rcx, 0xffffffff
sar qword ptr [rsp], cl
adc sil, cl

The result should be RSI = 0x31. However, when a memory hook is installed, the sar instruction incorrectly sets the carry flag, so the following adc produces the wrong value.
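
To make the expected value explicit, here is a minimal pure-Python sketch of the arithmetic. It does not use Unicorn, and the helper names `sar64` and `adc8` are made up for illustration. It assumes the usual x86-64 rules: the shift count of a 64-bit sar is masked to 6 bits, and CF receives the last bit shifted out.

```python
from typing import Optional, Tuple

MASK64 = (1 << 64) - 1


def sar64(value: int, count: int) -> Tuple[int, Optional[int]]:
    """Model SAR on a 64-bit operand (hypothetical helper).

    Returns (result, CF). CF is None when the masked count is 0,
    because the flags are left unchanged in that case.
    """
    count &= 0x3F  # 64-bit operand size: count is masked to 6 bits
    if count == 0:
        return value & MASK64, None
    # Reinterpret the 64-bit pattern as a signed integer
    signed = value - (1 << 64) if value & (1 << 63) else value
    cf = (signed >> (count - 1)) & 1  # last bit shifted out
    return (signed >> count) & MASK64, cf


def adc8(dst: int, src: int, cf: int) -> Tuple[int, int]:
    """Model an 8-bit ADC (hypothetical helper); returns (result, carry out)."""
    total = (dst & 0xFF) + (src & 0xFF) + cf
    return total & 0xFF, total >> 8


# sar qword ptr [rsp], cl with [rsp] = 0x3d and cl = 0xff (masked to 63)
shifted, cf = sar64(0x3D, 0xFF)
print(f"sar result = 0x{shifted:x}, CF = {cf}")  # sar result = 0x0, CF = 0

# adc sil, cl with sil = 0x32, cl = 0xff
print(f"correct CF=0: sil = 0x{adc8(0x32, 0xFF, 0)[0]:x}")  # 0x31
print(f"wrong   CF=1: sil = 0x{adc8(0x32, 0xFF, 1)[0]:x}")  # 0x32
```

With CF clear, 0x32 + 0xff = 0x131, which truncates to 0x31 in sil. With CF wrongly set, the sum is 0x132 and sil ends up as 0x32.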

The issue seems to be related to the alignment of the instructions: moving the bytes around makes it go away. See the example below.

Repro code:

from keystone import *
from capstone import *
from unicorn import *
from unicorn.x86_const import *

CODE_ADDR = 0x140000000
STACK_LIMIT = 0x7f0000000
STACK_BASE  = 0x7f0010000

ks = Ks(KS_ARCH_X86, KS_MODE_64)
code, _ = ks.asm("""
mov rsi, 0x32
mov rcx, 0x3d
push rcx
mov rcx, 0xffffffff
sar qword ptr [rsp], cl
adc sil, cl
""", CODE_ADDR)

uc = Uc(UC_ARCH_X86, UC_MODE_64)
cs = Cs(CS_ARCH_X86, CS_MODE_64)

uc.mem_map(STACK_LIMIT, STACK_BASE - STACK_LIMIT)
uc.reg_write(UC_X86_REG_RSP, STACK_BASE-0x800)

uc.mem_map(CODE_ADDR, 0x1000, UC_PROT_ALL)
uc.mem_write(CODE_ADDR, bytes(code))

def _code_hook(uc: Uc, address, size, obj):
    _,_,op, instr = next(cs.disasm_lite(uc.mem_read(address, size), address))
    print(f"0x{address:x} | {op} {instr}")

uc.hook_add(UC_HOOK_CODE, _code_hook, uc)

def _mem_hook(uc: Uc, access, address, size, value, obj):
    rip = uc.reg_read(UC_X86_REG_RIP)
    print(f"   0x{rip:x} access {access} at 0x{address:x} size 0x{size:x}")

# Comment out this hook to make the bug disappear
uc.hook_add(UC_HOOK_MEM_READ | UC_HOOK_MEM_WRITE, _mem_hook)

uc.emu_start(CODE_ADDR, CODE_ADDR + len(code))

rsi = uc.reg_read(UC_X86_REG_RSI)
if rsi != 0x31:
    print("BUG !")

print(f"RSI = 0x{rsi:x}")

Running this code on my Ubuntu machine with Python 3.10.12 and the latest Unicorn release (2.0.1.post1) prints RSI = 0x32.

Observations:

Is this a known issue, and is there a workaround?

wtdcode commented 2 months ago

Have you tried our dev branch? This should be #1717

epuni commented 2 months ago

Thank you, building from the dev branch fixed the issue.