unicorn-engine / unicorn

Unicorn CPU emulator framework (ARM, AArch64, M68K, Mips, Sparc, PowerPC, RiscV, S390x, TriCore, X86)
http://www.unicorn-engine.org
GNU General Public License v2.0

Ignoring Branches / Handling Invalid Memory Accesses Gracefully #1949

Open pinwhell opened 7 months ago

pinwhell commented 7 months ago

Say, for arm64, how can I handle branches to addresses outside of a mapped region? Assuming all branches set the link register, how can I gracefully return to the caller without halting the entire emulation with an exception? Also, how can I ignore all invalid memory access attempts and return 0 instead? Can Unicorn let me do this in Python?

wtdcode commented 6 months ago

I fail to get your point. Could you show a few code snippets to illustrate your use case better?

pinwhell commented 6 months ago

from unicorn import *
from unicorn.arm_const import *

def hook_mem_invalid(uc, access, address, size, value, user_data):
    # Allocate a new page of memory at the accessed address

    if access == UC_MEM_READ_UNMAPPED:
        print("Remapping 0x{:X}".format(address & 0xFFFFF000))
        uc.mem_map(address & 0xFFFFF000, 0x1000)
        uc.mem_write(address & 0xFFFFF000, bytes.fromhex("70 00 20 E1")) # BKPT #0

    return True

def hook_intr(uc, intno, user_data):
    # intno is the interrupt number; named so it does not shadow the builtin int
    print("Interrupt {} Detected".format(intno))
    print("Assuming Remap BKPT Source")
    print("Jmping Back to LR")

    uc.reg_write(UC_ARM_REG_PC, uc.reg_read(UC_ARM_REG_LR))

    return True

def emulate_code():
    # Define the ARM code to be emulated
    ARM_CODE =  bytes.fromhex("01 06 A0 E3")  # MOV r0, 0x100000
    ARM_CODE += bytes.fromhex("00 10 90 E5")  # LDR r1, [r0] ; Load contents of memory at address stored in r0 into r1
    ARM_CODE += bytes.fromhex("30 FF 2F E1")  # BLX r0
    ARM_CODE += bytes.fromhex("02 06 A0 E3")  # MOV r0, 0x200000

    # Initialize Unicorn emulator
    uc = Uc(UC_ARCH_ARM, UC_MODE_ARM)

    # Memory region for code
    ADDRESS = 0x1000
    uc.mem_map(ADDRESS, 0x1000)
    uc.mem_write(ADDRESS, ARM_CODE)

    # Set up hook for invalid memory access
    uc.hook_add(UC_HOOK_MEM_INVALID, hook_mem_invalid)

    # Set up hook for interrupts
    uc.hook_add(UC_HOOK_INTR, hook_intr)

    # Emulate code
    try:
        # Emulate code starting at address 0x1000
        uc.emu_start(ADDRESS, ADDRESS + len(ARM_CODE))

    except UcError as e:
        print("Error:", e)

    # Log register state after emulation
    print("Register state after emulation:")
    for index, reg in enumerate(range(UC_ARM_REG_R0, UC_ARM_REG_R12 + 1)):
        print("  R{}: 0x{:x}".format(index, uc.reg_read(reg)))
    print("  LR: 0x{:x}".format(uc.reg_read(UC_ARM_REG_LR)))
    print("  PC: 0x{:x}".format(uc.reg_read(UC_ARM_REG_PC)))

if __name__ == "__main__":
    emulate_code()

Take this as an example. It does what I initially wanted and gives this output:

Remapping 0x100000
Interrupt 7 Detected
Assuming Remap BKPT Source
Jmping Back to LR
Register state after emulation:
  R0: 0x200000
  R1: 0xe1200070
...
  LR: 0x100c
  PC: 0x1010

Process finished with exit code 0
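The register values in this output can be cross-checked by hand. The following is a small sketch that uses only the bytes and addresses from the listing above: R1 should be the BKPT encoding read back as a little-endian word, and LR should be the address of the instruction after BLX. The variable names are illustrative and not part of Unicorn.

```python
# Cross-check of the output above, using only values from the listing.
BKPT = bytes.fromhex("70 00 20 E1")   # bytes the hook writes into the new page
ADDRESS = 0x1000                      # load address of ARM_CODE
INSN_SIZE = 4                         # every ARM-mode instruction is 4 bytes

# LDR r1, [r0] reads the freshly written BKPT bytes as a little-endian word.
r1 = int.from_bytes(BKPT, "little")

# Instruction layout: MOV at +0, LDR at +4, BLX at +8, MOV at +12.
blx_addr = ADDRESS + 2 * INSN_SIZE
lr = blx_addr + INSN_SIZE             # BLX stores the address of the next instruction
final_pc = ADDRESS + 4 * INSN_SIZE    # emulation stops at the end of ARM_CODE

print(hex(r1), hex(lr), hex(final_pc))  # 0xe1200070 0x100c 0x1010
```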

It successfully reached

MOV r0, 0x200000

ignoring the branch.

It is currently constrained to ARM. Is there any way to make it more general?
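One small step toward generality, independent of the branch question: the mask `0xFFFFF000` used in the hook only preserves the low 32 bits, so it would truncate addresses on a 64-bit target such as arm64. A minimal sketch of a width-independent alternative follows; `page_base` is a hypothetical helper name, not a Unicorn function, and it assumes a power-of-two page size.

```python
def page_base(address: int, page_size: int = 0x1000) -> int:
    """Round address down to the start of its page (page_size must be a power of two)."""
    return address & ~(page_size - 1)

# Same result as the original mask for a 32-bit address...
print(hex(page_base(0x100004)))                           # 0x100000

# ...but the high bits survive for a 64-bit address, unlike `& 0xFFFFF000`.
addr64 = 0x7FFF12345678
print(hex(page_base(addr64)), hex(addr64 & 0xFFFFF000))   # 0x7fff12345000 0x12345000
```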

pinwhell commented 6 months ago

Without triggering the read fault with LDR first:

    # Define the ARM code to be emulated
    ARM_CODE =  bytes.fromhex("01 06 A0 E3")  # MOV r0, 0x100000
    # ARM_CODE += bytes.fromhex("00 10 90 E5")  # LDR r1, [r0] ; Load contents of memory at address stored in r0 into r1
    ARM_CODE += bytes.fromhex("30 FF 2F E1")  # BLX r0
    ARM_CODE += bytes.fromhex("02 06 A0 E3")  # MOV r0, 0x200000

and with the hook handling all fault types:

def hook_mem_invalid(uc, access, address, size, value, user_data):
    # Allocate a new page of memory at the accessed address

    print("Remapping 0x{:X}".format(address & 0xFFFFF000))
    uc.mem_map(address & 0xFFFFF000, 0x1000)
    uc.mem_write(address & 0xFFFFF000, bytes.fromhex("70 00 20 E1"))  # BKPT #0

    return True

it also works:

  R0: 0x200000
...
  LR: 0x1008
  PC: 0x100c

Process finished with exit code 0
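For the other half of the original question (ignoring invalid data accesses and reading 0), a separate hook for data faults could back the faulting page with zeros instead of a BKPT. This is only a sketch: `hook_data_unmapped` is a hypothetical name, the handler assumes the access does not straddle a page boundary, and later writes to the page will persist, so reads return 0 only until something is stored there.

```python
PAGE_SIZE = 0x1000

def hook_data_unmapped(uc, access, address, size, value, user_data):
    """Back an unmapped data access with a zero-filled page so reads yield 0."""
    base = address & ~(PAGE_SIZE - 1)     # width-independent page alignment
    uc.mem_map(base, PAGE_SIZE)
    uc.mem_write(base, bytes(PAGE_SIZE))  # explicit zero fill
    return True                           # report the fault as handled

# Registration with a real emulator would look like this (requires unicorn):
#   from unicorn import UC_HOOK_MEM_READ_UNMAPPED, UC_HOOK_MEM_WRITE_UNMAPPED
#   uc.hook_add(UC_HOOK_MEM_READ_UNMAPPED | UC_HOOK_MEM_WRITE_UNMAPPED,
#               hook_data_unmapped)
```

Registering this only for read and write faults keeps it apart from the fetch fault that the BLX triggers, which can then be handled on its own.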

pinwhell commented 6 months ago

> I fail to get your point. Could you show a few code snippets to illustrate your use case better?

So overall, the initial question, in other words: is there any way to generalize this behavior and make it platform-agnostic? Is there any way to simply hook branches and cancel them all, without relying on hooking instructions and checking for specific instructions ...
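One possible direction, sketched below under several assumptions, is to drop the architecture-specific BKPT bytes and the interrupt hook, and instead pass the PC and link-register IDs in as parameters. The fetch-fault hook maps the target page and remembers it; a code hook then redirects PC to LR whenever execution lands in a remembered page. `BranchSkipper` and its methods are hypothetical names, not Unicorn API. The sketch assumes that a PC write made from a code hook is honored before the instruction at that address runs, that the target has a link register (so it does not cover x86, where the return address is on the stack), and it ignores Thumb interworking.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 0x1000

@dataclass
class BranchSkipper:
    """Turn calls into unmapped memory into immediate returns (PC = LR)."""
    pc_reg: int   # e.g. UC_ARM_REG_PC or UC_ARM64_REG_PC
    lr_reg: int   # e.g. UC_ARM_REG_LR or UC_ARM64_REG_LR
    stub_pages: set = field(default_factory=set)

    def on_fetch_unmapped(self, uc, access, address, size, value, user_data):
        # Map the page so the fetch can proceed, and remember it as a stub.
        base = address & ~(PAGE_SIZE - 1)
        uc.mem_map(base, PAGE_SIZE)
        self.stub_pages.add(base)
        return True

    def on_code(self, uc, address, size, user_data):
        # Executing inside a stub page means a branch left the real code.
        if (address & ~(PAGE_SIZE - 1)) in self.stub_pages:
            uc.reg_write(self.pc_reg, uc.reg_read(self.lr_reg))

    def install(self, uc):
        # Imported lazily so the class itself has no hard dependency on unicorn.
        from unicorn import UC_HOOK_CODE, UC_HOOK_MEM_FETCH_UNMAPPED
        uc.hook_add(UC_HOOK_MEM_FETCH_UNMAPPED, self.on_fetch_unmapped)
        uc.hook_add(UC_HOOK_CODE, self.on_code)
```

With the ARM example above, usage would be along the lines of `BranchSkipper(UC_ARM_REG_PC, UC_ARM_REG_LR).install(uc)` before `emu_start`. A code hook over the whole address space slows emulation, so this trades speed for not having to encode a trap instruction per architecture.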