JonathanSalwan / Triton

Triton is a dynamic binary analysis library. Build your own program analysis tools, automate your reverse engineering, perform software verification or just emulate code.
https://triton-library.github.io
Apache License 2.0
3.4k stars, 524 forks

ARM64 invalid load of address using LDR instruction #1265

Closed by EljakimHerrewijnen 10 months ago

EljakimHerrewijnen commented 11 months ago

The LDR instruction behaves differently in Triton than in Unicorn. When loading a value from a memory address using the LDR instruction, Triton loads the address given in the operand instead of the data stored at that address.

Triton implements the following LDR instructions in the source code:

/* LDR <Xt>, [<Xn|SP>], #<simm> */
/* LDR <Xt>, [<Xn|SP>, #<simm>]! */

I think we are missing the load of a value at an address:

/* LDR <Xt>, 0xffffff0250 */
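
For reference, a minimal sketch of what this form (LDR literal, 64-bit variant) is expected to do according to the A64 encoding: the word carries a signed 19-bit offset in bits 23:5, the target is PC + offset * 4, and the register receives the 8 bytes stored at that target. The helper names `decode_ldr_literal_x` and `load_u64` are hypothetical and not part of Triton; the sample word 0x580000E0 is computed by hand for an `ldr x0` placed at 0xffff0008 that points at 0xffff0024, and is an assumption about what the assembler emits.

```python
import struct

def decode_ldr_literal_x(opcode: int, pc: int):
    """Decode a 64-bit LDR (literal) word; return (rt, target) or None."""
    # Bits 31:24 are 0b01011000 for the 64-bit general-purpose variant.
    if opcode & 0xFF000000 != 0x58000000:
        return None
    rt = opcode & 0x1F                   # destination register number
    imm19 = (opcode >> 5) & 0x7FFFF      # word offset, two's complement
    if imm19 & 0x40000:                  # sign-extend from 19 bits
        imm19 -= 0x80000
    target = (pc + imm19 * 4) & 0xFFFFFFFFFFFFFFFF
    return rt, target

def load_u64(image: bytes, image_base: int, address: int) -> int:
    """Read a little-endian 64-bit value from a flat memory image."""
    return struct.unpack_from("<Q", image, address - image_base)[0]

if __name__ == "__main__":
    base = 0xFFFF0000
    # 0x24 bytes standing in for the code, followed by the literal pool entry.
    image = bytes(0x24) + struct.pack("<Q", 0xFFFF0014)
    rt, target = decode_ldr_literal_x(0x580000E0, base + 8)
    value = load_u64(image, base, target)
    print(f"x{rt} <- [{hex(target)}] = {hex(value)}")
```

The point of the sketch is the last step: the register should end up holding the value read from the target address, not the target address itself.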

I created example code that runs in Unicorn but crashes in Triton. Is this expected behaviour?

from triton import *
from unicorn import *
from unicorn.arm64_const import *
from keystone import *
import struct

ks = Ks(KS_ARCH_ARM64, KS_MODE_LITTLE_ENDIAN)
base = 0xffff0000

'''
This shellcode should load the address stored at 0xffff0024 into x0 and branch to it.
However, Triton does not load the value stored at that memory location;
it loads the operand address (0xffff0024) itself.
Triton then branches to the data word instead of the instruction it points to, and crashes.
'''
shellcode = f"""
        mov        x2,xzr \n
        mov        x3,xzr \n
        ldr        x0, 0xffff0024 \n
        br         x0 \n
        ret \n
        mov        x2, 0x77 \n
        mov        x3 ,0x77 \n
        mov        x4, 0x77 \n
        ret \n
"""

shellcode_bin = ks.asm(shellcode, base, as_bytes=True)[0]
shellcode_bin += struct.pack("<Q", 0xffff0014)
end_address = base + len(shellcode_bin)

def emulate_uc():
    uc = Uc(UC_ARCH_ARM64, UC_MODE_LITTLE_ENDIAN)
    uc.mem_map(0xffff0000, 0x1000, UC_PROT_ALL)
    uc.mem_write(0xffff0000, shellcode_bin)
    uc.emu_start(0xffff0000, 0)

    assert uc.reg_read(UC_ARM64_REG_X2) == 0x77
    print(f"ok, pc={hex(uc.reg_read(UC_ARM64_REG_PC))} x2={uc.reg_read(UC_ARM64_REG_X2)}")

def emulate_triton():
    ctx = TritonContext(ARCH.AARCH64)
    ctx.setConcreteMemoryAreaValue(base, shellcode_bin)
    ctx.setConcreteRegisterValue(ctx.registers.pc, base)

    while ctx.getConcreteRegisterValue(ctx.registers.pc) != end_address:
        print(f"at: {hex(ctx.getConcreteRegisterValue(ctx.registers.pc))}")

        instruction = ctx.getConcreteMemoryAreaValue(ctx.getConcreteRegisterValue(ctx.registers.pc), 4)
        inst = Instruction()
        inst.setOpcode(instruction)
        ctx.processing(inst)

        ctx.concretizeRegister(ctx.registers.pc)

    assert ctx.getConcreteRegisterValue(ctx.registers.x2) == 0x77

if __name__ == "__main__":
    emulate_uc()
    emulate_triton()

Or am I missing something obvious? :)

JonathanSalwan commented 11 months ago

Can you give me the opcode of the ldr x0, 0xffff0024 instruction?

JonathanSalwan commented 11 months ago

nvm, I got it. I think you're right, we have not implemented LDR (literal).

EljakimHerrewijnen commented 11 months ago

Opcode should be: 0x58000000.
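
As a cross-check on that opcode, here is a small sketch of how the 64-bit LDR (literal) word is assembled from the destination register and a PC-relative target. `encode_ldr_literal_x` is a hypothetical helper, not part of Triton or Keystone. Under the A64 encoding, 0x58000000 is the base pattern with a zero offset and Rt = x0. For the repro above, assuming the `ldr` is the third instruction (at 0xffff0008), the full word would also carry the offset to 0xffff0024 in its imm19 field.

```python
def encode_ldr_literal_x(rt: int, pc: int, target: int) -> int:
    """Build the 64-bit LDR (literal) word for `ldr x<rt>, target` at `pc`."""
    if not 0 <= rt <= 31:
        raise ValueError("rt must be a register number in 0..31")
    offset = target - pc
    if offset % 4 != 0:
        raise ValueError("target must be word-aligned relative to pc")
    imm19 = offset // 4
    if not -(1 << 18) <= imm19 < (1 << 18):
        raise ValueError("target is outside the +/-1 MiB range of LDR (literal)")
    # Base pattern 0x58000000, imm19 in bits 23:5, Rt in bits 4:0.
    return 0x58000000 | ((imm19 & 0x7FFFF) << 5) | rt

if __name__ == "__main__":
    word = encode_ldr_literal_x(rt=0, pc=0xFFFF0008, target=0xFFFF0024)
    print(hex(word))                          # 0x580000e0
    print(word.to_bytes(4, "little").hex())   # e0000058
```

With a zero offset and x0 the function returns exactly 0x58000000, which matches the base opcode quoted above.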

Yes, I think so. I tried to implement it but got stuck.