JonathanSalwan / Triton

Triton is a dynamic binary analysis library. Build your own program analysis tools, automate your reverse engineering, perform software verification or just emulate code.
https://triton-library.github.io
Apache License 2.0

liftToLLVM and instruction.getDisassembly do not work for arm64 #1249

Closed FANGG3 closed 1 year ago

FANGG3 commented 1 year ago

File arch: aarch64, LittleEndian. My code:

unsigned int target_function(unsigned int n)
__attribute((__annotate__(("fla"))))
__attribute((noinline))
{
  unsigned int mod = n % 4;
  unsigned int result = 0;

  if (mod == 0) result = (n | 0xBAAAD0BF) * (2 ^ n);

  else if (mod == 1) result = (n & 0xBAAAD0BF) * (3 + n);

  else if (mod == 2) result = (n ^ 0xBAAAD0BF) * (4 | n);

  else result = (n + 0xBAAAD0BF) * (5 & n);

  return result;
}

import binascii

import lief
from binaryninja import *
import binaryninja as bn
from triton import *
def lifting2llvm(ctx):
    predicate = ctx.getPathPredicate()
    print(predicate)
    M = ctx.liftToLLVM(predicate, fname="mars_analytica", optimize=False)
    print('[+] Going further than just solving the challenge')
    print('[+] Lifting the path predicate to LLVM')
    print()
    print(M)
    return

ctx = TritonContext()
ctx.setArchitecture(ARCH.AARCH64)

ctx.setMode(MODE.ALIGNED_MEMORY, True)
ctx.setMode(MODE.CONSTANT_FOLDING, True)
ctx.setMode(MODE.ONLY_ON_SYMBOLIZED, True)
ctx.setMode(MODE.AST_OPTIMIZATIONS, True)
binary = lief.parse(bv.file.filename)
#ctx.concretizeAllMemory()
#ctx.concretizeAllRegister()
phdrs = bv.segments
for seg in phdrs:
    size = seg.data_length
    vaddr = seg.start
    content = bv.read(vaddr, size)
    print('[+] Loading 0x%06x - 0x%06x' % (vaddr, vaddr + size))
    ctx.setConcreteMemoryAreaValue(vaddr, list(content))

BASE_PLT = 0x10000000
BASE_ARGV = 0x20000000
BASE_STACK = 0x9fffffff
pc = 0x698
ctx.setConcreteRegisterValue(ctx.registers.sp, BASE_STACK)
ctx.setConcreteRegisterValue(ctx.registers.pc, 0x698)
ctx.setConcreteRegisterValue(ctx.registers.x0,3)

print('[+] Starting emulation.')
while pc:
    opcodes = ctx.getConcreteMemoryAreaValue(pc, 4)
    instruction = Instruction()
    instruction.setOpcode(opcodes)
    instruction.setAddress(pc)

    print("%08x %s " % (pc, bv.get_disassembly(pc)))
    ctx.processing(instruction)
    print(ctx.getConcreteRegisterValue(ctx.registers.x0))
    pc = ctx.getConcreteRegisterValue(ctx.registers.pc)
print('[+] Emulation done.')

lifting2llvm(ctx)

I got this:

0x818: <not disassembled>

; ModuleID = 'tritonModule'
source_filename = "tritonModule"

define i1 @mars_analytica() {
entry:
  ret i1 true
}

JonathanSalwan commented 1 year ago

Can you provide the opcode that can not be disassembled?

JonathanSalwan commented 1 year ago

Your problem is that it's not AArch64 but ARM that you are analyzing. The opcode b'\x01\x00P\xe1' is not valid for AArch64.

$ cstool arm64 010050e1
ERROR: invalid assembly code

$ cstool arm 010050e1
 0  01 00 50 e1  cmp    r0, r1

FANGG3 commented 1 year ago

> Can you provide the opcode that can not be disassembled?

Sorry, that was another sample. Here are the right opcodes:

correct: 00000698 stp     x20, x19, [sp, #-0x10]! 
triton:  0x698: <not disassembled>
b'\xf4O\xbf\xa9'
correct: 0000069c sub     sp, sp, #0x10 
triton:  0x69c: <not disassembled>
b'\xffC\x00\xd1'
correct: 000006a0 and     w18, w0, #0x3 
triton:  0x6a0: <not disassembled>
b'\x12\x04\x00\x12'
correct: 000006a4 mov     w1, #0xbaaa0000 
triton:  0x6a4: <not disassembled>
b'AU\xb7R'
correct: 000006a8 movk    w1, #0xd0bf 
triton:  0x6a8: <not disassembled>
b'\xe1\x17\x9ar'
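
A side note on reading the opcodes above: Python's bytes repr shows printable bytes as ASCII characters, so b'\xf4O\xbf\xa9' is the four bytes f4 4f bf a9 ('O' is 0x4f). A minimal sketch for viewing them as hex and as the little-endian 32-bit instruction word; `describe_opcode` is a hypothetical helper name, not part of Triton:

```python
def describe_opcode(opcode: bytes):
    """Return (space-separated hex bytes, little-endian 32-bit word as hex)."""
    spaced = opcode.hex(" ")                 # bytes.hex(sep) needs Python 3.8+
    word = int.from_bytes(opcode, "little")  # AArch64 instructions are little-endian words
    return spaced, f"0x{word:08x}"


# The five opcodes listed in this comment.
for op in (b'\xf4O\xbf\xa9', b'\xffC\x00\xd1', b'\x12\x04\x00\x12',
           b'AU\xb7R', b'\xe1\x17\x9ar'):
    print(*describe_opcode(op))
```

Comparing the hex form against a disassembler's input makes it easier to confirm that the exact same bytes are being tested on both sides.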

:< I don't know if it's a problem with my version:

triton.VERSION.BUILD 1589
triton.VERSION.MAJOR 1
triton.VERSION.MINOR 0
triton.VERSION.Z3_INTERFACE True
triton.VERSION.BITWUZLA_INTERFACE True
triton.VERSION.LLVM_INTERFACE True

JonathanSalwan commented 1 year ago

They are classical arm64 instructions. I've verified and it's working on v1.0:

>>> ctx = TritonContext(ARCH.AARCH64)
>>> inst = Instruction(b'\xf4\x30\xbf\xa9')
>>> ctx.processing(inst)
0
>>> print(inst)
0x0: stp x20, x12, [x7, #-0x10]!

Probably something wrong with your snippet =/

FANGG3 commented 1 year ago

It was a small mistake. I checked my code and it should be:

    instruction = Instruction()
    instruction.setAddress(pc)
    instruction.setOpcode(opcodes)
    # print(instruction) #not here
    ctx.processing(instruction)
    print(instruction)

But liftToLLVM still does not work. XD

    predicate = ctx.getPathPredicate()
    M = ctx.liftToLLVM(predicate, fname="mars_analytica", optimize=True)

JonathanSalwan commented 1 year ago

Can you show me what liftToLLVM returns? Because you have to symbolize something if you want to craft an LLVM expression (probably n in your case).

FANGG3 commented 1 year ago

It returns (= (_ bv1 1) (_ bv1 1)). In the end, this was my fault: I should have called ctx.symbolizeRegister(ctx.registers.x0, "n").

Thanks for your help ^_^