JonathanSalwan / Triton

Triton is a dynamic binary analysis library. Build your own program analysis tools, automate your reverse engineering, perform software verification or just emulate code.
https://triton-library.github.io
Apache License 2.0

Allow the possibility to add padding when simplifying a block #1141

Closed. JonathanSalwan closed this issue 2 years ago

JonathanSalwan commented 2 years ago

When simplifying a block, we may end up with a wrong jmp at the end if the jmp has a relative offset: removing instructions moves the jmp to a new address, so its unchanged offset now points to a different target. As we do not have an assembler inside this project, we cannot craft a new jmp instruction with the correct offset. So we provide a new option called padding, which fills the space of each removed instruction with nop instructions in the simplified block. Thus, the addresses of the remaining instructions are the same in the input and output blocks. Example:

Input block (obfuscated):

>>> self.ctx.disassembly(block, 0x10000)
>>> print(block)
0x10000: mov rsp, rbp
0x10003: pop r9
0x10005: pop r13
0x10007: pop r14
0x10009: test r8w, r13w
0x1000d: bt r10d, esp
0x10011: pop rbx
0x10012: cbw
0x10014: shl r11b, cl
0x10017: pop r12
0x10019: pop r10
0x1001b: xor r8w, 0xbe51
0x10021: stc
0x10022: pop rax
0x10023: bts r11w, si
0x10028: pop rdi
0x10029: neg r15b
0x1002c: bsf r8w, r12w
0x10031: pop rdx
0x10032: btc bp, 0xc
0x10037: or r11d, r10d
0x1003a: pop r11
0x1003c: add esi, 0xff7f0f83
0x10042: sbb rbp, 0x7de7286d
0x10049: pop rcx
0x1004a: pop rbp
0x1004b: cmovbe r8w, r13w
0x10050: dec r8w
0x10054: movsx r15w, r13b
0x10059: pop r8
0x1005b: shr sil, cl
0x1005e: setnp sil
0x10062: pop r15
0x10064: movsx rsi, r14w
0x10068: clc
0x10069: popfq
0x1006a: seto sil
0x1006e: movzx rsi, r9w
0x10072: pop rsi
0x10073: ret

Output block (deobfuscated):

>>> sblock = self.ctx.simplify(block, padding=True)
>>> self.ctx.disassembly(sblock, 0x10000)
>>> print(sblock)
0x10000: mov rsp, rbp
0x10003: pop r9
0x10005: pop r13
0x10007: pop r14
0x10009: nop
0x1000a: nop
0x1000b: nop
0x1000c: nop
0x1000d: nop
0x1000e: nop
0x1000f: nop
0x10010: nop
0x10011: pop rbx
0x10012: nop
0x10013: nop
0x10014: nop
0x10015: nop
0x10016: nop
0x10017: pop r12
0x10019: pop r10
0x1001b: nop
0x1001c: nop
0x1001d: nop
0x1001e: nop
0x1001f: nop
0x10020: nop
0x10021: nop
0x10022: pop rax
0x10023: nop
0x10024: nop
0x10025: nop
0x10026: nop
0x10027: nop
0x10028: pop rdi
0x10029: nop
0x1002a: nop
0x1002b: nop
0x1002c: nop
0x1002d: nop
0x1002e: nop
0x1002f: nop
0x10030: nop
0x10031: pop rdx
0x10032: nop
0x10033: nop
0x10034: nop
0x10035: nop
0x10036: nop
0x10037: nop
0x10038: nop
0x10039: nop
0x1003a: pop r11
0x1003c: nop
0x1003d: nop
0x1003e: nop
0x1003f: nop
0x10040: nop
0x10041: nop
0x10042: nop
0x10043: nop
0x10044: nop
0x10045: nop
0x10046: nop
0x10047: nop
0x10048: nop
0x10049: pop rcx
0x1004a: pop rbp
0x1004b: nop
0x1004c: nop
0x1004d: nop
0x1004e: nop
0x1004f: nop
0x10050: nop
0x10051: nop
0x10052: nop
0x10053: nop
0x10054: nop
0x10055: nop
0x10056: nop
0x10057: nop
0x10058: nop
0x10059: pop r8
0x1005b: nop
0x1005c: nop
0x1005d: nop
0x1005e: nop
0x1005f: nop
0x10060: nop
0x10061: nop
0x10062: pop r15
0x10064: nop
0x10065: nop
0x10066: nop
0x10067: nop
0x10068: nop
0x10069: popfq
0x1006a: nop
0x1006b: nop
0x1006c: nop
0x1006d: nop
0x1006e: nop
0x1006f: nop
0x10070: nop
0x10071: nop
0x10072: pop rsi
0x10073: ret
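
The effect shown in the output block can be sketched in plain Python, independently of Triton. This is only an illustration of the idea, not Triton's implementation: the `pad_block` helper, the `Instruction` tuple layout, and the hand-written `kept` set are assumptions made for the example. It uses the first seven instructions of the input block, with sizes taken from the address differences in the listing.

```python
from typing import List, Set, Tuple

# (address, size in bytes, disassembly text); hypothetical layout for this sketch
Instruction = Tuple[int, int, str]


def pad_block(original: List[Instruction], kept: Set[int]) -> List[Instruction]:
    """Keep the instructions whose address is in `kept` and replace every
    other instruction with as many one-byte nops as it occupied."""
    padded: List[Instruction] = []
    for address, size, text in original:
        if address in kept:
            padded.append((address, size, text))
        else:
            # One nop per byte, so following instructions keep their address.
            padded.extend((address + i, 1, "nop") for i in range(size))
    return padded


# First instructions of the obfuscated input block shown above.
block = [
    (0x10000, 3, "mov rsp, rbp"),
    (0x10003, 2, "pop r9"),
    (0x10005, 2, "pop r13"),
    (0x10007, 2, "pop r14"),
    (0x10009, 4, "test r8w, r13w"),
    (0x1000D, 4, "bt r10d, esp"),
    (0x10011, 1, "pop rbx"),
]

# Addresses that survive simplification (chosen by hand for this sketch).
kept = {0x10000, 0x10003, 0x10005, 0x10007, 0x10011}

padded = pad_block(block, kept)
for address, _, text in padded:
    print(f"{address:#x}: {text}")
```

The two removed instructions occupy 8 bytes in total, so they become 8 nops from 0x10009 to 0x10010, and pop rbx stays at 0x10011, as in the output block above.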

Note: With this padding option, we can easily patch a binary without breaking instruction offsets.
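
To see why instruction addresses matter for relative jumps, here is a small sketch that decodes the target of an x86-64 `jmp rel32` (opcode 0xE9). The `rel32_jmp_target` helper and the sample bytes are assumptions made for illustration and are not part of Triton. The target is computed from the address of the jmp itself, so the same bytes placed at a different address jump somewhere else.

```python
import struct


def rel32_jmp_target(address: int, encoding: bytes) -> int:
    """Return the destination of a 5-byte `jmp rel32` located at `address`."""
    if len(encoding) != 5 or encoding[0] != 0xE9:
        raise ValueError("expected a 5-byte jmp rel32 encoding")
    # rel32 is a signed little-endian displacement from the next instruction.
    (rel,) = struct.unpack("<i", encoding[1:])
    return address + len(encoding) + rel


jmp = bytes.fromhex("e9fb0f0000")  # jmp with rel32 = 0xffb

# Same bytes, two different locations: the destination moves with the jmp.
original_target = rel32_jmp_target(0x10000, jmp)
shifted_target = rel32_jmp_target(0x0FFF0, jmp)

print(hex(original_target))  # 0x11000
print(hex(shifted_target))   # 0x10ff0
```

If simplification removed 0x10 bytes before this jmp without padding, the jmp would land 0x10 bytes short of its intended target. Padding avoids re-encoding the displacement because the jmp never moves.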