JonathanSalwan / Triton

Triton is a dynamic binary analysis library. Build your own program analysis tools, automate your reverse engineering, perform software verification or just emulate code.
https://triton-library.github.io
Apache License 2.0

Conditional CMOVcc should not invoke operand callbacks if the predicate is false #1166

Open hexpell opened 2 years ago

hexpell commented 2 years ago

As illustrated below, cmovne eax,DWORD PTR ds:0x100 is skipped because the condition is false, yet Triton still calls the operand callbacks.

from triton import *

ctx = TritonContext()
ctx.setArchitecture(ARCH.X86)
ctx.setMode(MODE.ALIGNED_MEMORY, True)

function = {
    0x0: b"\x84\xc9",   #  test   cl,cl
    0x2: b"\x0f\x45\x05\x00\x01\x00\x00",   #  cmovne eax,DWORD PTR ds:0x100
}

def mem_read_cb(ctx, mem):
    assert False, 'should not invoke memory read callback if conditional mov predicate is false'

ctx.setConcreteMemoryValue(MemoryAccess(0x100, 4), 0x1234)
ctx.addCallback(CALLBACK.GET_CONCRETE_MEMORY_VALUE, mem_read_cb)

for pc in function:
    inst = Instruction(pc, function[pc])
    ctx.processing(inst)

assert ctx.getConcreteRegisterValue(ctx.registers.eax) == 0

JonathanSalwan commented 2 years ago

Thanks for all the feedback <3, I will handle it when I get back home (~2 or 3 weeks).

JonathanSalwan commented 2 years ago

> yet Triton still calls the operand callbacks.

From a concrete point of view I agree. But if we want to provide a correct symbolic expression, we need to know which value will be read according to the flag (0x1234 if ZF is clear, else the current value of eax, here 0x0).

#
# Output:
# 0x0: cmovne eax, dword ptr [0x100]
# ref_0 = (((((0x0) << 8 | 0x0) << 8 | 0x12) << 8 | 0x34) if (0x0 == 0x0) else 0x0) # CMOVNE operation
# ref_1 = 0x7 # Program Counter
# 

from triton import *

ctx = TritonContext()
ctx.setArchitecture(ARCH.X86)

ctx.setConcreteMemoryValue(MemoryAccess(0x100, 4), 0x1234)
ctx.setAstRepresentationMode(AST_REPRESENTATION.PYTHON)

inst = Instruction(b"\x0f\x45\x05\x00\x01\x00\x00") # cmovne eax,DWORD PTR ds:0x100
ctx.processing(inst)
print(inst)
for se in inst.getSymbolicExpressions():
    print(se)
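
To illustrate why both operands have to be read, here is a toy stand-in for the expression above. It does not use Triton's AST classes; Ite and build_cmovne are hypothetical names. The point is only that the node must hold the memory value and the old register value before the condition is evaluated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ite:
    """Toy if-then-else node for cmovne: selects on the ZF value."""
    zf: int
    then_value: int   # value read from memory (source operand)
    else_value: int   # previous value of the destination register

    def evaluate(self) -> int:
        return self.then_value if self.zf == 0 else self.else_value


def build_cmovne(zf: int, mem_value: int, old_dst: int) -> Ite:
    # mem_value must already be known here, so the memory read
    # happens whether or not the move ends up being taken.
    return Ite(zf, mem_value, old_dst)


taken = build_cmovne(zf=0, mem_value=0x1234, old_dst=0x0)
not_taken = build_cmovne(zf=1, mem_value=0x1234, old_dst=0x0)
```

Even in the not_taken case the node still carries 0x1234 as its then-branch, which is why the memory operand is fetched.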

hexpell commented 2 years ago

Yes, I thought about that as well. My proposal is this:

https://github.com/hexpell/Triton/commit/b38a0b1b48afb727150e9b469806ec9712a2c40b?diff=split#diff-d8355422259518dcc03f4b9663c15b8924db50fc470151fac362d2696074b6ddR4208

Before the conditional MOV triggers the memory callback, call inst.setConditionTaken() first, so that in the callback I can do this:

def isConditionalMOV(inst: Instruction):
    return inst.getType() in [
        OPCODE.X86.CMOVA,
        OPCODE.X86.CMOVAE,
        OPCODE.X86.CMOVB,
        OPCODE.X86.CMOVBE,
        OPCODE.X86.FCMOVBE,
        OPCODE.X86.FCMOVB,
        OPCODE.X86.CMOVE,
        OPCODE.X86.FCMOVE,
        OPCODE.X86.CMOVG,
        OPCODE.X86.CMOVGE,
        OPCODE.X86.CMOVL,
        OPCODE.X86.CMOVLE,
        OPCODE.X86.FCMOVNBE,
        OPCODE.X86.FCMOVNB,
        OPCODE.X86.CMOVNE,
        OPCODE.X86.FCMOVNE,
        OPCODE.X86.CMOVNO,
        OPCODE.X86.CMOVNP,
        OPCODE.X86.FCMOVNU,
        OPCODE.X86.CMOVNS,
        OPCODE.X86.CMOVO,
        OPCODE.X86.CMOVP,
        OPCODE.X86.FCMOVU,
        OPCODE.X86.CMOVS,
    ]

def mem_read_cb(ctx, mem):
    if isConditionalMOV(inst) and not inst.isConditionTaken():
        return
    assert False, 'should not invoke memory read callback if conditional mov predicate is false'

The CMOVcc family is probably the only group of instructions that needs to be handled this way; I am not sure whether there are other instructions like it.
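
The ordering I have in mind can be sketched with mock objects so that it runs without Triton. FakeInstruction, process_cmov and reads are hypothetical names made up for this example; only the idea of setting the condition flag before the read callback fires comes from the proposal.

```python
class FakeInstruction:
    """Stand-in for an instruction that carries a condition-taken flag."""

    def __init__(self) -> None:
        self._taken = False

    def setConditionTaken(self, flag: bool) -> None:
        self._taken = flag

    def isConditionTaken(self) -> bool:
        return self._taken


reads = []  # addresses the callback treated as real reads


def mem_read_cb(inst: FakeInstruction, addr: int) -> None:
    if not inst.isConditionTaken():
        return  # operand fetched only to build the expression, ignore it
    reads.append(addr)


def process_cmov(inst: FakeInstruction, predicate: bool, addr: int, callback) -> None:
    inst.setConditionTaken(predicate)  # set the flag first ...
    callback(inst, addr)               # ... then fire the read callback


process_cmov(FakeInstruction(), False, 0x100, mem_read_cb)  # predicate false
process_cmov(FakeInstruction(), True, 0x200, mem_read_cb)   # predicate true
```

The callback is still invoked in both cases, so the symbolic expression can be built, but user code can tell the two cases apart.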