JonathanSalwan / Triton

Triton is a dynamic binary analysis library. Build your own program analysis tools, automate your reverse engineering, perform software verification or just emulate code.
https://triton-library.github.io
Apache License 2.0
3.4k stars 524 forks source link

Reduce usage of concrete evaluation in x86 semantics #1241

Open Colton1skees opened 1 year ago

Colton1skees commented 1 year ago

Hi @JonathanSalwan,

In x86semantics.cpp, there are 46 instruction handlers where the emitted ASTs are dependent upon a concrete value retrieved through .evaluate():

There are some similar cases(push/pushfq/pop/popfq, fxsave/fxrstor, ret, etc) involving symbolic memory, but IMO they're out of the scope of this issue since Triton does not have STORE/LOAD ast nodes.

One step towards https://github.com/JonathanSalwan/Triton/issues/473 would be to replace .evaluate() usages with code to emit AST nodes which depend on a symbolic value. Any objections to me opening a PR for this?

You can break the usages of .evaluate() into four categories:

  1. String instructions (e.g. scasb) where the concrete value of cx is used to determine whether any nodes should be emitted
  2. Exception raisers(e.g. div_s) where the exception variable is set for cases where x86 would raise an exception
  3. General cases(e.g. pinsrb_s, vextracti128_s) where I'm not sure why a concrete value is used instead of a symbolic value
  4. Conditional undefines of a register(e.g. rol_s where of is set to an undefined value if the src operand is greater than 1)

For these categories:

  1. An ITE node (e.g. 'dst = ite(cx == 0, original_value, new_value)` could be used instead
  2. Exception raising is probably out of the scope of this issue, since the IR has no way of modeling exception raising. I would leave this as is.
  3. A symbolic value could be used
  4. An ITE node(e.g. of = ite(src > 1, undef, of)) could be used
SweetVishnya commented 1 year ago

Won't this break the dynamic symbolic execution approach where we make the constraints based on the path that was actually executed?

SweetVishnya commented 1 year ago

Anyway, if we decide to proceed with the PR, we have local DSE benchmarks to measure symbolic execution accuracy and speed.

Colton1skees commented 1 year ago

Won't this break the dynamic symbolic execution approach where we make the constraints based on the path that was actually executed?

@SweetVishnya Can you give a case example? The changes to handling undefined flags should be fine I think - the only case I can see a potential problem with is the string related instructions

SweetVishnya commented 1 year ago

Won't this break the dynamic symbolic execution approach where we make the constraints based on the path that was actually executed?

@SweetVishnya Can you give a case example? The changes to handling undefined flags should be fine I think - the only case I can see a potential problem with is the string related instructions

Yeah, now Triton decides the number of times to execute REP string instructions.

SweetVishnya commented 1 year ago

Maybe, we should split this PR by instruction groups. I can benchmark symbolic execution accuracy on our side. And we can merge succeeding PRs one by one.

Colton1skees commented 1 year ago

@SweetVishnya Take stosb as an example.

With the change I suggested, this code:

/* Create symbolic expression */
auto expr1 = this->symbolicEngine->createSymbolicExpression(inst, node1, dst, "STOSD operation");
auto expr2 = this->symbolicEngine->createSymbolicExpression(inst, node2, index, "Index operation");

would be transformed into something along the lines of this(in pseudo code)

/* Create symbolic expression */
auto src1 = ITE(cx != 0,  node1, getOperandAst(dst))
auto src2 = ITE(cx != 0,  node2, getOperandAst(index))

auto expr1 = this->symbolicEngine->createSymbolicExpression(inst, src1, dst, "STOSD operation");
auto expr2 = this->symbolicEngine->createSymbolicExpression(inst, src2, index, "Index operation");

Essentially both destinations are only being assigned a new value if the counter(cx) is not set to zero. The logic for computing the next RIP should remain unchanged. With this context, do you still think this may cause problems?

SweetVishnya commented 1 year ago

@Colton1skees, I do not see the problem here. But you never know what break symbolic reasoning in real world. We may try to benchmark all string instructions in a separate PR. If everything is OK, we can merge it. More info about our benchmarking can be found in Evaluation section.

JonathanSalwan commented 1 year ago

So far for this stosb instruction, I see no problem. What I suggest is to make one PR per each instruction with a bunch of unit tests to be sure we do not break anything. So that it's easy for me to merge ones that are working perfectly and to discuss ones that could break things.

JonathanSalwan commented 1 year ago

In a general point of view, removing all evaluate() from the instruction semantics could be a step forward to solve #473.

SweetVishnya commented 1 year ago

So far for this stosb instruction, I see no problem. What I suggest is to make one PR per each instruction with a bunch of unit tests to be sure we do not break anything. So that it's easy for me to merge ones that are working perfectly and to discuss ones that could break things.

One PR for each instruction can be an overkill. Maybe, we can group them somehow?

JonathanSalwan commented 1 year ago

Yeah I think grouping them would be ok :)

SweetVishnya commented 1 year ago

Moreover, running one benchmark requires one night of a time ;)

Colton1skees commented 1 year ago

Maybe, we should split this PR by instruction groups.

Agreed, #1/#3/#4 should be split into separate. grouped PRs/

But you never know what break symbolic reasoning in real world. We may try to benchmark all string instructions in a separate PR. If everything is OK, we can merge it. More info about our benchmarking can be found in Evaluation section.

Understood. If I have time to PR this, I'll try to put together some relatively exhaustive tests.

Also, is there any existing DSE accuracy benchmark which I can run locally? Afaik Sydr is closed source

SweetVishnya commented 1 year ago

@Colton1skees, unfortunately our benchmark is closed source( because it requires Sydr. We just open source the benchmark binaries.

Colton1skees commented 1 year ago

In a general point of view, removing all evaluate() from the instruction semantics could be a step forward to solve https://github.com/JonathanSalwan/Triton/issues/473.

I have a .NET port of Triton(mostly the translator / semantics) here which has a standalone static version of Triton's IR translator(including support for symbolic memory with STORE/LOAD nodes). On the topic of #473, I wonder how hard it would be to try and update the rest of the Triton codebase to handle the addition of STORE/LOAD nodes.

SweetVishnya commented 1 year ago

@Colton1skees, I also did some experiments with symbolic pointers long time ago here.

JonathanSalwan commented 1 year ago

STORE/LOAD nodes are already implemented (#1185) and we have modes for this now:

Some examples here:

However, i'm pretty sure we can do better to improve how we deal with memory array :)

SweetVishnya commented 1 year ago

@Colton1skees, also writing C# bindings for Triton would be easier than copy-pasting the code.

Colton1skees commented 1 year ago

I also did some experiments with symbolic pointers long time ago here.

Neat

also writing C# bindings for Triton would be easier than copy-pasting the code.

If I linked a complete .NET port of Triton, then yes writing bindings would make sense. The repo I linked is basically just a standalone extraction of the IR translator, with some changes specific to my use case(e.g. never emitting reads/writes to concrete memory addresses, switching from capstone to ICED). Writings bindings wasn't necessary.