Open alexsetsove opened 1 year ago
You have to update the memory only if there is a desynchronization between your real CPU state (via IntelPin/Dynamorio) and the Triton's state
Take a look at the mem_read
and synch_regs
functions in http://shell-storm.org/repo/Notepad/qbdi_with_triton.txt
Thank you so much for your response @JonathanSalwan .
I did it and my path constraint is still empty. As soon as I enabled ONLY_ON_TAINTED, and the memory region containing my data was symbolized and tainted, so the path constraints were filled and the number of paths was less than when ONLY_ON_SYMBOLIZED disabled (I take that as a good sign). That's the right way to do it?
However, when I tried to solve branches, I got junk data, I expected the strings used in comparison.
However, when I tried to solve branches
How do you solve a branch?
If you want to solve a branch you have to take into account previous constraints too (what we call the path predicate). The constraint must be something like this getModel(previous_constraints && current_branch_constraint)
. You can take a look to this algorithm or simply do something like below using getPathPredicate()
that returns previous constraints of your current path:
ast = ctx.getAstContext()
pred = ctx.getPathPredicate() # returns the path predicate
crt = ast.land([pred, <your constraint that goes to your branch here>])
model = ctx.getModel(crt)
Here are the whole instructions for strstr in libc fom my tracer:
0x7fd22bcc420f movzx edx, byte ptr [rsi+0x01]
0x7fd22bcc4213 test dl, dl
0x7fd22bcc4215 jz 0x00007fd22bcc42d0
0x7fd22bcc421b movd xmm1, eax
0x7fd22bcc421f movd xmm2, edx
0x7fd22bcc4223 mov rax, rdi
0x7fd22bcc4226 and eax, 0x00000fff
0x7fd22bcc422b punpcklbw xmm1, xmm1
0x7fd22bcc422f cmp rax, 0x00000fbf
0x7fd22bcc4235 punpcklbw xmm2, xmm2
0x7fd22bcc4239 punpcklwd xmm1, xmm1
0x7fd22bcc423d punpcklwd xmm2, xmm2
0x7fd22bcc4241 pshufd xmm1, xmm1, 0x00
0x7fd22bcc4246 pshufd xmm2, xmm2, 0x00
0x7fd22bcc424b jnbe 0x00007fd22bcc4550
0x7fd22bcc4251 movdqu xmm3, oword ptr [rdi]
0x7fd22bcc4255 pxor xmm5, xmm5
0x7fd22bcc4259 movdqu xmm4, oword ptr [rdi+0x01]
0x7fd22bcc425e movdqa xmm6, xmm3
0x7fd22bcc4262 pcmpeqb xmm3, xmm1
0x7fd22bcc4266 pcmpeqb xmm4, xmm2
0x7fd22bcc426a movdqu xmm0, oword ptr [rdi+0x10]
0x7fd22bcc426f pcmpeqb xmm6, xmm5
0x7fd22bcc4273 pminub xmm3, xmm4
0x7fd22bcc4277 movdqa xmm4, xmm3
0x7fd22bcc427b movdqu xmm3, oword ptr [rdi+0x11]
0x7fd22bcc4280 pcmpeqb xmm5, xmm0
0x7fd22bcc4284 pcmpeqb xmm3, xmm2
0x7fd22bcc4288 por xmm4, xmm6
0x7fd22bcc428c pcmpeqb xmm0, xmm1
0x7fd22bcc4290 pminub xmm0, xmm3
0x7fd22bcc4294 por xmm0, xmm5
0x7fd22bcc4298 pmovmskb r8d, xmm4
0x7fd22bcc429d pmovmskb eax, xmm0
0x7fd22bcc42a1 shl rax, 0x10
0x7fd22bcc42a5 or r8, rax
0x7fd22bcc42a8 jz 0x00007fd22bcc4310
0x7fd22bcc42aa bsf rax, r8
0x7fd22bcc42ae add rax, rdi
0x7fd22bcc42b1 cmp byte ptr [rax], 0x00
0x7fd22bcc42b4 jz 0x00007fd22bcc42f4
0x7fd22bcc42f4 xor eax, eax
0x7fd22bcc42f6 ret
This is the first constraint from ctx.getPathConstraints(), which is MultipleBranches:
13:56:07 [DEBUG ] branch : isTaken : False, srcAddr : 0x7fd22bcc42a8, dstAddr : 0x7fd22bcc4310
But comparison will be done via XMM registers. What should I do?
Another thing, is it okay that I got an exception when building the semantic (ctx.buildSemantics(inst))? Triton can disassemble the opcode, as you can see.
13:56:59 [DEBUG ] TR => 0x7f6b38da11b3 C5F877 vzeroupper
13:56:59 [CRITICAL] problem FAULT_UD : C5F877 vzeroupper
But comparison will be done via XMM registers. What should I do?
I'm not sure to get this point. If you receive a model from the solver, you just have to inject the model into symbolic variables that you defined at the beginning (probably in memory cells).
Another thing, is it okay that I got an exception when building the semantic (ctx.buildSemantics(inst))? Triton can disassemble the opcode, as you can see.
If you receive a FAULT_UD
, it means that we do not support the semantic of the given opcode. Yep, we can disassemble it because we are using Capstone that supports it (disassembly is done by Capstone and semantics by Triton).
Here is the list of supported x86_64 instructions :).
I changed the source to only do this:
input="junk data"
if(strstr(input,"empty string")!=NULL){
cnt++;
}
Backward slice for the first 4 bytes of memory looks like this:
{1: (define-fun ref!1 () (_ BitVec 8) input_0[0]) ; Byte reference}
{3: (define-fun ref!3 () (_ BitVec 8) input_0[1]) ; Byte reference}
{5: (define-fun ref!5 () (_ BitVec 8) input_0[2]) ; Byte reference}
{7: (define-fun ref!7 () (_ BitVec 8) input_0[3]) ; Byte reference}
P.S: If you've gotten emails from Github for editing messages, I'm sorry. I got confused and don't know what I did wrong!
In order to increase the code coverage, I want to solve the unsolved branches. Sample target does only string comparison (strcmp) .
My tarcer was IntelPin/Dynamorio, I sent execution information to Triton like instructions and registers, and updated memory via Triton callbacks. Update will lose path constraints, like issue #1128 . In this situation, I don't know what to do. I'm going to send and want to solve some specific execution regions.
Any leads like pseudo code or workflow that can clarify the path are appreciated.