S2E / s2e-env

Your S2E project management tools. Visit https://s2e.systems/docs to get started.
Other
93 stars 51 forks source link

Problems about symbolic execute #147

Closed beraphin closed 6 years ago

beraphin commented 6 years ago

Hi, I wanna write a plugin to collect constraints of binary execution. The following code is what I used to test my plugin:

int main(){
        char c,b;
        scanf("%c%c",&c,&b);
        if(c=='A'){
                if(b=='B'){
                        puts("win!\n");
                        return 0;
                }
        }
        puts("fail!\n");
        return 0;
}

It's obvious that it will print "win" if we input "AB", otherwise it will print "fail" Then I handled the onTranslateBlockStart event to get the sequence basic blocks address of execution. And this how I start the program: echo "AB" | S2E_SYM_ARGS="" LD_PRELOAD=./s2e.so ./${TARGET} > /dev/null 2> /dev/null Then I got the sequence of basic blocks that lead to print "win" After that I modified my plugin to add constraints to input data:

state->constraints.addConstraint(E_EQ(expr[0], E_CONST("A", klee::Expr::Int8)));
state->constraints.addConstraint(E_EQ(expr[1], E_CONST("B", klee::Expr::Int8)));

And handled the onStateForkDecide to make sure it would not fork any new state. And handled the onTranslateBlockStart event to show the sequence of symbolic execution basic blocks address. After that I ran the program in symbolic mode: ./s2ecmd symbwrite 2 | S2E_SYM_ARGS="" LD_PRELOAD=./s2e.so ./${TARGET} > /dev/null 2> /dev/null But it's very strange that the trace of symbolic execution was not right:

4 [State 0] Tracer: basic block 0x400670 5 / 32
4 [State 0] Tracer: basic block 0x400480 6 / 32
4 [State 0] Tracer: basic block 0x400495 7 / 32
4 [State 0] Tracer: basic block 0x4006a1 8 / 32
4 [State 0] Tracer: basic block 0x4006a6 9 / 32
4 [State 0] Tracer: basic block 0x4005d0 10 / 32
4 [State 0] Tracer: basic block 0x4005db 11 / 32
4 [State 0] Tracer: basic block 0x400570 12 / 32
4 [State 0] Tracer: basic block 0x4005a8 13 / 32
4 [State 0] Tracer: basic block 0x4006bd 14 / 32
4 [State 0] Tracer: basic block 0x4006c6 15 / 32
4 [State 0] Tracer: basic block 0x4005f6 16 / 32
4 [State 0] Tracer: basic block 0x4004e0 17 / 32
4 [State 0] Tracer: basic block 0x4004e6 18 / 32
4 [State 0] Tracer: onStateForkDecide pc=0x7f5ab0cf4b35
4 [State 0] Tracer: onStateForkDecide pc=0x7f5ab0cda822
4 [State 0] Tracer: onStateForkDecide pc=0x7f5ab0cd5e9e
4 [State 0] Tracer: basic block 0x400627 19 / 32
4 [State 0] Tracer: There are differences between concrete and symbolic 20 / 32 0x400627 / 0x40062f
4 [State 0] Tracer: onStateForkDecide pc=0x40062d
4 [State 0] Tracer: There are differences between concrete and symbolic 21 / 32 0x400648 / 0x400637
4 [State 0] Tracer: There are differences between concrete and symbolic 22 / 32 0x400648 / 0x4004b0
4 [State 0] Tracer: There are differences between concrete and symbolic 23 / 32 0x4004b0 / 0x4004b6
4 [State 0] Tracer: There are differences between concrete and symbolic 24 / 32 0x4004b0 / 0x400641
4 [State 0] Tracer: There are differences between concrete and symbolic 25 / 32 0x4004b6 / 0x400657
4 [State 0] Tracer: There are differences between concrete and symbolic 26 / 32 0x4004b6 / 0x40066b
4 [State 0] Tracer: There are differences between concrete and symbolic 27 / 32 0x4004a0 / 0x4005b0
4 [State 0] Tracer: There are differences between concrete and symbolic 28 / 32 0x400652 / 0x4005b9
4 [State 0] Tracer: There are differences between concrete and symbolic 29 / 32 0x40066b / 0x400530
4 [State 0] Tracer: There are differences between concrete and symbolic 30 / 32 0x4005b0 / 0x400560
4 [State 0] Tracer: There are differences between concrete and symbolic 31 / 32 0x4005b9 / 0x4005c2
4 [State 0] Tracer: There are differences between concrete and symbolic 32 / 32 0x400530 / 0x4006e4
4 [State 0] Tracer: There are differences between concrete and symbolic 32 / 32 0x400560 / 0x4006e4
4 [State 0] Tracer: There are differences between concrete and symbolic 32 / 32 0x4005c2 / 0x4006e4
4 [State 0] Tracer: basic block 0x4006e4 32 / 32

According to this, some basic blocks were executed 2 times such as 0x400627 and 0x400648 And the trace of symbolic execution was not right, it lead to print "fail" rather than "win" The constraints of the state in the end of main function were:

(Eq false
     (Eq 0x0
         (And w64 (Add w64 0xffffffffffffffbf
                           (ZExt w64 (Extract w8 0 (Or w64 (Shl w64 (And w64 (ZExt w64 (Extract w8 0 (Or w64 (Shl w64 (And w64 (ZExt w64 (Extract w8 0 (ZExt w64 (Extract w32 0 (SExt w64 (Read w8 0x0 v0_buffer_0))))))
                                                                                                                               0xff)
                                                                                                                      0x0)
                                                                                                             0xffff88000f1cd000)))
                                                                             0xff)
                                                                    0x0)
                                                           0xffff88000f1cd000))))
                  0xff)))
(Eq false
     (Eq 0x0
         (And w64 (Add w64 0x1
                           (ZExt w64 (Extract w8 0 (Or w64 (Shl w64 (And w64 (ZExt w64 (Extract w8 0 (Or w64 (Shl w64 (And w64 (ZExt w64 (Extract w8 0 (ZExt w64 (Extract w32 0 (SExt w64 (Read w8 0x0 v0_buffer_0))))))
                                                                                                                               0xff)
                                                                                                                      0x0)
                                                                                                             0xffff88000f1cd000)))
                                                                             0xff)
                                                                    0x0)
                                                           0xffff88000f1cd000))))
                  0xffffffff)))
(Eq false
     (Eq 0x0
         (And w64 (Add w64 0x1
                           (ZExt w64 (Extract w8 0 (Or w64 (Shl w64 (And w64 (ZExt w64 (Extract w8 0 (Or w64 (Shl w64 (And w64 (ZExt w64 (Extract w8 0 (ZExt w64 (Extract w32 0 (SExt w64 (Read w8 0x0 v0_buffer_0))))))
                                                                                                                               0xff)
                                                                                                                      0x0)
                                                                                                             0xffff88000f1cd000)))
                                                                             0xff)
                                                                    0x0)
                                                           0xffff88000f1cd000))))
                  0xffffffff)))
(Eq false
     (Eq 0x0
         (And w64 (Add w64 0x1
                           (ZExt w64 (Extract w8 0 (Or w64 (Shl w64 (And w64 (ZExt w64 (Extract w8 0 (Or w64 (Shl w64 (And w64 (ZExt w64 (Extract w8 0 (ZExt w64 (Extract w32 0 (SExt w64 (Read w8 0x0 v0_buffer_0))))))
                                                                                                                               0xff)
                                                                                                                      0x0)
                                                                                                             0xffff88000f1cd000)))
                                                                             0xff)
                                                                    0x0)
                                                           0xffff88000f1cd000))))
                  0xffffffff)))
(Eq false
     (Eq 0x0
         (And w64 (Add w64 0x1
                           (ZExt w64 (Extract w8 0 (ZExt w64 (Extract w32 0 (SExt w64 (Read w8 0x0 v0_buffer_0)))))))
                  0xffffffff)))
(Eq 0x42 (Read w8 0x1 v0_buffer_0))
(Eq 0x41 (Read w8 0x0 v0_buffer_0))

I don't know why the constraints were so weird, but the constraints I added were exactly in there. So here are my problems: 1.Why that one basic block would be executed twice in symbolic execution? 2.Why that it would go to wrong path even I added constraints to the input data? A very long description. Thanks for help.

vitalych commented 6 years ago

What do you mean handling onTranslateBlockStart? This event is called at code translation and not at execution. This event may be called arbitrarily at any instructions depending on the circumstances. It may also be called multiple times for a given program counter. You need to connect an event to the execution signal passed as parameter to onTranslateBlockStart in order to be notified when the code is actually executed.

adrianherrera commented 6 years ago

I noticed that when you added constraints you used ”A”;. This should be ’A’; notice the single quotes.

I’m reading this on my phone so it’s hard for me to make the rest out :)

Sent from my iPhone

On 30 May 2018, at 11:02 pm, Vitaly Chipounov notifications@github.com wrote:

What do you mean handling onTranslateBlockStart? This event is called at code translation and not at execution. This event may be called arbitrarily at any instructions depending on the circumstances. It may also be called multiple times for a given program counter. You need to connect an event to the execution signal passed as parameter to onTranslateBlockStart in order to be notified when the code is actually executed.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

beraphin commented 6 years ago

@vitalych "You need to connect an event to the execution signal passed as parameter to onTranslateBlockStart in order to be notified when the code is actually executed." Do you mean the following code should be in initialize()? This is what I exactly did.

 s2e()->getCorePlugin()->onTranslateBlockStart.connect(
        sigc::mem_fun(*this, &Tracer::onTranslateBlockStart));
beraphin commented 6 years ago

@adrianherrera Ok in fact the codes in my plugin are:

for(int i=0;i<input_data.size();i++){
        stateinfo<<"adding constraint input["<<i<<"] == "<<input_data.c_str()[i]<<logend;
        state->constraints.addConstraint(E_EQ(expr[i], E_CONST(input_data.c_str()[i], klee::Expr::Int8)));
    }

So it was char type

vitalych commented 6 years ago

@beraphin doing what you did is not enough. You also need to do:

Tracer::onTranslateBlockStart(ExecutionSignal *signal...) {
  signal->connect(...).
}

As I've said, you only intercept translation, but you also need to intercept execution. You can have a look at the Example.cpp plugin to see how it's done.

beraphin commented 6 years ago

@vitalych You're right, it solved my first problem! Thanks for that! But my second problem is still there:

5 [State 0] Tracer: basic block 0x400670 5 / 34
5 [State 0] Tracer: basic block 0x400480 6 / 34
5 [State 0] Tracer: basic block 0x400495 7 / 34
5 [State 0] Tracer: basic block 0x4006a1 8 / 34
5 [State 0] Tracer: basic block 0x4006a6 9 / 34
5 [State 0] Tracer: basic block 0x4005d0 10 / 34
5 [State 0] Tracer: basic block 0x4005db 11 / 34
5 [State 0] Tracer: basic block 0x400570 12 / 34
5 [State 0] Tracer: basic block 0x4005a8 13 / 34
5 [State 0] Tracer: basic block 0x4006bd 14 / 34
5 [State 0] Tracer: basic block 0x4006c6 15 / 34
5 [State 0] Tracer: basic block 0x4005f6 16 / 34
5 [State 0] Tracer: basic block 0x4004e0 17 / 34
5 [State 0] Tracer: basic block 0x4004e6 18 / 34
5 [State 0] Tracer: basic block 0x4004a0 19 / 34
5 [State 0] Tracer: onStateForkDecide pc=0x7fa8dcc6eb35
5 [State 0] Tracer: onStateForkDecide pc=0x7fa8dcc54822
5 [State 0] Tracer: onStateForkDecide pc=0x7fa8dcc4fe9e
5 [State 0] Tracer: basic block 0x400627 20 / 34
5 [State 0] Tracer: onStateForkDecide pc=0x40062d
5 [State 0] Tracer: There are differences between concrete and symbolic 21 / 34 0x400648 / 0x40062f
5 [State 0] Tracer: There are differences between concrete and symbolic 22 / 34 0x4004b0 / 0x400637
5 [State 0] Tracer: There are differences between concrete and symbolic 23 / 34 0x4004b6 / 0x4004b0
5 [State 0] Tracer: There are differences between concrete and symbolic 24 / 34 0x4004a0 / 0x4004b6
5 [State 0] Tracer: There are differences between concrete and symbolic 25 / 34 0x400652 / 0x4004a0
5 [State 0] Tracer: There are differences between concrete and symbolic 26 / 34 0x40066b / 0x400641
5 [State 0] Tracer: There are differences between concrete and symbolic 27 / 34 0x4005b0 / 0x400657
5 [State 0] Tracer: There are differences between concrete and symbolic 28 / 34 0x4005b9 / 0x40066b
5 [State 0] Tracer: There are differences between concrete and symbolic 29 / 34 0x400530 / 0x4005b0
5 [State 0] Tracer: There are differences between concrete and symbolic 30 / 34 0x400560 / 0x4005b9
5 [State 0] Tracer: There are differences between concrete and symbolic 31 / 34 0x4005c2 / 0x400530
5 [State 0] Tracer: There are differences between concrete and symbolic 32 / 34 0x4006e4 / 0x400560

0x400648 is the block to print "fail" while the 0x40062F is the block to print "win" it went to the wrong path I think maybe the way I symbolic the input data was not right?

vitalych commented 6 years ago

S2E uses concolic execution, which means that each state has a cache of concrete values for every symbolic variable. When you add a constraint to a state, you much make sure that the constraint evaluates to true for the given concrete data in the given state. Otherwise you need to flush the cache.

So for your case, check if E_EQ(expr[i], E_CONST(input_data.c_str()[i], klee::Expr::Int8)) evaluates to true when you plug in the concrete data in the current state.

vitalych commented 6 years ago

It's better to use state->addConstraints(), which also has a check to validate the new constraints. https://github.com/S2E/libs2ecore/blob/master/src/S2EExecutionState.cpp#L844

beraphin commented 6 years ago

You're right that the constraints are false indeed.

for(iter=state->constraints.begin(),i=0;iter!=state->constraints.end();iter++,i++){
        (*iter)->dump();
        if(state->concolics->evaluate(*iter)->isTrue())
            logstart<<"TRUE!!!!!!!!!!!"<<logend;
        else
            logstart<<"FALSE!!!!!!!!!!"<<logend;
    }
(Eq 0x42 (Read w8 0x1 v0_buffer_0))
[TRACER] FALSE!!!!!!!!!!
(Eq 0x41 (Read w8 0x0 v0_buffer_0))
[TRACER] FALSE!!!!!!!!!!

But these constraints were added at the beginning of the execution, and there is no other constraint for input data before.

vitalych commented 6 years ago

There were no constraints, but concolic data was already present. You can disable concolic mode if you wish.

beraphin commented 6 years ago

Ok I may understand you. The real code to evaluate constraint is to use Solver. But in my plugin, I found out that the global variable DebugConstraints is false, so even I use state->addConstraints() it won't check constraint for me. How to turn DebugConstraints to true?

Ok just simply set it to true does work

beraphin commented 6 years ago

Thanks for your help, now things become more clear. After all the changes on source code of plugin, it gave that:

6 [State 0] Tracer: basic block 0x4004e6 18 / 34
6 [State 0] Tracer: basic block 0x4004a0 19 / 34
6 [State 0] Tracer: onStateForkDecide pc=0x7f4c5bef8b35 is true
6 [State 0] Tracer: onStateForkDecide pc=0x7f4c5bede822 is true
6 [State 0] Tracer: onStateForkDecide pc=0x7f4c5bed9e9e is true
6 [State 0] Tracer: basic block 0x400627 20 / 34
6 [State 0] Tracer: onStateForkDecide pc=0x40062d is true
State has invalid constraints
qemu: terminating on signal 15 from pid 7371
s2e-block: dirty sectors on close:648
Terminating node 0 (instance slot 0)

Even I deleted the code in onStateForkDecide to let s2e fork new state, s2e stopped running cause it found the constraints were not satisfiable while it wanted to go to wrong path. How to let s2e go to right path relying on its constraints?

beraphin commented 6 years ago

As you said, disable the concolic mode dose work.... Thanks again, s2e is a really complicated system. It would be nice to people like me if there are some specific API documents

beraphin commented 6 years ago

I noticed that once I provide the specific constraints, s2e will not collect any other constraints any more. For example:

int main(){
    char a,b;
    scanf("%c%c",&a,&b);
    if(a>'A'){
        if('b'<b&&b<'z'){
            puts("win!");
            return 0;
        }
    }
    puts("fail!");
    return 0;
}

I added the constraints "Zd" to input data to force s2e go to the "win" path, and all I want are the constraints like:

input[0]>'A'
input[1]>'b'
input[1]<'z'

but I found that in the end the constraints were:

input[0]=='Z'
input[1]=='d'

It seems like if the new constraint is wider than the old constraints, it won't be added, or will be simplified. So how can I get the constraints from binary? Is there any event like onAddConstraint?

vitalych commented 6 years ago

To set DebugConstraints to true, add --debug-constraints=true to the KLEE args section of the config file. https://github.com/S2E/libs2ecore/blob/master/src/S2EExecutor.cpp#L209

Re. complexity, yes we need more documentation, but it would be even better to fix the API so it's intuitive to use. So your feedback is valuable here.

KLEE may simplify constraints and discard those that are wider than existing ones. There is no onAddConstraintEvent. You can dump the state->constraints set when the state is terminaetd.

These look more like concrete value assignments rather than constraints:

input[0]=='Z'
input[1]=='d'
beraphin commented 6 years ago

Thanks for help, I think my problems have now been solved.